The open-source answer to big data

29.05.2012

These data scientists, moreover, work better with open-source platforms. Imran Ahmad is a data scientist who has developed his own grid-computing algorithm, a Hadoop competitor called Bileg, which is based on the open-source Globus toolkit (GT4). The president of Cloudanum Inc., a Toronto-based company that develops data analysis technologies for cloud environments, he says the fundamental advantage of an open-source platform is that people like him can see its underlying mathematical basis.

"If it's in open-source, you can dig down and see why I'm getting these results, why these results are the optimal ones," Ahmad says.

Proprietary data analytics software will work reasonably well most of the time, he adds. But it's when an "unusual scenario" comes up that you won't be able to trust your results. "They'll be way off from what you're looking for," he says. "And that is a really scary situation."

Not surprisingly, the most brilliant minds with backgrounds in statistical modeling are also in the highest demand, especially since organizations in other sectors, like financial institutions, are scooping them up.