Hadoop spinoff CEO: Use Apache's version

The CEO of a new Yahoo spinoff dedicated to developing and promoting the popular Apache Hadoop distributed computing platform urged adoption of Apache's Hadoop distribution on Wednesday. But Cloudera, also a player in the Hadoop space, is sticking by its own Hadoop platform.

A day after Yahoo announced Hortonworks, a joint venture with Benchmark Capital to further develop Hadoop, Hortonworks CEO Eric Baldeschwieler stressed commitment to Apache Hadoop. "We're just asking everyone to commit to basing their offerings on the Apache Hadoop offering, and anybody who does that is our partner," Baldeschwieler said in a presentation at the Hadoop Summit 2011 event in Silicon Valley.

Hadoop is becoming popular for managing large volumes of data. More sites download Apache's Hadoop than any other release, said Baldeschwieler. But in a statement obtained afterward by InfoWorld, Cloudera COO Kirk Dunn reaffirmed support for his company's own Hadoop technologies. "Apache Hadoop absolutely is the foundation. Cloudera's Distribution Including Apache Hadoop is a 100-percent open-source platform that includes Apache Hadoop (not a fork or derivative, but actual Apache Hadoop). One of the things that Cloudera has pioneered is including not just Apache Hadoop, but also the full Hadoop stack -- Apache Pig, Apache Hive, Apache Hbase, Apache Flume, Apache Sqoop, Apache Zookeeper, Apache Whirr, and others -- which, when integrated and bundled, makes Hadoop more consumable and easier to manage. When deploying even moderately sized Hadoop clusters, these are non-trivial issues with which every enterprise has to deal."

The relationship between Cloudera and Hortonworks is "yet to be determined," Baldeschwieler said. "We've been working together on making Apache Hadoop great the last few years. We've had our differences." There is room for lots of players in this space, he said.

Baldeschwieler also stressed the potential for Hadoop and the level of interest in it in the enterprise and government worldwide. "We really believe that half the world's data will be stored in Apache Hadoop over the next five years." Hortonworks will focus on making Apache Hadoop "great," he said.