The torrent of data is swelling fast enough that Brown plans to be running 200 servers by the end of the year--and without the right data-integration software, he thinks that number could double. Brown has been wading through an ocean of information ever since he joined ComScore as its first software engineer in 1999, shortly after the startup landed its original venture financing. Today the Internet market research firm reports $232 million in revenue a year. "Our growth is pretty darn linear and should continue," Brown says.
ComScore started on a homegrown grid-processing stack and in 2000 added Syncsort's data-integration software, the current version of which is DMExpress. "We were up and running in weeks," Brown says. "It literally made our software run 5-10 times faster. You're not just adding storage, but you're adding compute as well."
In 2009, ComScore began migrating to Hadoop, becoming an early adopter of the technology, which has recently begun gaining traction in the enterprise market.
"We decided it was better to leverage the community than invest in building our own," Brown says. "In general, Hadoop is harder to bring into an enterprise when you have mixed operating systems. DMExpress, with their connector, is helping to solve this issue."
That experience is typical, James Kobielus noted in a recent report he wrote as an analyst at Forrester Research. Hadoop, he wrote, "lacks some critical enterprise data warehouse features, such as real-time integration and robust high availability. The Hadoop market includes many vendors that have focused on these and other deficiencies in the core Hadoop stack. Vendors have, of necessity, either built proprietary extensions to address these requirements or have leveraged various NoSQL tools and open source code to provide the requisite functionality."