What Hadoop can, and can't do

14.06.2012
The lure of using big data for your business is a strong one, and there is no brighter lure these days than Apache Hadoop, the scalable data storage platform that lies at the heart of many big data solutions.

But as attractive as Hadoop is, there is still a steep learning curve involved in understanding what role Hadoop can play for an organization, and how best to deploy it.


By understanding what Hadoop can and can't do, you can get a clearer picture of how it can best be implemented in your own data center or cloud. From there, best practices can be laid out for a Hadoop deployment.

What Hadoop can't do

We're not going to spend a lot of time on what Hadoop is, since that's well covered in documentation and media sources. Suffice it to say that Hadoop has two major components: the Hadoop Distributed File System (HDFS) for storage, and the MapReduce framework, which lets you perform batch analysis on whatever data you have stored within Hadoop. That data, notably, does not have to be structured, which makes Hadoop well suited to analyzing and working with data from sources like social media, documents, and graphs: anything that can't easily fit within rows and columns.
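To make the MapReduce idea concrete, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python, counting words in a few unstructured text snippets. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample posts are assumptions made purely for illustration, and a real job would run these steps in parallel over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one text record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of text records."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical unstructured input, e.g. short social media posts.
posts = [
    "Hadoop stores data",
    "Hadoop analyzes data in batch",
]
counts = word_count(posts)
print(counts)
```

Note that the input is just free text with no schema, rows, or columns; the structure (word and count) only appears in the output of the batch job.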