Amazon automates Hadoop use for developers

02.04.2009
Amazon.com has launched a hosted service designed to make it easier for developers to use Hadoop, the open source implementation of the MapReduce programming model for processing large data sets across clusters of machines.
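MapReduce boils down to two user-supplied functions: a mapper that turns input records into key-value pairs and a reducer that aggregates the values for each key, with the framework handling the distribution of work across the cluster. The following is a minimal, single-machine word-count sketch of that idea in Python; it is an illustration only and not code from Amazon or the Hadoop project.

```python
# Word count in the MapReduce style that Hadoop implements:
# the mapper emits (word, 1) pairs, the reducer sums counts per word.
# Simplified to run on one machine; a real Hadoop job would run the
# mapper and reducer over data split across many cluster nodes.
import sys
from itertools import groupby

def mapper(lines):
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum the counts for each word; pairs are sorted by key first."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Read text from stdin and print "word<TAB>count" lines.
    for word, total in reducer(mapper(sys.stdin)):
        print(f"{word}\t{total}")
```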

Called Amazon Elastic MapReduce, the cloud computing service is aimed at developers whose applications need to crunch large amounts of data, a task for which Hadoop is well suited.

With Amazon Elastic MapReduce, many Hadoop-related tasks that developers would otherwise have to handle manually are automated, the company's Amazon Web Services (AWS) cloud computing division said in an official blog post on Thursday.

"Using Elastic MapReduce, you can create, run, monitor, and control Hadoop jobs with point-and-click ease. You don't have to go out and buy scads of hardware. You don't have to rack it, network it, or administer it. You don't have to worry about running out of resources or sharing them with other members of your organization. You don't have to monitor it, tune it, or spend time upgrading the system or application software on it," the blog posting reads.

Setting up a Hadoop cluster is still fairly complex, even with products that aim to streamline the process, said Lydia Leong, a Gartner analyst. "By doing this, Amazon is greatly simplifying access to Hadoop compute clusters," she said.

AWS decided to create the service after learning that it had customers running Hadoop jobs on its Amazon Elastic Compute Cloud (EC2) service, which provides hosted computing capacity. Because Hadoop is becoming increasingly popular, Amazon aims to make it easier for other developers to take advantage of this open source implementation of MapReduce.