HP's Autonomy connects to Hadoop

04.06.2012
Hewlett-Packard's Autonomy subsidiary will release an add-on component to link the company's flagship IDOL search software to the Apache Hadoop data processing platform, the company announced Monday at its HP Discover user conference, held this week in Las Vegas.

While Hadoop provides a good platform for holding vast amounts of information, it offers little in the way of prebuilt analysis tools, said Matt Malden, Autonomy vice president. Organizations must write their own Java programs in the MapReduce framework to analyze their data.

With Autonomy's Hadoop package, users can instead embed an IDOL 10 engine in each node of their Hadoop cluster. They then can use IDOL's 500 functions to analyze and summarize data on the Hadoop implementation.

Autonomy's IDOL (Intelligent Data Operating Layer) provides enterprise users with the ability to conduct complex queries across large amounts of unstructured data, such as Web pages, email and digitized office documents. Over 400 organizations use this software, according to the company.

All the functionality in IDOL itself can be applied to a Hadoop dataset, Malden said. The software offers such functionality as concept searching, where a search on one word will also return items containing synonyms of that word. It can do sentiment analysis, offering a summary of how negative or positive the information in a set of documents may be. Such sentiment analysis can be used to understand user satisfaction levels, perhaps over a select period of time. IDOL can also offer conceptual clustering, whereby it groups documents under broad themes, potentially simplifying a search process.

The pairing of Hadoop and IDOL was a natural fit, Malden said. "You don't need to move data into IDOL to use its functions. Whatever technology choice you make for data storage, we are able to process it," he said, adding that IDOL has 400 connectors to various other platforms, and can understand over 1,000 different data formats.