What Hadoop can, and can't do

14.06.2012

· A dedicated switching infrastructure to avoid Hadoop saturating the network

· 4 to 12 drives per machine, non-RAID

Another Hadoop distributor has similar specs, though it is vaguer on the network figures, because the workloads an organization runs on its Hadoop instance can vary so widely.

"As a rule of thumb, watch the ratio of network-to-computer cost and aim for network cost being somewhere around 20% of your total cost. Network costs should include your complete network, core switches, rack switches, any network cards needed, etc.," the distributor advises.

For its part, Cloudera estimates anywhere from $3,000 to $7,000 per node, depending on what you settle on for each node.
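To make the arithmetic concrete, here is a rough Python sketch that combines the per-node range above with the 20% network rule of thumb. It assumes the per-node figure covers the servers only and that network gear is budgeted on top of it; the source does not say this, so treat it as an assumption. The 20-node cluster size and the function names are hypothetical, chosen purely for illustration.

```python
def network_budget(compute_cost: float, network_share: float = 0.20) -> float:
    """Network spend that ends up being `network_share` of the TOTAL cost.

    If network = share * (compute + network), then
    network = compute * share / (1 - share).
    """
    if not 0 <= network_share < 1:
        raise ValueError("network_share must be in [0, 1)")
    return compute_cost * network_share / (1 - network_share)


def cluster_estimate(nodes: int, cost_per_node: float,
                     network_share: float = 0.20) -> dict:
    """Rough budget split for a cluster (assumption: node cost excludes network)."""
    compute = nodes * cost_per_node
    network = network_budget(compute, network_share)
    return {"compute": compute, "network": network, "total": compute + network}


if __name__ == "__main__":
    # Hypothetical 20-node cluster at the low and high ends of the quoted range.
    for per_node in (3_000, 7_000):
        est = cluster_estimate(nodes=20, cost_per_node=per_node)
        print(f"${per_node:,}/node: compute ${est['compute']:,.0f}, "
              f"network ${est['network']:,.0f}, total ${est['total']:,.0f}")
```

Note that 20% of the total works out to a quarter of the server spend, not a fifth, because the network share is measured against the combined cost rather than the compute cost alone.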