Petascale storage may trickle down to you

06.11.2006
Discussions about supercomputer performance almost always center on processing speed -- how many gazillion operations per second can be performed by the giant machines. Makers and users of supercomputers also like to brag about things like the number of processors, the amount of memory and the bandwidth available for moving data about.

Such metrics are important determinants of how much work the machines can do. Less often focused on, but becoming critically important, are questions of storage: How much disk capacity do the computers have? How fast can data be written to and read from storage? How easily and quickly can an application be restarted when a disk fails? How can file systems be scaled up to efficiently handle petabytes of information? How the heck can you find something when your system has 30,000 disks?

Those questions and more will become the focus of the Petascale Data Storage Institute (PDSI), which was recently founded by computer scientists at three universities and five of the U.S. Department of Energy national laboratories with a five-year, US$11 million DOE grant. 'The overall goal is to make storage more efficient, reliable, secure and easier to manage in systems with tens or hundreds of petabytes of data spread across tens of thousands of disk drives, possibly used by tens of thousands of clients,' says Ethan Miller, a computer science professor at the University of California, Santa Cruz.

That system may not much resemble the one used by your accounting department, but the computer scientists at the institute say -- and the vendor sponsors are hoping -- that new technologies from petascale storage research will trickle down to commercial users.

'The use of high-performance computer clusters in many commercial applications, [such as] oil and gas, semiconductors and biotechnology, is growing substantially,' says Garth Gibson, a principal investigator for the PDSI and a professor at Carnegie Mellon University. He adds that companies are increasingly using supercomputers to boost revenues. 'High-performance computing is not so much about cost reduction as it is about improving the quality of products,' Gibson says.

Disk dilemmas