IT struggles with climate change

06.02.2006
Corporate IT managers who never seem to have enough CPU power, disk space, bandwidth or funding might take comfort from U.S. climate scientists. Computerworld's Gary H. Anthes recently talked with two of them and learned that even having access to the world's most powerful information systems is not enough.

Patrick Heimbach is a research scientist in physical oceanography at MIT, and James Hack heads up climate modeling at the National Center for Atmospheric Research in Boulder, Colo. Both scientists use their own organizations' computer systems, as well as those at supercomputer centers around the U.S.

What are you working on at MIT?

Heimbach: We are trying to see if we can simulate, if we can understand, what the ocean has been doing over the last couple of decades. Are we heading toward a warmer world? Is [warming due to] internal variability of the [oceanic and atmospheric] system, or is there something we are doing to the system?

Do you have the computational power to do that?

Heimbach: What we ultimately would like to run we can't currently fit on any computer. We would need on the order of 20,000 processors, and probably two orders of magnitude faster processors. Each supercomputer center allocates a certain amount of computing time to a specific group. So we have to size down the problem we are addressing for that specific machine.

So it seems you must beg, borrow and steal computer resources for this work.

Heimbach: We have to find the cycles where we can find them. But even for the machines that are available, if we really wanted to go to the actual [spatial] resolutions that we need, we probably would not be able to fit those problems on those machines. Give us any machine, and we can immediately fill it with an interesting problem, and we'll still have the feeling we are limited.

Hack: Climate and weather applications... push high-performance computer technology. A decade ago, global climate applications benefited from the extraordinary memory bandwidth of proprietary high-performance architectures, like the parallel vector architectures from Cray and NEC. As scientific computing migrated toward the commodity platforms, interconnect technology, both in terms of bandwidth and latency, became the limiting factor on application performance and continues to be a performance bottleneck.

Is the Internet adequate for connecting you to the supercomputer centers you use around the U.S.?

Heimbach: Transferring several terabytes of data from NASA Ames [Research Center] to MIT is just overwhelming to do in a reasonable time. As of a year ago, we were limited by the 100Mbit/sec. bandwidth of the network that connects our department to the outside world. The best sustained rates that could be achieved were on the order of 55Mbit/sec. That would bring us to a transfer time of 1.7 days per 1TB of data.
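Heimbach's arithmetic is easy to verify: at a sustained 55Mbit/sec., a terabyte (taken here as 10^12 bytes) takes about 1.7 days to move, and even the full nominal 100Mbit/sec. would still need close to a day. A rough back-of-the-envelope check in Python:

# Back-of-the-envelope check of Heimbach's transfer-time figure.
# Assumes 1TB = 1e12 bytes; the 55Mbit/sec. sustained rate and the
# 100Mbit/sec. departmental link are the figures quoted above.
TERABYTE_BITS = 1e12 * 8       # one decimal terabyte, in bits
SUSTAINED_RATE = 55e6          # 55Mbit/sec. sustained throughput
NOMINAL_RATE = 100e6           # 100Mbit/sec. departmental link

seconds = TERABYTE_BITS / SUSTAINED_RATE
print(f"Per TB at sustained rate: {seconds / 86400:.1f} days")                        # ~1.7 days
print(f"Per TB at full line rate: {TERABYTE_BITS / NOMINAL_RATE / 86400:.1f} days")   # ~0.9 days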

We now have better connection to the high-speed Internet2 Abilene network, with its 10Gbit/sec. cross-country backbone. The bottom line still is we need much higher bandwidth, less network congestion and smart transfer protocols, such as the large-files transfer protocol [bbFTP], that minimize CPU load.

Hack: The so-called sneakernet continues to provide the best bandwidth for moving large data sets between computing centers -- shipping data on tapes or disk via overnight services. We are engaged in some emerging computational projects that will generate hundreds of terabytes per experiment. Moving that data is a significant challenge. Storage and access to that data for analysis purposes is a comparably challenging technical task.
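Hack's point is easy to quantify. As an illustration only (the 100TB data set and 24-hour shipping window are assumptions, not figures from the interview), a courier shipment delivers an effective bandwidth comparable to a fully utilized 10Gbit/sec. backbone, and far beyond the sustained rates Heimbach describes:

# Hypothetical sneakernet-vs.-network comparison for a 100TB data set.
# The 100TB size and the 24-hour shipping window are illustrative
# assumptions; the interview says only "hundreds of terabytes per experiment."
DATA_BITS = 100e12 * 8           # 100TB in bits
SHIPPING_SECONDS = 24 * 3600     # overnight courier, door to door

sneakernet_bps = DATA_BITS / SHIPPING_SECONDS
print(f"Effective sneakernet bandwidth: {sneakernet_bps / 1e9:.1f} Gbit/sec")   # ~9.3

# The same 100TB at the 55Mbit/sec. sustained rate Heimbach reports:
print(f"Over the network: {DATA_BITS / 55e6 / 86400:.0f} days")                 # ~168 days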

How adequate is supercomputer capacity in the U.S. for scientific research?

Hack: One could argue that there will never be enough supercomputing capacity. In [a] sense, scientific progress is paced by the availability of high-performance computing cycles. And the problem becomes more acute as the need to address nonlinear scientific problems in other disciplines, like materials science, computational chemistry and computational biology, continues to grow.

There remains some controversy about global warming. Could better climate models and/or better computer technology help resolve that?

Hack: For many scientists, it's not a question of whether the planet will warm, but more a question of how much the planet will warm and what form the regional distribution of that warming will take. Answering... these questions will require additional levels of sophistication in global climate models, such as improved resolution and extending existing modeling frameworks to include fully interactive chemical and biogeochemical processes. These kinds of extensions are... extremely expensive in computational terms. We will require a minimum of a twenty-five-fold improvement in computational technology to enable the next-generation model [in] three to five years.
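For perspective, a 25-fold gain inside that window implies performance doubling roughly every 8 to 13 months, noticeably faster than the 18-to-24-month doubling often cited as Moore's law (a common rule of thumb, not a figure from the interview). A quick calculation:

import math

# How often must performance double to deliver a 25x gain in 3 to 5 years?
# (The 18-24-month Moore's-law doubling used for comparison is a common
# rule of thumb, not a figure from the interview.)
TARGET_FACTOR = 25
for years in (3, 5):
    doubling_months = years * 12 / math.log2(TARGET_FACTOR)
    print(f"{years} years -> doubling every {doubling_months:.1f} months")
# 3 years -> doubling every ~7.8 months
# 5 years -> doubling every ~12.9 months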

Heimbach: You need to run coupled ocean-atmosphere simulations over 10 to 100 years. We think that these models, and the underlying model errors, are still such that we need to do more basic research to understand the errors better. That's what we are trying to address.

Should the federal government be doing more to fund supercomputer research and supercomputer capacity?

Hack: The federal government should treat supercomputer technology in the same way that it treats other strategically important areas, like those related to national defense and national security. It's too important to the nation's scientific and economic competitiveness to be left to chance.

Sidebar

Big iron, big pipes, big problems

Here are just a few of the systems that climate scientists are using today to research global warming trends. MIT's Patrick Heimbach says his goal is to have access to 20,000 processors that are each 100 times faster than what he's using today.
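Measured against the 500-CPU ACES grid listed below, that goal amounts to roughly a 4,000-fold jump in aggregate capacity -- a rough, order-of-magnitude estimate that considers only processor count and the quoted per-processor speedup:

# Rough aggregate-capacity gap between MIT's ACES grid and Heimbach's goal.
# Ignores everything but processor count and the quoted per-CPU speedup,
# so treat the result as an order-of-magnitude figure.
current_cpus = 500        # ACES PC grid, Pentium 4 CPUs
target_cpus = 20_000      # processors Heimbach says he needs
per_cpu_speedup = 100     # "two orders of magnitude faster processors"

gap = (target_cpus / current_cpus) * per_cpu_speedup
print(f"Aggregate capacity gap: ~{gap:,.0f}x")   # ~4,000x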

MIT

Alliance for Computational Earth Science (ACES) PC Grid

-- 500 Pentium 4 CPUs

-- 1TB of memory and 10TB of near-line storage

-- User sites connected by 10Gbit/sec. Ethernet. PCs connected by 1Gbit/sec. Ethernet

National Center for Atmospheric Research

Blue Sky

-- 1,600 1.3-GHz Power4 CPUs

-- 2GB memory per processor

-- 8.3TFLOPS peak performance

-- 31TB disk capacity

NASA Ames Research Center, Moffett Field, Calif.

Silicon Graphics Altix 3000

-- 20 Altix or Vortex systems, each with 512 1.5-GHz Itanium 2 CPUs and 3.8MB to 7.6MB of memory

Abilene Network

-- An Internet2 backbone IP network connecting 220 universities and research labs

-- OC-192c (10Gbit/sec.) backbone employing optical transport technology and advanced high-performance routers