Data from the heavens

05.12.2005
Imagine that your data is as vast as the heavens, with the information as complex as galaxies themselves.

Most organizations would find such imagery dizzying, but for the European Southern Observatory, it's reality. And ESO has it under control.

The organization developed a system to manage data from outer space. Its end-to-end data-flow system is a collection of tools and processes that helps the agency to serve researchers who use its telescopes and archives.

The success of its system has earned ESO recognition from the Computerworld Honors program as well as from other observatories, whose officials see ESO as a leader in this field.

"It's something that we all acknowledge as being important," says Todd Boroson, deputy director of the National Optical Astronomy Observatory in Tucson, Ariz., and associate director for the NOAO Data Products Program.

ESO officials saw a chance to improve the way researchers collect, use and share data when the agency started to build the Very Large Telescope cluster in Chile more than 10 years ago. In 1995, they created the data management and operations division to move the 11-country ESO toward those goals.

"What we're trying to do is create a system that allows research astronomers to have greater efficiency in conducting their observations and turning that into a science result," says David Silva, head of the data-flow operations department.

The IT team borrowed from advances in data collection made possible by the Hubble Space Telescope as well as relational databases developed in the banking industry.

"One of our strengths is we look at the commercial market and try to find solutions that are applicable for us," Silva says.

How It Works

ESO developed a system with several major components. Web-based interfaces provide information and tools to end users -- the researchers. These interfaces allow researchers to submit detailed observation requests that ESO workers in Chile can then execute. The system stores 30TB of data.

The main user tool for submitting programs is a Java-based client that runs on the astronomers' desktops and exchanges data with a server located at ESO headquarters. ESO operations staffers use another Java-based tool to manage and execute user programs.

Meanwhile, an enterprise-class relational database management system is used to operate and synchronize the databases in Germany and Chile.

ESO's tech team used a combination of off-the-shelf products and internally developed pieces for areas where customization made the most sense. The system was based primarily on Sun Solaris and Hewlett-Packard Co. technologies, but now the IT team is moving toward Linux running on Dell Inc. hardware. ESO also uses data management products from Sybase Inc., which has database expertise and has worked with other astronomical agencies.

ESO invested about $60 million in the IT infrastructure between 1995 and 2001, according to Peter Quinn, head of ESO's data management and operations division. Most of that investment -- nearly $50 million -- went to labor costs, with the rest going to equipment. ESO now invests about $12 million annually in development, maintenance and operations.

"They've done a thorough job providing an IT infrastructure to ensure they're using the telescopes in an efficient way," notes Daniel Steeghs, an astronomer at the Harvard-Smithsonian Center for Astrophysics in Cambridge, Mass., and a native of the Netherlands who has worked on ESO's systems.

Despite its successes, ESO has challenges ahead in managing its ever-increasing volume of data. Officials say they're developing cluster computing technology to meet predicted needs; ESO must have the capacity to store and process close to 1TB of science data per day by 2010.
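
To put that target in perspective, the short calculation below converts 1TB per day into a sustained data rate and a yearly volume, and compares it with the 30TB the system stores today. It is a rough sketch only: it assumes a decimal terabyte (10^12 bytes) and treats "close to 1TB" as exactly 1TB, neither of which the article specifies.

```python
# Back-of-the-envelope sizing for the 2010 target quoted in the article.
# Assumptions (not from the article): decimal units, exactly 1 TB per day.
TB = 10**12                      # bytes in one decimal terabyte
MB = 10**6                       # bytes in one decimal megabyte

daily_bytes = 1 * TB             # "close to 1TB of science data per day"
current_archive_tb = 30          # "The system stores 30TB of data"
seconds_per_day = 24 * 60 * 60

# Average rate needed if the data arrived evenly around the clock.
sustained_mb_per_s = daily_bytes / seconds_per_day / MB

# Volume accumulated over a 365-day year at that rate.
yearly_tb = 365 * daily_bytes / TB

# Days of 2010-level intake needed to equal today's whole archive.
days_to_match_archive = current_archive_tb * TB / daily_bytes

print(f"{sustained_mb_per_s:.1f} MB/s sustained")
print(f"{yearly_tb:.0f} TB per year")
print(f"{days_to_match_archive:.0f} days to match the current archive")
```

Under those assumptions, a month of intake at the 2010 rate would equal everything the system holds today.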

But such challenges haven't stopped others -- including those pushing for an international virtual observatory -- from following ESO's lead in end-to-end data-flow management.

As Steeghs says, "People have recognized that it's the way to go."

Sidebar: Conquering cultural resistance

Despite its governmental origins and academic bent, ESO faced a corporate-like problem when implementing its data-flow system six years ago: getting user buy-in.

For centuries, astronomers made their observations using their own telescopes, and even now they often travel alone to remote locations to conduct their studies. Yet the weather can ruin planned observations, leaving scientists who waited months or years for a brief turn at a shared telescope empty-handed. In addition, astronomers are unlikely to share the data they do capture on those powerful telescopes.

ESO wanted to address those problems, so officials asked researchers to use the data-flow system when they conduct observations. Researchers submit their observation plans in advance, and computers sort them and put them in a queue based on factors such as the weather conditions at the site of the Chilean telescopes. The resulting data is sent back to researchers via the Internet or on hard disk, and it is released for general use after a one-year proprietary period.
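
The queueing idea can be illustrated with a minimal Python sketch. It is not ESO's software: the `ObservationRequest` fields, the "seeing" threshold, the priority ranking and the function names are all hypothetical, chosen only to show how advance requests might be filtered by current weather and ordered, and how a one-year proprietary period might be checked.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ObservationRequest:
    # All fields are illustrative assumptions, not ESO's actual data model.
    program_id: str
    priority: int          # lower number = scheduled first
    max_seeing: float      # worst acceptable atmospheric seeing, in arcseconds
    needs_clear_sky: bool  # whether the program requires a cloud-free sky


def build_queue(requests, seeing, sky_is_clear):
    """Return the requests that can run under current conditions, best-ranked first."""
    feasible = [
        r for r in requests
        if seeing <= r.max_seeing and (sky_is_clear or not r.needs_clear_sky)
    ]
    return sorted(feasible, key=lambda r: r.priority)


def is_public(observed_on, today, proprietary_days=365):
    """True once the proprietary period (one year in the article) has elapsed."""
    return today >= observed_on + timedelta(days=proprietary_days)


requests = [
    ObservationRequest("A", priority=2, max_seeing=1.2, needs_clear_sky=False),
    ObservationRequest("B", priority=1, max_seeing=0.6, needs_clear_sky=True),
    ObservationRequest("C", priority=3, max_seeing=0.8, needs_clear_sky=True),
]

# Tonight: seeing of 0.7 arcseconds under a clear sky (made-up conditions).
queue_tonight = build_queue(requests, seeing=0.7, sky_is_clear=True)
print([r.program_id for r in queue_tonight])  # prints ['A', 'C']
```

In this example, program B ranks highest but needs better seeing than tonight offers, so it waits in the pool for a better night instead of using its turn on poor data. That is the problem the article says weather caused for astronomers who traveled to the telescope.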

"We had to win people over, but it didn't take long," says Peter Quinn, head of ESO's data management and operations division.

ESO established a user support department early on to ensure that clients get the service they need. Meanwhile, ESO's IT staff collected requirements from users and converted those ideas into tools, says Michele Peron, head of the data-flow system department.

Still, there were some bumps, says David Silva, who heads the data-flow operations department. "In the beginning, people hated us," he says, adding that researchers complained in letters to the director general.

Instead of becoming defensive, the IT team worked out problems.

"As with all new things, there is a learning curve, and the first few times, it was difficult," says Frank Grundahl, a researcher at the University of Aarhus in Denmark who uses the system. "However, the very clear structure that has been set up, including the extensive documentation, now makes things very efficient."