Open Source Spotlight - Yabi: Bringing drag-and-drop to supercomputers

27.04.2012
Supercomputers are powerful tools for scientists. They are also very expensive, so wasted time can mean a lot of wasted resources. But making the most efficient use of them is not the easiest proposition in the world; it's not just a case of clicking a button to analyse a protein. However, fitting out the world of supercomputers with a user-friendly, web-based interface is the focus of an open source project based at Western Australia's Murdoch University.

Last year Murdoch publicly launched Yabi, a tool equipped with a web interface to make using supercomputers simpler.

The computational physics community, as an example, may be very proficient in the intricacies of shell scripts and working with a command line, says Professor Matthew Bellgard, Director of Murdoch's Centre for Comparative Genomics. "They've had a lot of experience in the past running their Fortran code using 4000 cores or 10,000 cores," he says. However, "there are other domains where scientists don't necessarily have that skill running command line code or porting their code from one supercomputer to another."

Learning at least a smattering of Perl or some other scripting language is often the norm for life scientists, but it can require a significant investment of time to get up to speed and make a supercomputer do what a scientist wants.

"When we started down this particular path of building Yabi, we wanted to simplify access to supercomputing infrastructure for end users. And the end users typically are non-IT-proficient; consider life science researchers or geoscientists," Professor Bellgard says. "While some of them have the ability to write scripts and use programming languages, a lot would prefer to be able to just drag and drop and have access to tools that you could access via the command line, but in a web-based environment.

"So I guess our first remit was, 'Can we simplify access to high performance computing (HPC) infrastructure?'"

Yabi has already been used to make life easier for scientists studying metagenomics (genomics is the study of the DNA of living organisms; metagenomics looks at the profile of organisms in a particular sample, for example, a soil or sedimentary sample). "Metagenomics is a relatively new area and the tools are just being developed. The tools for data analysis of DNA sequences are readily available, but metagenomics is relatively new."