It's the end of an era in the U.S., as the National Science Foundation's TeraGrid program, which has provided cyberinfrastructure resources to the U.S. research community for more than 10 years, transforms into XSEDE (pronounced 'exceed') - Extreme Science and Engineering Discovery Environment.
The transition was definitely the hot topic of conversation at the TeraGrid11 conference in Salt Lake City, Utah, last week. According to John Towns, leader of the new XSEDE project, the vision is to enhance the productivity of scientists and engineers, and to continue to provide the high-quality user support for which TeraGrid was renowned.
But, in a shift in emphasis from TeraGrid, Towns's vision doesn't specifically mention high performance computing at all - although, of course, it remains a vital part of the infrastructure.
iPlant, CyberShake and the Human Connectome Project
The XSEDE team has been speaking to research communities to gather information about their requirements through 2015, and there is definitely a lot of good science coming our way in the near future.
iPlant, for example, is dedicated to solving the grand challenges in plant science, and will need high-speed access to data in databases scattered in different places, plus a high performance computing component to do the analysis.
In Earth sciences, researchers are looking for support for their CyberShake work in earthquake modelling, which requires a few big parallel jobs and many thousands of loosely coupled jobs. And projects such as the Human Connectome Project aim to understand the wiring of the human brain, a hugely complex problem. Brain researchers will have petabytes of data to archive and stream in near real time at 1 GB/s.
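To give a feel for those data volumes, here is a rough back-of-the-envelope sketch of how long it would take to move a petabyte at the quoted rate. It assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), that "GB/s" means gigabytes rather than gigabits, and that the link is fully used the whole time; the function name `transfer_days` is purely illustrative.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(petabytes, rate_gb_per_s=1.0):
    """Days needed to stream `petabytes` of data at a sustained rate.

    Assumes decimal units: 1 PB = 1e15 bytes, 1 GB = 1e9 bytes.
    """
    seconds = (petabytes * 1e15) / (rate_gb_per_s * 1e9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # One petabyte at 1 GB/s is a million seconds of continuous streaming.
    print(f"{transfer_days(1):.1f} days per petabyte at 1 GB/s")
```

Under those assumptions, each petabyte is roughly a week and a half of uninterrupted streaming, which is why sustained bandwidth matters as much as raw compute for a project like this.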
Of course, while the hot topic of keynotes and coffee breaks was XSEDE, there was a good deal of great science being presented at TeraGrid11 as well.
Daniel Fink of the Cornell Lab of Ornithology offered one such example with his talk about modelling avian distributional dynamics on TeraGrid.
Basically, Fink is bird watching with big computers. So why birds? First of all, he said, they are very good bioindicators, the so-called canaries in the coal mine, that can tell us about the health of the environment. Second, there is a lot of data - people love birds, and there is a vast archive of amateur bird-watching data dating back decades. One example of this is the citizen science website eBird.
When you're gathering observational data of this type it needs to be comprehensive - species, place, date, and time, as well as how many people gathered the data and how long it took them. This last item is particularly important, as it gives researchers information on how much bias may exist in the data.
Filling in a hole in the data
Data sets can contain up to a million hours of volunteer data, but the downside is that they can be sparse in terms of geographical coverage, Fink said. There are holes where there just aren't many observers.
Fink and his team are now working to help fill the gaps and predict occurrence by associating observations with local environmental data. They've already tried this successfully with several bird species, such as the indigo bunting and the solitary sandpiper.
For example, their models of indigo bunting distribution show holes in the distribution patterns corresponding to cities, which is what you would expect from a species that prefers hedgerows and rural environments.
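The general idea of tying sightings to local environmental data can be illustrated with a toy sketch. To be clear, this is not Fink's model: it swaps in a plain logistic regression, and the checklist values, the `urban_fraction` and `effort_hours` features, and all function names are made up for illustration. It fits detection against habitat and observer effort, then predicts occurrence for places nobody has visited, at a standard one hour of effort.

```python
import math

# Hypothetical checklists: (urban_fraction, effort_hours, detected).
# Invented numbers for illustration only, not real eBird data.
CHECKLISTS = [
    (0.05, 1.0, 1), (0.10, 2.0, 1), (0.15, 0.5, 1), (0.20, 1.5, 1),
    (0.10, 0.5, 0), (0.35, 2.0, 1), (0.50, 2.5, 1), (0.30, 0.5, 0),
    (0.60, 1.0, 0), (0.65, 3.0, 1), (0.70, 2.0, 0), (0.80, 0.5, 0),
    (0.90, 1.5, 0), (0.95, 1.0, 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, steps=5000, learning_rate=0.1):
    """Fit weights [intercept, urban, effort] by batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for urban, effort, detected in rows:
            x = (1.0, urban, effort)
            predicted = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            error = predicted - detected
            for j in range(3):
                grad[j] += error * x[j]
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


def predict_occurrence(weights, urban_fraction, effort_hours=1.0):
    """Estimated detection probability at a location with no checklists."""
    z = weights[0] + weights[1] * urban_fraction + weights[2] * effort_hours
    return sigmoid(z)


if __name__ == "__main__":
    w = fit_logistic(CHECKLISTS)
    for urban in (0.1, 0.5, 0.9):
        print(f"urban fraction {urban:.1f}: p = {predict_occurrence(w, urban):.2f}")
```

Including effort as a feature is a nod to the point about bias above: a short visit that finds nothing is weaker evidence of absence than a long one, so predictions are compared at a fixed level of effort.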
Fink showed off his BirdVis tool, which has recently been accepted for publication and lets you scroll through the influence of habitats during breeding periods and during migration, over dynamic time frames. Your humble correspondent was very impressed. Hopefully it's a sign of the great science that will continue to come out of XSEDE.