While astronomical data from major international observatories, such as the European Southern Observatory and NASA's Great Observatories, is precisely managed and freely available, there's still no shortage of interesting and new phenomena to discover. Trying to understand how the universe works is no small feat. Only by capturing and analyzing extreme amounts of data are insights and discoveries possible.
Before the rise of charge-coupled devices (CCDs) and digital photography in the 1980s and 1990s, astronomers gathered data by looking through telescopes and taking notes - or by reviewing photographic plates of what the telescope captured. Digital photography has helped astronomers acquire and store massive amounts of data. As a result, astronomy is one of the most data-intensive sciences.
The WIYN Observatory - a collaboration among the Universities of Wisconsin-Madison, Indiana, Yale, and the National Optical Astronomy Observatory, all located in the US - hosts a 3.5-meter telescope atop Kitt Peak in southern Arizona, US. The One Degree Imager (ODI) is a key piece of the WIYN Consortium's new instruments initiative. In addition, the One Degree Imager Portal, Pipeline, and Archive (ODI-PPA) is an online science gateway that provides astronomers with a single point of access to ODI data and rich computational and visualization capabilities.
"Many astronomers are still accustomed to going to an observatory directly to take images, and - depending on the size and number of images captured - bringing them back on a USB drive, a CD or DVD, or even a hard drive, to reduce them by hand over several days or even weeks," says Arvind Gopu, project manager for the ODI-PPA.
As images continue to grow in quality and size, the primitive workflow of 'handling' data by hand will eventually fade away. Scientists are moving into a time when most raw data is reduced and made available online - as are apps for data visualization, analysis, and imaging. Currently, astronomers use a well-known visual analysis tool called DS9, which takes its name from the Star Trek series Deep Space Nine. It's a mature product, but it must be downloaded and installed on a Windows, Mac, or Linux machine.
"It was essential that we replicate the functionality of DS9 in ODI-PPA," explains Gopu. "The idea is to eliminate the need to download every single image." Gopu worked with WIYN stakeholders to develop paradigm-shifting ODI-PPA features that go well beyond what's expected.
"With ODI-PPA you can create a 'cart' or a collection to collate data products, and run data processing jobs or download images," Gopu explains. Users can view and analyze a typical image using the Image Explorer feature, as well as the relevant source exposures that create it. "Right now users are able to annotate a collection, and we're adding the ability to annotate individual images."
Current ODI images can be as large as 700 megabytes. Images twice this size are expected in 2015 when ODI is expanded. Gopu says that they've now implemented preprocessing to generate lossless Portable Network Graphics (PNG) tiles. "As you zoom out, more and more tiles are brought in at each level, similar to the Google Maps approach. We generate about 6 to 10 levels of tiles; you can actually go to the cell level with this zoom."
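The tiling scheme Gopu describes can be sketched in a few lines of Python. This is only an illustration of the general map-style tile pyramid, not the ODI-PPA implementation: the function name `tile_pyramid`, the 256-pixel tile size, and the image dimensions in the example are all assumptions made for the sketch. Each level halves the resolution of the one above it, so the coarsest level fits in a single tile and the finest level covers the image at full resolution.

```python
import math

def tile_pyramid(width, height, tile_size=256):
    """Return (level, columns, rows) for each zoom level, coarsest first.

    Hypothetical helper: tile_size and the halving-per-level rule are
    assumptions borrowed from common web-map tiling, not from ODI-PPA.
    """
    # Count how many times the image must be halved to fit in one tile.
    max_level = 0
    while tile_size * 2 ** max_level < max(width, height):
        max_level += 1

    levels = []
    for level in range(max_level + 1):
        scale = 2 ** (max_level - level)      # downsampling factor at this level
        scaled_w = math.ceil(width / scale)
        scaled_h = math.ceil(height / scale)
        levels.append((level,
                       math.ceil(scaled_w / tile_size),
                       math.ceil(scaled_h / tile_size)))
    return levels

# Example with an assumed 16384 x 16384 pixel image.
for level, cols, rows in tile_pyramid(16384, 16384):
    print(level, cols, rows)
```

With these assumed dimensions the pyramid has seven levels, from a single tile at the coarsest level to a 64 x 64 grid at full resolution. A browser only ever needs the handful of tiles that intersect the current view, which is why the full image never has to be loaded.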
Using a decent computer, an ODI-PPA user could have more than 10 instances of a full-resolution image open in their browser - something that would be impossible with the original image. Even if you downloaded everything, you would eventually run out of memory.
All of the DS9-type functionality is great, but it is scientifically useful only up to a point. Ultimately, it serves as a sieve to eliminate errant images or bad data, and to boil everything down to useful information. The power of this portal is in its framework for interactive astronomy analysis, which enables interaction with and analyses against the original image, not the PNG version.
Researchers can run statistical analysis and source extraction against original image data to determine whether an object in the image is a star, a galaxy, or some other source of light. Computation takes place on Big Red II - the 1 petaFLOPS supercomputer at Indiana University, US. Tools already integrated into ODI-PPA include two calibration pipelines - QuickReduce and one other developed at the National Optical Astronomy Observatory - as well as SExtractor for catalog generation, and SWarp for stacking.
The results are delivered within seconds or minutes, because the framework essentially bypasses the computation resource allocation process that is common on many high-performance clusters. The same framework can also be used to run jobs that last for hours - there's little difference.
"Our team, including technical design architect Soichi Hayashi and developer Michael Young, has created a collection of frameworks within a larger framework," says Gopu. "Hopefully it can be applied to other projects in other scientific domains. The conceptual elements are all the same - a mass of data, config files needed to run code, and results that are passed back."