Feature - Technology roundup: Science gateways and portals may level the playing field
Our guest writer is Elizabeth Leake of TeraGrid - the high-performance, distributed computing network in the US.
At the recent EGEE User Forum in Sweden, attendees had a chance to learn about two general-purpose portal engines and two domain-oriented portals. What these technologies have in common is the ability to act as "science gateways," ultimately allowing new communities of researchers better access to advanced computing, thus leveling the playing field - a key part of "eScience." Here is a round-up of some of the features of these gateways, and who is developing them:
Milan Prica from Sincrotrone Trieste, an independent laboratory in Italy, presented an advanced web portal with virtual collaboration features. The latest version of Virtual Control Room (VCR), based on Gridsphere 3 and Google Web Toolkit (GWT), includes a suite of open source tools arranged in a dashboard format, which provides simplified access to gLite grid resources for a variety of research areas. It is the main user interface adopted by the DORII project, which focuses on the applications of a diverse user base from three research communities: Experimental Science, Environmental Science, and Earthquake/Seismic Science.
VCR uses MyProxy for credential management. With VCR 3.0, registered portal users can access grid resources using their personal certificates, or the portal's robot certificate, which is especially useful for occasional users. VCR's i2glogin Java feature removes the dependency on the gLite UI. The streaming of scientific visualization video is handled by GVid. Using both client- and server-side functionality, users can remotely run and render applications and encode graphical output into a video stream (server-side), and then view the video on their desktop using a locally installed Java applet (client-side).
One of the most popular features of a science gateway is that it lets users access grid functions, such as workflow programming, via portals without having to install grid tools on their workstations. In many cases the underlying grid infrastructure is hidden. Peter Kacsuk of the Hungarian Academy of Sciences presented the P-GRADE grid portal family, which offers this functionality.
The P-GRADE family of products offers an open-source graphical user interface that is general-purpose and workflow-oriented. P-GRADE supports workflows composed of sequential jobs, parallel jobs, and application services on grid systems built with Globus, EGEE (LCG or gLite), and ARC middleware technologies. Workflows and workflow-based parameter studies defined in the P-GRADE Grid Portal are portable between grid platforms without users having to learn new systems or re-engineer program code. Because of P-GRADE's robustness and user-friendly interface, it has been adopted by multiple national, regional, and application-specific grids.
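To make the idea of a workflow mixing sequential and parallel jobs concrete, here is a minimal Python sketch. It assumes a workflow can be modeled as a directed acyclic graph of jobs; the job names and the execution_stages helper are hypothetical and are not taken from P-GRADE.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "prepare_input": set(),
    "simulate_a": {"prepare_input"},
    "simulate_b": {"prepare_input"},
    "merge_results": {"simulate_a", "simulate_b"},
}


def execution_stages(dependencies):
    """Group jobs into stages; jobs within one stage do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


stages = execution_stages(workflow)
for number, jobs in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(jobs)}")
```

Jobs that land in the same stage could in principle run in parallel on different grid resources, while the stages themselves run one after another.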
Tapani Kinnunen of CSC/IT Center for Science presented SOMA2, an open-source, domain-oriented research gateway. SOMA2's simple user interface hides all technicalities from the end-user. The system automates repetitive tasks, which eliminates the need for redundant work on the part of the user. The framework also accommodates multiple platforms, making it accessible to more users. The web browser-operated molecular modeling workflow environment was developed and deployed by Finland's CSC.
SOMA2 was developed specifically to enable molecular modeling applications, including molecular data exchange. It offers a secure and personalized framework for inputting molecular data, submitting and controlling jobs, and analyzing results. A unique modular design allows SOMA2 to accept third-party scientific applications, in the form of pluggable capsules, to execute programs and process data. A capsule consists of an XML description, based on an original schema, which is used to generate an application web form, scripts, and file templates. SOMA2 enables communication and data exchange between applications by employing Chemical Markup Language (CML). The source code is distributed under the General Public License (GPL).
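The capsule idea, an XML description from which a web form is generated, can be sketched in a few lines of Python. The element and attribute names below are invented for illustration and do not follow the actual SOMA2 schema; capsule_to_form is likewise a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical capsule description; names are illustrative only and
# are NOT the real SOMA2 schema.
CAPSULE_XML = """
<capsule name="example-minimizer">
  <parameter name="max_steps" type="number" label="Maximum steps" default="500"/>
  <parameter name="force_field" type="text" label="Force field" default="generic"/>
</capsule>
"""


def capsule_to_form(xml_text):
    """Render one HTML input per <parameter> element in the description."""
    root = ET.fromstring(xml_text)
    form_id = escape(root.get("name", ""))
    lines = [f'<form id="{form_id}">']
    for param in root.findall("parameter"):
        name = escape(param.get("name", ""))
        label = escape(param.get("label") or param.get("name", ""))
        kind = escape(param.get("type", "text"))
        default = escape(param.get("default", ""))
        lines.append(
            f'  <label>{label} <input type="{kind}" name="{name}" value="{default}"></label>'
        )
    lines.append("</form>")
    return "\n".join(lines)


form_html = capsule_to_form(CAPSULE_XML)
print(form_html)
```

Because the form is derived from the description, adding a third-party application in this sketch would mean writing a new XML file rather than new interface code.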
Advances in biomedical applications have resulted in the generation of vast data repositories. Mining and managing biomedical data is a complex procedure that requires several processing phases. This is especially true in the handling of images, where proper preprocessing, image enhancement, color processing, feature extraction, and classification are critical. The e-LICO (e-Laboratory for Interdisciplinary Collaborative Research in Data Mining and Data-Intensive Sciences) project, funded by the Information Society Technology program of the European Commission, facilitates distributed data mining and management for the biomedical field.
Charalampos Doukas, University of the Aegean, presented e-LICO's web interface with access to a set of tools developed for the distributed data mining needs of the biomedical field. e-LICO's service-oriented architecture supports the concept of loosely coupled, open-standard, language- and platform-independent systems. Such an architecture allows service providers to modify backend functions while maintaining the same interface to clients. Because the user interface is accessed through HTTP/HTTPS protocols and uses eXtensible Markup Language (XML) for data exchange, it functions independently of platform-specific programming languages, tools, and network infrastructure.
Although SAGA wasn't presented to the EGEE user community as a science gateway, "it certainly has that potential," according to Shantenu Jha, Louisiana State University. Jha, a co-principal investigator at a US TeraGrid site, presented this simple, stable API which accommodates multiple middleware platforms, including gLite, ARC, UNICORE, and Globus. SAGA, together with a suite of resources, including Ganga and DIANE (Distributed Analysis Environment), makes it possible to achieve concurrent access to EGEE, TeraGrid, and other infrastructures.
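One common way for a single API to accommodate several middleware platforms is an adaptor pattern: application code calls one interface, and a per-middleware adaptor is chosen behind it. The Python sketch below illustrates that general pattern only. Every class, function, and URL in it is hypothetical, and it does not reproduce the real SAGA API or contact any grid.

```python
from abc import ABC, abstractmethod


class JobAdaptor(ABC):
    """Hypothetical middleware adaptor; not part of the real SAGA API."""

    @abstractmethod
    def submit(self, executable: str) -> str:
        """Submit a job and return an identifier for it."""


class FakeGliteAdaptor(JobAdaptor):
    def submit(self, executable: str) -> str:
        # A real adaptor would talk to gLite here; this one only builds an id.
        return f"glite-job:{executable}"


class FakeGlobusAdaptor(JobAdaptor):
    def submit(self, executable: str) -> str:
        # Stand-in for a Globus submission.
        return f"globus-job:{executable}"


# Registry mapping a URL scheme to the adaptor that handles it.
ADAPTORS = {"glite": FakeGliteAdaptor, "globus": FakeGlobusAdaptor}


def submit_job(resource_url: str, executable: str) -> str:
    """Pick an adaptor from the URL scheme and submit through it."""
    scheme = resource_url.split("://", 1)[0]
    if scheme not in ADAPTORS:
        raise ValueError(f"no adaptor registered for '{scheme}'")
    return ADAPTORS[scheme]().submit(executable)


# The calling code is identical for both (made-up) resources.
job_ids = [
    submit_job(url, "/bin/hostname")
    for url in ("glite://site-a.example", "globus://site-b.example")
]
print(job_ids)
```

The point of the design is that supporting another middleware means registering one more adaptor, while the code that submits jobs stays unchanged.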
There are hundreds of unique users of SAGA per month, according to Jha. Most are from the high energy physics (HEP) realm, but about 30% come from non-HEP applications, which include bio, fusion, and various engineering fields of research. SAGA was demonstrated in Barcelona at the EGEE '09 user conference. RENKEI (the Japanese word for "Federation"), an eScience project of NAREGI, the Japanese Grid Initiative, has developed an entire standards-based stack around SAGA. They will demonstrate their use of the tools at the Open Grid Forum meeting in Chicago on June 22, 2010.
The overall impression of these new tools?
"It was great to hear about new tools that are available to the global research community via science gateways," said Nancy Wilkins-Diehr of the San Diego Supercomputer Center, part of the TeraGrid network. "Tools designed to level the playing field, regardless of the back-end architecture - platform and middleware - are especially useful to the global research community. Everything that reduces the need for advanced technical knowledge is helpful, so researchers can focus on their work, not on computational science."
-Elizabeth Leake, TeraGrid External Relations Coordinator