With many projects involved, truly seamless interoperability can be a challenge.
"Standard" is often equated with "average" or "boring." How can you innovate or invent when you're bound by standards and regulations? How can you push the boundaries when you're stuck inside a box?
Yet how can you create something on a grand scale, something that can slot into place with other grand things, unless you create something interoperable? Something . . . standard.
In this special feature, former iSGTW editor (and now GridTalk editor) Cristy Burne reports on this easily overlooked aspect of grid computing.
Why should we care?
Standardizing grids: the current landscape
Challenges for the future
The way forward
A standard in action: GridFTP
A "de facto" standard: VOMS
BONUS FEATURE: What does the grid community have to say about standards?
(See what people from institutions as diverse as eBay and OGF think!)
In 1850s Australia, budding railroad tycoons began laying train tracks across the continent. Each team of financiers, surveyors and civil engineers adopted its own preferred system, independent of the others.
Australia developed horribly incompatible train lines: some had 4 feet 8 1/2 inches between the parallel steel rails, some had 2 feet, some had 3 feet 6 inches . . .
By 1917, you needed to change trains six (!) times to get from Brisbane on the east coast to Perth on the west. Many decades later, after enormous effort and expense, enough of these track systems were rebuilt to follow a single, universally accepted standard gauge that you could now cross Australia on a single train without having to change cars.
But will the same interoperability and ease of transit ever be true of data on a computing grid?
In early June, two important meetings of the grid community were held in Barcelona: the 23rd Open Grid Forum and the 5th e-Infrastructure Concertation Meeting. Both were dedicated to standardizing the grid. Debate was rife, but a strong message emerged: Europe, and the ICT world, cannot afford to repeat the incompatibilities of the early Australian train situation.
Staring down railroad tracks in Guthrie, Oklahoma, just before dusk.
Image courtesy of Patrick Moore, sxc.hu
While "the Grid" in its idealized form is a single interconnected, interoperating computer farm, the reality of grid computing is very different. Instead of a single all-powerful Grid, there are many smaller grids, each customized to the specific needs of a user group. These different needs have led to different technical solutions: just as a toaster from the United States won't automatically work in a kitchen in Great Britain, grid solutions developed for one grid don't always work for another.
The challenge is clear: if grids are to be widely adopted, and if they are to offer real solutions for industry and e-science, then they must be interoperable, which means the development of standardized, transferable technologies. Such technologies usually develop in one of two ways: de facto standards, like using Google for web searches, seem to develop themselves. Meanwhile, formal standards, like the meter or the kilogram, require consensus within a user community.
In the grid world, the Open Grid Forum is the largest group working towards standards adoption. The OGF provides a global opportunity for volunteers from all walks of grid computing life to contribute to developing new standards. The process sounds simple: first, a group works to develop best practices in a particular area; then the group approaches OGF for endorsement of that work as a particular standard. Or, in reverse, an area of interest is first identified, and then a group is formed to work on a standards solution in that area. These processes may sound simple, but in practice, the path to achieving an accepted, implemented standard is long and dotted with potholes.
An example of the need for communications and standards to go hand-in-hand is the 1999 Mars Climate Orbiter. Now "lost in space," the orbiter completed a 286-day journey to Mars by firing its engine in the wrong direction. The mishap was caused by confusion between members of the geographically distributed team, leading to the use of both metric and non-metric units in crucial calculations.
Image courtesy of Rybson, sxc.hu
Challenges for the future
In addition to technical challenges, standardization can introduce issues such as different user requirements, incompatible policies and poor market timing. A classic example is that of videocassette standards, where Sony's earlier and arguably superior technology, Betamax, was outdone by VHS, a cheaper option that better served the rental movie market. (Both standards are now obsolete due to the rise of digital technologies. This highlights another difficulty facing grid computing: a rapidly changing marketplace makes it hard to pin down a strict standard.)
Despite these challenges, the benefits of standardization are very tangible. Standardization translates to interoperability, which encourages collaboration, competition and sustainability. The implementation of a popular, functioning standard leads to smooth technology transfer, reliability and ease of use. In the medical industry, for example, adoption of the DICOM digital imaging standard enabled physicians anywhere in the world to interchangeably send, receive and store medical images. The use of Latin (and the organizational system of Carl Linnaeus) when naming biological species means all biologists speak the same taxonomic language. And the advent of HTTP as a communications protocol has helped fuel the massive growth of the World Wide Web. Equally, the success of network standards such as Ethernet has not repressed healthy commercial competition in the network equipment market.
Physicists used the Globus Toolkit and MPICH-G2 to harness the power of multiple supercomputers to simulate the gravitational effects of black hole collisions.
No one can force a community to build middleware to a particular specification, or to adopt a particular security policy. Standardization relies on grid users and builders choosing to implement a solution that works most of the time, for most of the people.
At both OGF23 and the 5th e-Infrastructure Concertation Meeting, when the floor was opened for discussion, there were many questions yet to be answered: Who will pay to test for standards compliance? If we're not testing these standards, why bother to create them? And how can we ensure that we create standards that enforce best practice, when we're still learning what those best practices are?
Discussion is ongoing, but several things are certain: Grid computing requires standards at the industry level, with a validation framework that reinforces continued efforts towards software quality. Standards developers need to think long-term, beyond the lifespan of a typical project, to allow the time and energy required for a standard to mature. And standards developers need open corridors for communication, with different projects, different standards bodies and different user groups.
From 2007 to 2014 (FP7), the European Commission will invest 50.5 billion Euros in hundreds of R&D projects. These projects need to be compatible not only with each other, but with other solutions adopted around the globe. Gaby Lenhart of ETSI (see "BONUS FEATURE-Readers talk back" below) says some 32.4 billion Euros of FP7 money will be needed to address standardization issues.
Wooden hinged ruler, found in an old workshop.
Grid computing provides the power needed to run data-intensive scientific applications such as drug discovery or high-energy physics. As part of this, massive amounts of data must be shunted around the world at high speed. Although there are many different ways of storing and partitioning such data, the grid community has agreed on just one way of transferring it: GridFTP.
Also known as the "grid file transfer protocol," GridFTP is the accepted method for securely and reliably transferring large volumes of data across distributed computing grids. It is based on the standard Internet FTP protocol but tailored to support the special needs of grid computing, including authentication and confidentiality features, reliability and fault tolerance, and third-party and partial file transfer.
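To make "partial file transfer" concrete, here is a minimal Python sketch of the underlying idea: a large file is divided into byte ranges that can be moved independently, in any order, and written into place at the destination. This is an illustration only. It is not the GridFTP wire protocol or the API of any real GridFTP client, and the function names (plan_ranges, copy_range) are hypothetical.

```python
from typing import List, Tuple


def plan_ranges(file_size: int, num_parts: int) -> List[Tuple[int, int]]:
    """Split [0, file_size) into contiguous (offset, length) ranges.

    Hypothetical helper: the ranges cover the file exactly once, and
    any leftover bytes are spread over the first few ranges.
    """
    if file_size < 0 or num_parts < 1:
        raise ValueError("file_size must be >= 0 and num_parts >= 1")
    base, extra = divmod(file_size, num_parts)
    ranges = []
    offset = 0
    for i in range(num_parts):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more parts than bytes: skip empty ranges
        ranges.append((offset, length))
        offset += length
    return ranges


def copy_range(source: bytes, dest: bytearray, offset: int, length: int) -> None:
    """Copy one byte range from source into the same position in dest.

    Stands in for a single partial transfer; a real transfer would go
    over the network rather than between in-memory buffers.
    """
    dest[offset:offset + length] = source[offset:offset + length]


if __name__ == "__main__":
    data = bytes(range(256)) * 4          # stand-in for a large file
    received = bytearray(len(data))       # pre-sized destination buffer
    # Ranges may arrive in any order; here they are copied in reverse.
    for off, length in reversed(plan_ranges(len(data), 5)):
        copy_range(data, received, off, length)
    print("transfer intact:", bytes(received) == data)
```

Because each range is addressed by its offset, a range that fails can be resent on its own without restarting the whole file, which is one reason partial transfer and fault tolerance tend to go together.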
Funded by the European Union, the DataGrid project aimed to build the next generation of computing infrastructure, providing intensive computation and analysis of shared large-scale databases, from hundreds of terabytes to petabytes, across widely distributed scientific communities.
Image courtesy of The DataGrid Project
"Virtual organizations" are the human backbone of grid computing: groups of researchers from around the world who collaborate on common challenges, using grids to share and integrate their data and resources. The Virtual Organization Membership Service, or VOMS, is a system that allows distributed VOs to centrally manage the roles and authorizations of their members. Using VOMS, site administrators can generate local credentials for specific VO members, providing them with a single login and access to VO grid resources for a set time.
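The idea of centrally managed roles combined with time-limited access can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how VOMS is implemented and not its API: the class and function names (VirtualOrganization, Credential, site_allows) and the example VO and role names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, Set


@dataclass(frozen=True)
class Credential:
    """Hypothetical short-lived credential naming a member, a VO and roles."""
    member: str
    vo: str
    roles: frozenset
    expires_at: datetime


@dataclass
class VirtualOrganization:
    """Hypothetical central registry of a VO's members and their roles."""
    name: str
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def add_member(self, member: str, *roles: str) -> None:
        self.members.setdefault(member, set()).update(roles)

    def issue_credential(self, member: str, now: datetime,
                         lifetime: timedelta) -> Credential:
        # Only registered members can obtain a credential.
        if member not in self.members:
            raise PermissionError(f"{member} is not a member of {self.name}")
        return Credential(member, self.name,
                          frozenset(self.members[member]), now + lifetime)


def site_allows(cred: Credential, vo: str, required_role: str,
                now: datetime) -> bool:
    """A site grants access only for the right VO, role and time window."""
    return (cred.vo == vo
            and required_role in cred.roles
            and now < cred.expires_at)


if __name__ == "__main__":
    vo = VirtualOrganization("examplevo")      # hypothetical VO name
    vo.add_member("alice", "analyst")          # hypothetical member and role
    start = datetime(2008, 6, 1, tzinfo=timezone.utc)
    cred = vo.issue_credential("alice", start, timedelta(hours=12))
    print(site_allows(cred, "examplevo", "analyst", start + timedelta(hours=1)))
    print(site_allows(cred, "examplevo", "analyst", start + timedelta(hours=13)))
```

The design point is that the site never keeps its own list of users: it trusts what the VO asserts about a member, and the expiry time limits how long that assertion remains usable.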
-Cristy Burne, GridTalk
*Want to hear what others think? See our bonus feature: Readers Talk Back: What does the grid community have to say about standards?