Monday 29 September marks the 60th anniversary of the founding of the European Organization for Nuclear Research, otherwise known as CERN, located near Geneva, Switzerland. A new book, entitled 'From Physics to Daily Life', highlights some of the many ways in which technology developed at CERN has impacted upon society at large. The following article is an excerpt from a chapter in this book authored by Bob Jones, the head of CERN openlab...
Bob Jones, Grid and Cloud, in B. Bressan (ed.), From Physics to Daily Life: Applications in Informatics, Energy, and Environment, p. 81, Wiley 2014. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with permission.
The public research community, the European Commission, and governments have invested heavily in data‐supporting infrastructures in the past decade. Distributed computing infrastructures such as grids and clouds are critical because science is fundamental to addressing the problems confronting our planet and our lives. Data is fundamental to science. The science we do now requires ever-increasing data sets. Flexible, powerful computing systems are required to support this. Computing alone does not have the power to save our planet from global warming or energy shortages, but it does have an underlying role to play in addressing them. For example, it has never been more important to have powerful and accurate climate information.
Elaborate computer models are the primary tool used by climate scientists and bodies like the IPCC (Intergovernmental Panel on Climate Change) to report on the status and probable future of our Earth. Models like those used by the IPCC need data from the atmosphere, land surface, ocean and sea ice, all originating from different communities, along with diverse accompanying metadata (data which describes data). The amount of data that climate scientists need to manage is enormous - on the petascale, yet a broad and global community needs to be able to access and analyze it. This is ideal for a grid solution - where information stored around the world can be woven together without moving databases. The ESGF (Earth System Grid Federation) is a project sponsored by the US DoE (Department of Energy), the NSF (National Science Foundation) and the NOAA (National Oceanic and Atmospheric Administration) that will support the data for the next generation of climate models used by the IPCC in their next assessment.
Consolidating IT infrastructures - for business and research
Recent developments in IT have made it possible to consolidate IT infrastructure in a way that delivers increased flexibility and responsiveness to business needs while reducing costs. This change involves a move from IT being provided individually by organizations procuring their own separate IT infrastructure, to a new model in which IT is provided as a utility, which is known as 'cloud computing'. The NIST (National Institute of Standards and Technology) defines cloud computing as "a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."
Since the late 1980s, all science domains have had the following processes in common: data collection, processing, and analysis. This applies to the entire range of laboratories and research environments and to the science communities affiliated with them. Furthermore, extreme-scale laboratories, instruments such as the LHC (Large Hadron Collider) or JET (Joint European Torus), satellites, and molecular biology each have their own large-scale IT systems to collect, process, and analyze their data. Until now, every player has been creating an enormous amount of duplicated IT infrastructure and data assets, which introduces barriers to collaboration and prevents the synergetic exploitation of data and models.
Over the last few years, consumer-facing firms delivering products in large volumes have actively adopted cloud computing. Commercial service providers are expanding their available cloud offerings to include the entire stack of IT hardware and software infrastructure, middleware platforms, application system components, software services, and turn-key applications. The private sector has taken advantage of these technologies to improve resource utilization, increase service responsiveness, and accrue meaningful benefits in efficiency, agility, and innovation. Similarly, for research organizations, cloud computing holds significant potential to deliver public value by increasing operational efficiency and responding faster to constituent needs.
By leveraging shared infrastructure and economies of scale, cloud computing presents a compelling business model. Organizations will be able to measure and pay for only the IT resources they consume, increase or decrease their usage to match requirements and budget constraints, and leverage the shared underlying capacity of IT resources via networks. Resources needed to support mission-critical capabilities can be provisioned more rapidly and with minimal overhead and routine provider interaction.
Helix Nebula: a cloud for science in Europe
The Helix Nebula initiative combines the computing needs of CERN, ESA (European Space Agency), and EMBL (European Molecular Biology Laboratory) with the capabilities of several IT companies. Working together, these two groups are addressing the technical, legal, and procedural issues that today make it difficult to seamlessly move jobs from one cloud to another at scale.
Helix Nebula aims to pave the way for the development and exploitation of a cloud computing infrastructure, initially based on the needs of European IT-intense scientific research organizations, while also allowing the inclusion of other stakeholders' needs (enterprises, governments, and society). The scale and complexity of services needed to satisfy the foreseen needs of Europe's IT-intense scientific research organizations are beyond what can be provided by any single company and hence will require the collaboration of a variety of service providers.
The initiative is a learning process to understand what a formal procurement of cloud computing would involve. For publicly funded institutions, monetary transparency, accountability, and value are increasingly important, and Helix Nebula offers an environment in which potentially complex questions around SLAs (service-level agreements), intercloud transfers, and more can be explored ahead of procuring a more lasting solution.
The questions the initiative is addressing are likely to be of value in themselves, but the relationships being built between the different research communities and the IT industry may well prove to be its most valuable and long-lasting asset.
Worldwide LHC Computing Grid: a global collaboration
CERN and its scientific program have been a driving force behind the largest collaborative production grid infrastructure in the world for e‐science. Providing seamless access to a vast computing resource 24 hours a day, seven days a week, it has demonstrated that such a production infrastructure can be used by a wide range of research disciplines, producing scientific results that would not have been possible to achieve without the grid. Through such a structure, scientists have been able to do more science on a larger scale and get results in a shorter time frame. It has spawned collaborations within Europe and allowed Europe to collaborate as a whole with other regions worldwide. Grid computing has acted as a good showcase of what is possible with distributed computing infrastructures, and has laid an important foundation for future forms of utility computing, including clouds, which are having an immense impact on many business sectors and the IT industry.
On 26 September, CERN will host a special colloquium, during which authors featured in 'From Physics to Daily Life' will present several examples of successful cross-disciplinary technology transfer originating in fundamental physics research. The event will start at 08:45 CEST (06:45 UTC) and will be available to view online.