Back to Basics: Supercomputing
Just what is supercomputing, or "big iron"? How does it fit in alongside grid, cloud and volunteer computing in the world of science? e-ScienceTalk's Manisha Lalloo explains in the latest GridBriefing (see bottom of page), released last week. An excerpt is below.
Unlike grid computing, in which thousands of networked computers are linked together over widely dispersed geographic locations, a supercomputer consists of thousands of integrated computers in a single room. The use of supercomputers is commonly referred to as 'high-performance computing' or HPC. This technology is at the frontline of processing capability, particularly in speed of calculation.
A supercomputer's power is measured by how many floating point operations it can complete per second, or "flops." The world's fastest machines are listed in the twice-yearly Top500 list; in November 2010, the Chinese Tianhe-1A system at the National Supercomputer Center in Tianjin topped the list, reaching 2.57 petaflops - the equivalent of everyone on Earth performing 367,000 calculations per second.
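The arithmetic behind that comparison is easy to check. A back-of-the-envelope sketch, assuming a world population of roughly 7 billion (the figure in use around 2010):

```python
# Sanity check of the "367,000 calculations per second per person" figure.
peak_flops = 2.57e15       # Tianhe-1A's Top500 result: 2.57 petaflops
world_population = 7e9     # assumption: roughly 7 billion people

per_person = peak_flops / world_population
print(f"{per_person:,.0f} calculations per second per person")
# With a 7-billion population this works out to roughly 367,000.
```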
Thanks to their high processing speed and minimal delay in transmitting signals from one part of a computer to another - also known as "latency" - supercomputers are typically used for highly complex problems in which many intermediary results are dependent upon one another. Therefore, they are often used for tasks such as those found in astrophysics, climate modeling and biomedicine. HPC has also been used to simulate the structure of rat brains, test engineering structures and model global warming.
In Europe, PRACE (the Partnership for Advanced Computing in Europe) provides users with access to supercomputers and centers across the continent. These include the JUGENE supercomputer at Forschungszentrum Jülich (FZJ), Germany, and the CURIE system installed at CEA, Bruyères-le-Châtel, France. A third system, SuperMUC at the Leibniz Supercomputing Centre in Munich, Germany, is under development.
In the US, the National Science Foundation's TeraGrid integrates high-performance computers, data resources and tools with high-end experimental facilities around the country. Currently, TeraGrid resources include more than a petaflop of computing capability and more than 30 petabytes of online and archival data storage, with rapid access and retrieval over high-performance networks.
HPC has the potential to aid researchers and society in areas from transportation to climate change. "To do this, researchers and engineers need access to world-class supercomputer resources and services to remain internationally competitive," said Thomas Eickermann, project manager of PRACE. "Therefore, Europe needs to continuously invest in this area to provide a leading-edge HPC infrastructure. Strategically, Europe also needs to strive for independent access to supercomputing, as it already has for aviation and space."
Initiatives such as PRACE and TeraGrid help to expand scientific knowledge and, in turn, deliver social and economic benefits. However, cost, access, efficiency and training remain significant challenges.
As supercomputers become more powerful, their energy consumption increases. Energy costs make up a large part of the expense of running a supercomputer, and cooling its processors is a particularly knotty problem. As the trend towards more powerful computers continues, the Green500 list - which ranks computers according to their energy efficiency - is expected to take on new significance.
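The Green500 metric itself is simple: sustained performance divided by power drawn, usually quoted in megaflops per watt. A sketch of such a ranking, using made-up systems and wattages rather than real list entries:

```python
# Illustrative sketch of the Green500 metric: performance per watt.
# The systems and numbers below are hypothetical, not real list entries.
systems = {
    "system_a": {"flops": 2.57e15, "watts": 4.04e6},  # hypothetical
    "system_b": {"flops": 1.00e15, "watts": 1.20e6},  # hypothetical
}

def mflops_per_watt(flops, watts):
    """Green500-style efficiency: megaflops delivered per watt consumed."""
    return (flops / 1e6) / watts

# Rank most efficient first - a more powerful machine can still rank lower.
ranked = sorted(systems, key=lambda s: mflops_per_watt(**systems[s]), reverse=True)
for name in ranked:
    print(f"{name}: {mflops_per_watt(**systems[name]):,.0f} MFLOPS/W")
```

Note that the faster hypothetical machine here is the *less* efficient one, which is exactly why the Green500 ordering differs from the Top500's.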
"I expect HPC power efficiency to be an extremely important field in the next few years," said Daniel Ahlin of the PDC Center for High Performance Computing at KTH, Sweden. "Heat-reuse has just begun to be explored more widely."
These are not the only challenges. As the complexity of scientific models increases, researchers' needs move from the petascale to the exascale - a thousand times more powerful than the current top-of-the-line petascale computers.
We are also approaching fundamental physical limits of the technology. Current machines cannot simply be scaled up; entirely new technologies are required. Engineers already acknowledge that exascale computers will have to rely on massive parallelism, along with software designed to exploit it. Because memory bandwidth and capacity do not keep pace with Moore's Law, developers will have to become better at managing the constraint of less memory per core, and investigate novel algorithms.
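The parallelism constraint can be illustrated with a toy example: splitting a sum across worker processes so that each touches only its own slice of the data, echoing the "less memory per core" point. This is a sketch using Python's standard multiprocessing module, not HPC-grade code - real systems would use MPI or similar:

```python
from multiprocessing import Pool

# Toy data parallelism: each worker computes over only its own index range,
# so no single process needs the whole problem in memory.
def partial_sum(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n, workers = 1_000_000, 4
    chunk = n // workers
    bounds = [(i * chunk, (i + 1) * chunk) for i in range(workers)]
    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, bounds))
    print(total)  # same result as sum(range(1_000_000)), computed in pieces
```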
About 100 million cores will be needed to deliver an exaflop, but at such large scale, frequent hardware failures become inevitable. Application developers must embrace this and factor it into algorithms and software. At the same time, it is difficult to physically interconnect those millions of cores.
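One common way application developers factor failures into their software is checkpoint/restart: periodically saving state so a crashed job can resume where it left off rather than start over. A minimal sketch, with hypothetical file and function names (a real HPC code would checkpoint to a parallel file system):

```python
import os
import pickle

CHECKPOINT = "state.ckpt"  # hypothetical checkpoint file name

def save_checkpoint(state, path=CHECKPOINT):
    # Persist the computation's state so a restarted job can resume.
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_checkpoint(path=CHECKPOINT):
    # Resume from the last checkpoint if one exists, else start fresh.
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}

state = load_checkpoint()
for step in range(state["step"], 10):
    state["total"] += step          # stand-in for one unit of real work
    state["step"] = step + 1
    if step % 3 == 0:               # checkpoint periodically, not every step
        save_checkpoint(state)
print(state["total"])  # sum of 0..9 = 45, even if the job was interrupted
```

After a crash, rerunning the same script picks up from the last saved step instead of repeating completed work.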
Network reliability, latency and message size also need to be considered when designing the next generation of supercomputers. And even once these problems are solved, it is highly unlikely that a single architecture will suit all applications.
With so many hurdles in the way, exascale computing will need dedicated time, effort and money. Experts believe that we could see the first exascale systems by 2018 but warn that they will require an estimated $1 billion of investment.
"Exascale computing brings major challenges that need to be met on the hardware, software and programming model level," said Sebastian von Alfthan, at the CSC IT Center for Science in Finland. "But it also offers opportunities for groundbreaking research at an unprecedented scale."
-Manisha Lalloo, e-ScienceTalk