Last month, CERN played host to the 2nd Workshop on Energy for Sustainable Science at Research Infrastructures. Delegates discussed how various aspects of large-scale research infrastructures could be made more energy efficient, including the computing centers that underpin the science they carry out.
Computer scientists across the globe are working towards creating the first exascale supercomputers. However, energy consumption is likely to prove a major barrier to achieving this. Simon McIntosh-Smith of the University of Bristol, UK, reported at the workshop that the average power consumption of the top ten high-performance computing (HPC) systems in the Top500 list has increased fivefold over the last five years, while the average across the whole Top500 list has increased more than threefold over the same period. “This isn't very sustainable,” he says. “Nobody yet knows how to build a sustainable exaFLOPS supercomputer.”
“Getting the power isn't necessarily the problem,” explains McIntosh-Smith. “The problem is really the cost: not just in terms of euros or dollars, but also in terms of CO2.” He adds: “A 10MW system costs about $10m per year to run in terms of power costs.”
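McIntosh-Smith's $10m figure can be reproduced with back-of-envelope arithmetic; the sketch below assumes a typical industrial electricity price of around $0.11 per kWh, which is an illustrative assumption rather than a figure quoted at the workshop:

```python
# Back-of-envelope annual electricity cost for an HPC system.
# The $/kWh price is an assumed typical industrial rate, not a workshop figure.

def annual_power_cost(power_mw, price_per_kwh=0.11, hours_per_year=8760):
    """Yearly electricity bill in dollars for a machine drawing
    `power_mw` megawatts continuously."""
    energy_kwh = power_mw * 1000 * hours_per_year  # MW -> kWh over a year
    return energy_kwh * price_per_kwh

print(annual_power_cost(10))  # a 10 MW system: roughly $9.6m per year
```

At $0.11/kWh a 10 MW machine running around the clock consumes 87.6 GWh and costs about $9.6m a year, consistent with the ballpark McIntosh-Smith cites.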
McIntosh-Smith is also a member of Europe's Mont-Blanc project, which aims 'to produce a new type of computer architecture capable of setting future global HPC standards that will provide exascale performance using 15 to 30 times less energy'. He and his project colleagues believe that energy-efficient mobile processors could be the key to reducing power consumption in HPC.
During the workshop, McIntosh-Smith described a prototype ARM multicore cluster called Tibidabo, which was created at the Barcelona Supercomputing Center to demonstrate that it is possible to deploy a cluster of smartphone processors. He draws a parallel with the rise of microprocessors in HPC in the early nineties. “Microprocessors killed the vector supercomputers despite not being faster… but they were significantly cheaper and more energy efficient,” he says. “History may repeat itself with mobile processors.”
Another mini-revolution within HPC is the adoption of warm-water cooling technologies. This was discussed at the workshop by Bruno Michel of IBM, who explained how warm-water cooling technologies for HPC allow the water to be reused for heating buildings. While the cooling gradient is, of course, much less with warm water, Michel argues that the economic value of the heat has the potential to reduce the total costs of data center ownership by 50-70%. He cited the example of the SuperMUC supercomputer at the Leibniz Supercomputing Centre in Germany, from which 90% of the heat energy is available for reuse. Michel says that this technology doesn't just have the potential to be useful in cooler climates, but also in warmer regions of the globe, where the heat energy could be used for purposes other than heating, such as adsorption cooling or desalination.
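The economics Michel describes can be illustrated with a toy calculation. In the sketch below, only the 90% reuse fraction comes from the SuperMUC example; the IT load and the price paid for district heat are hypothetical assumptions:

```python
# Toy estimate of the annual value of reused data-center heat.
# The 0.9 reuse fraction is from the SuperMUC example quoted in the article;
# the 3 MW IT load and $30/MWh heat price are hypothetical assumptions.

def annual_heat_value(it_load_mw, reuse_fraction=0.9,
                      heat_price_per_mwh=30.0, hours_per_year=8760):
    """Dollar value of the waste heat recovered per year."""
    heat_mwh = it_load_mw * hours_per_year * reuse_fraction
    return heat_mwh * heat_price_per_mwh

# A hypothetical 3 MW machine recovering 90% of its heat:
print(annual_heat_value(3))  # about $0.7m of heat per year
```

Even under these modest assumptions the recovered heat is worth hundreds of thousands of dollars a year, which is the kind of offset behind Michel's 50-70% ownership-cost claim.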
Thomas Schulthess, director of the Swiss National Supercomputing Centre, also gave a presentation at the workshop. He discussed a new supercomputer called Piz Daint, which is cooled using water pumped through a 2.5-kilometer pipe from Lake Lugano. With the need for pre-coolers completely eliminated, the only energy used for cooling the machine is that expended to pump in the water. Schulthess also explained how a new purpose-built computer center to house Piz Daint has led to an energy-efficiency increase of 1.5 times over the previous building.

CERN's Wayne Salter also talked about energy-efficiency savings in relation to the buildings which house computing equipment. “With some fairly simple and straightforward changes, we've significantly increased our energy efficiency and reduced our energy bill,” says Salter. He explained how the strict separation of hot and cold air, achieved through the installation of enclosed cold aisles of racks, together with more extensive use of so-called free cooling and an increased allowable temperature range for the servers, has led to significant savings. “All servers are comfortably cooled all year round with much less need for chiller use,” he adds.

Helfried Burckhart, CERN's energy coordinator, reports that the CERN data centre is responsible for just 2% of the organization's 1.2-terawatt-hour total annual energy consumption. Ehsaan Farsimadan from Romonet agrees that the architecture of computer centers has a key role to play in achieving maximum energy efficiency. “Cooling should not be a significant overhead,” he says. “Correctly designed data centers can eliminate the need for cooling.”
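Burckhart's figures give a sense of the absolute scale involved; a quick sanity check, using only the numbers quoted in the article:

```python
# Absolute energy use of the CERN data centre, from the figures
# quoted by Helfried Burckhart: 2% of a 1.2 TWh annual total.

total_twh = 1.2           # CERN's total annual energy consumption
data_centre_share = 0.02  # 2% attributed to the data centre

data_centre_gwh = total_twh * 1000 * data_centre_share  # TWh -> GWh
print(data_centre_gwh)    # 24 GWh per year
```

Two percent sounds small, but 24 GWh a year is roughly the continuous draw of a 2.7 MW facility, which is why the building-level savings Salter describes still matter.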
At the start of his presentation, Farsimadan warned that the ICT industry is responsible for over 2% of global carbon dioxide emissions. Given the scale of the energy consumption involved, Michel argues that there is a pressing need for revolutionary change in the sector. He argues for a major shift away from measuring and incentivizing the absolute performance of supercomputers towards a metric of performance per Joule. Michel also advocates the stacking of chips, thereby abandoning the 2D constraints of Moore's law. This 3D integration would see main memory moved into the chip stack, with intricate interlayer cooling. Along with several other speakers at the workshop, Michel highlighted that improved memory proximity has the potential to deliver significant energy savings, with the 3D integration he outlines being one way of achieving this. “We are currently using 99% of energy for communication and only 1% for computation,” says Michel.
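The performance-per-Joule metric Michel advocates is the same quantity the Green500 list ranks machines by: sustained floating-point operations per second divided by power draw, since FLOPS per watt equals floating-point operations per Joule. A minimal sketch, with hypothetical machine figures chosen purely for illustration:

```python
# Performance per Joule, the metric used by e.g. the Green500 list:
# FLOPS / watts == floating-point operations per Joule.
# The machine figures below are hypothetical, for illustration only.

def gflops_per_watt(sustained_flops, power_watts):
    """FLOPS per watt (== FLOP per Joule), expressed in GFLOPS/W."""
    return sustained_flops / power_watts / 1e9

# A hypothetical 1 PFLOPS machine drawing 500 kW:
print(gflops_per_watt(1e15, 500e3))  # 2.0 GFLOPS/W
```

Under this metric, a machine that doubles its efficiency delivers the same science for half the Joules, which is exactly the incentive shift Michel is calling for.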