Feature - Indiana Universities Prepare for New Speed Record
Many people associate Indiana with high-speed car racing. But universities in Indiana are working to build a new reputation: one for high-speed, high-throughput computing.
As the world gears up for more speed records at the U.S. Formula 1 Grand Prix in Indianapolis this month, Indiana universities are gearing up to help set new records for handling data: they are part of an international collaboration preparing to manage the petabytes of data that will soon be generated by the largest physics experiment ever.
The experiment has a name only a scientist could love: the CMS project, which stands for Compact Muon Solenoid, after the large solenoid electromagnet at the heart of the detector.
The CMS is currently being constructed at CERN, the European Organization for Nuclear Research. CERN is the Tier-0 site of the Worldwide LHC Computing Grid (WLCG), from which all data produced by CMS will be distributed. The data will go to eleven Tier-1 sites, and then on to more than a hundred Tier-2 sites around the world. The WLCG is built on two major multi-science grids: Enabling Grids for E-sciencE (EGEE) and the Open Science Grid (OSG).
Indiana's Purdue University is one of seven U.S. universities acting as Tier-2 sites, and will receive its CMS data from the U.S. Tier-1 site at the Fermi National Accelerator Laboratory, outside Chicago.
With the clock ticking, university staff are now poised for the data-crunching dash of their lives.
"Like an exercise session getting you ready for the big game, we've been going to the physics gym," says Thomas Hacker, a research assistant professor in Purdue University's Discovery Park Cyber Center and also with Information Technology at Purdue. "Everyone is building and testing systems to make sure the computing infrastructure is ready when the detector comes online."
Frank Würthwein is professor of physics at the University of California, San Diego, another U.S. Tier-2 site. He says Tier-2 sites will play an integral role.
"The actual data analysis by physicists will take place at Tier-2 sites, so it's important that we can receive whatever data our physicists need," Würthwein says.
"We will take data from CERN and push it across the worldwide networks to these seven places. They will receive it, analyze it, the whole gimbang. Once we have the data in all these places, a physicist will be able to submit jobs from their office computer, or even from a laptop in Starbucks."
In tests so far, the CMS Tier-2 sites have been able to support up to 50,000 computing jobs per day, and the goal is to support 100,000 jobs per day by late spring.
Grid computing is essential to the success of the project, Hacker says. More than a hundred institutions are involved in the global effort, with the Open Science Grid providing the U.S. contribution, including much of the middleware used by the Tier-2 sites.
Purdue and UC-San Diego are the only two Tier-2 sites connected to the National Science Foundation's TeraGrid research network.
Indiana University is also preparing to set some high-speed data-crunching records, playing a key role in the ATLAS project, which, like the CMS project, aims to provide new insight into subatomic physics and the nature of matter.
"Together, the two state universities in Indiana are playing a key role in experimental physics," Hacker says. "Because of its science grid connections and computational resources, the state of Indiana is helping to lead the way at the frontier of science."
- Steve Tally, Purdue University