Feature: Meeting the Data Transfer Challenge
As part of a computing challenge held in 2006, two university groups used parts of two grid infrastructures to transfer a massive amount of data. Over a one-month period, 105 terabytes of data was transferred between Purdue University in Indiana and the University of California, San Diego, using parts of the Open Science Grid and TeraGrid infrastructures.
Both university groups have computing sites that are part of the worldwide computing infrastructure for the CMS particle physics experiment. The challenge was part of preparation for start-up of the experiment at the end of 2007.
While all U.S. sites participating in the CMS computing infrastructure are part of the Open Science Grid, Purdue and the San Diego Supercomputer Center at UCSD are also members of the TeraGrid. Working with their respective TeraGrid sites, the two university groups combined the OSG infrastructure with a dedicated link set up through the TeraGrid network to push the limits of their network use.
The group reached peak data transfer rates of 200 megabytes per second, with bursts of over four gigabits per second. The model for the CMS experiment calls for data to be passed from the CMS detector at CERN in Switzerland to a series of large computing sites around the world, and then on to centers such as Purdue and UCSD, called "Tier-2" centers.
"The data transfer routes for CMS don't really include moving data from one Tier-2 to another, but it's a great way for us to learn how to better use the 10-gigabit links," says Preston Smith from Purdue University. And experience and advances made by any one of the CMS computing centers is shared with other centers in CMS, OSG and the TeraGrid, to the benefit of the larger grid community.
The CMS Tier-2 centers in the United States and around the world still have work to do on their network infrastructure before they are ready to accept the large data rates expected when the experiment starts running, which could reach 100 megabytes per second. The eventual goal for the computing sites during 2007 is to sustain the use of more than 50% of their network capacity for an entire day. For the Purdue-UCSD network link, that would mean sustaining transfers at approximately four gigabits per second for one day.
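The article quotes rates in both megabytes per second and gigabits per second, which can be hard to compare at a glance. The short Python sketch below converts between the two and estimates the volumes involved. It assumes decimal (SI) units and a 30-day month, neither of which is stated in the article, and the function names are illustrative only.

```python
# Unit conversions for the figures quoted in the article.
# Assumptions (not from the article): SI units (1 MB = 10**6 bytes,
# 1 Gb = 10**9 bits, 1 TB = 10**12 bytes) and a 30-day month.
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 24 * 60 * 60


def mb_per_s_to_gbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 1e6 * BITS_PER_BYTE / 1e9


def gbit_per_s_to_mb_per_s(gbit_per_s: float) -> float:
    """Convert gigabits per second to megabytes per second."""
    return gbit_per_s * 1e9 / BITS_PER_BYTE / 1e6


def terabytes_moved(gbit_per_s: float, seconds: float) -> float:
    """Terabytes transferred at a constant rate over a given time."""
    return gbit_per_s * 1e9 / BITS_PER_BYTE * seconds / 1e12


def average_mb_per_s(terabytes: float, days: float) -> float:
    """Average rate in megabytes per second for a total volume over some days."""
    return terabytes * 1e12 / (days * SECONDS_PER_DAY) / 1e6


if __name__ == "__main__":
    print(f"200 MB/s peak          = {mb_per_s_to_gbit_per_s(200):.1f} Gbit/s")
    print(f"4 Gbit/s               = {gbit_per_s_to_mb_per_s(4):.0f} MB/s")
    print(f"4 Gbit/s for one day   = {terabytes_moved(4, SECONDS_PER_DAY):.1f} TB")
    print(f"105 TB over 30 days    = {average_mb_per_s(105, 30):.1f} MB/s average")
```

Under these assumptions, a full day at four gigabits per second would move roughly 43 terabytes, while the 105 terabytes moved during the month-long challenge corresponds to an average of about 40 megabytes per second.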
"We've shown that we can exceed any peak performance goal that we set for ourselves, so now I'd like to see more focus on sustained operations," says Frank Würthwein from UCSD. "I'm hoping in the near future we'll see a week of daily average transfers above 100 megabytes per second between UCSD and Purdue."
- Katie Yurkewicz, iSGTW Editor-in-Chief