iSGTW Feature - CMS data transfer

Feature - CMS readies network links for LHC data

CMS data transfer rate in MB/s during the data transfer debugging task, from September through November 2007.
Image courtesy of CMS.

Transfer 300 thousand million bytes (300 Gigabytes) per day for six out of seven consecutive days, and move a total of 2.3 Terabytes (2.3 million million bytes) during that same seven-day period: these are the criteria that each major network link in the Compact Muon Solenoid's computing structure must satisfy when the Large Hadron Collider turns on this summer. Together, the links will transfer tens of Terabytes a day.
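The two criteria interact: six days at exactly 300 Gigabytes add up to only 1.8 Terabytes, so a link must also exceed the daily minimum on some days to reach the 2.3 Terabyte weekly total. The sketch below illustrates one way such a check could be written. It is not the tool the CMS task force used; the function names, the list-of-daily-byte-counts input, and the reading of "six out of seven" as "at least six" are all assumptions made for illustration.

```python
from typing import Sequence

# Decimal units, matching the article's "thousand million bytes".
GB = 10**9

DAILY_MIN_BYTES = 300 * GB          # minimum volume for a "good" day
GOOD_DAYS_REQUIRED = 6              # good days needed per window (assumed "at least")
WINDOW_DAYS = 7                     # length of the evaluation window
WINDOW_TOTAL_MIN_BYTES = 2300 * GB  # 2.3 Terabytes over the whole window


def window_passes(daily_bytes: Sequence[int]) -> bool:
    """Check one seven-day window of per-day transfer volumes (hypothetical helper)."""
    if len(daily_bytes) != WINDOW_DAYS:
        raise ValueError(f"expected {WINDOW_DAYS} daily values, got {len(daily_bytes)}")
    good_days = sum(1 for volume in daily_bytes if volume >= DAILY_MIN_BYTES)
    total = sum(daily_bytes)
    return good_days >= GOOD_DAYS_REQUIRED and total >= WINDOW_TOTAL_MIN_BYTES


def link_meets_criteria(daily_bytes: Sequence[int]) -> bool:
    """Return True if any seven consecutive days in the history pass (hypothetical helper)."""
    last_start = len(daily_bytes) - WINDOW_DAYS
    return any(
        window_passes(daily_bytes[start:start + WINDOW_DAYS])
        for start in range(last_start + 1)
    )
```

Under these assumptions, a link that moves 400 Gigabytes on six days and nothing on the seventh passes (2.4 Terabytes in total), while a link that moves exactly 300 Gigabytes every day does not, because its weekly total is only 2.1 Terabytes.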

The CMS computing structure comprises an internationally distributed system of services and resources that interact over the network through the Worldwide LHC Computing Grid (WLCG).

The three-tiered architecture, fetchingly illustrated in a Flash animation from the GridCafe, functions as a single coherent system. Data collected at CERN, the top tier (dubbed Tier-0), is distributed to a set of seven Tier-1 sites in as many countries. Tier-1s host large-scale, centrally organized processing activities and in turn distribute data to smaller Tier-2 centers. The 45 Tier-2s provide capacity for analysis, calibration studies and simulation, and are available to the entire CMS collaboration through the grid.

Last July CMS assembled a task force to ready the inter- and intra-tier networks.

"As this task started, the networking infrastructure was largely in place, but most of the data transfer links didn't work," says James Letts, project scientist at the University of California, San Diego.

The task force ran a set of simulated datasets up the computing chain to CERN, and ran workflows based on these datasets back down to the Tier-1 and Tier-2 centers.

Thanks to a tool developed by Brian Bockelman of the University of Nebraska and CERN summer student Sander Sõnajalg, the team extracted transfer volume data from PhEDEx, CMS's data transfer middleware, and applied the stringent commissioning criteria to it.

"In the transfer tests in February 2008 we were able to more than double the throughput over all links seen in the 2007 CSA07 tests," said Letts. "At this point (May 2008) all the Tier-1 sites are stable and about two-thirds of the Tier-1 to Tier-2 connections have been commissioned."

CMS continues to push hard towards a stable and well-performing system for the start of data taking.

"The ability to move and analyze data is essential to any experiment, and so far the data transfer system in CMS seems to be up to the challenge," said Letts.

Tier-1 (red) and Tier-2 (blue) sites worldwide in CMS.
Image courtesy of James Letts

- Anne Heavey