Feature - Transferring FTP to the cloud: Off of desktop, out of mind

Kettimuthu's team chose the 10-terabyte data set shown in this image for the Bandwidth Challenge. The data, which comes from the World Climate Research Program Coupled Model Intercomparison Project, simulates temperature change at the Earth's surface and zonally averaged throughout the atmosphere from 1900 to 2100. Image courtesy of Rajkumar Kettimuthu.

Don't shut down. Don't reboot. Don't disconnect. And don't even think about closing the window. Securely and rapidly transferring large amounts of data with GridFTP comes with a lot of "don'ts."

That's why the Globus Alliance team led by Steve Tuecke, a researcher at Argonne National Laboratory and the University of Chicago, decided to create a hosted data movement service dubbed Globus.org.

Take a 10 terabyte transfer using GridFTP as an example. "On a typical network it takes about two days," said Rajkumar Kettimuthu, the Argonne National Laboratory researcher in charge of the GridFTP project. That's two days during which users can't shut down, stall, lose their internet connection, or accidentally close the GridFTP window without interrupting the transfer.
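To put that figure in perspective, the short Python sketch below works out the average throughput implied by moving 10 terabytes in two days. It assumes a decimal terabyte (10^12 bytes) and exactly 48 hours of uninterrupted transfer; the function name is made up for illustration and is not part of GridFTP or Globus.org.

```python
def implied_throughput_mbps(size_terabytes: float, duration_days: float) -> float:
    """Average throughput, in megabits per second, needed to move
    size_terabytes in duration_days.

    Assumptions (not from the article): 1 terabyte = 10**12 bytes,
    and the transfer runs without interruption for the whole period.
    """
    size_bits = size_terabytes * 1e12 * 8          # bytes -> bits
    duration_seconds = duration_days * 24 * 60 * 60
    return size_bits / duration_seconds / 1e6      # bits/s -> Mbit/s


if __name__ == "__main__":
    rate = implied_throughput_mbps(10, 2)
    print(f"{rate:.0f} Mbit/s")  # prints "463 Mbit/s"
```

Under those assumptions the transfer averages roughly 463 megabits per second, sustained around the clock for the full two days.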

Existing GridFTP clients aren't necessarily a simple one- or two-click install, either. "If you want the security on the data channel then you need to get the security certificates, which is not a trivial process," Kettimuthu said.

In contrast, with Globus.org users can transfer data between any two computers without having to install software or to run and monitor the transfer themselves. While the transfer is taking place they can shut down, reboot, disconnect, or close the connection to Globus.org without causing any problems. In short, they can go back to their business; when the transfer is complete, Globus.org will notify them by email.

This diagram attempts to visualize how Globus.org will function. Image courtesy of Globus.org.

"Users just want this to work simply, reliably, securely, and repeatably," Tuecke said. "The purpose of Globus.org is to provide an easy-to-use service to scientific users."

Preparations for the annual Bandwidth Challenge at Supercomputing 2009 offered the team a good opportunity to push forward the development of Globus.org. During the Bandwidth Challenge, teams from around the world competed to show how much data they could transmit from multiple locations to the conference in a fixed period of time.

In preparation for the challenge, the team copied their 10 terabyte data set, already stored in two other locations, to servers at Argonne. That transfer presented an excellent stress test opportunity for Globus.org, which the team used to performance-tune the software and fix problems.

"As a result of this effort," said Kettimuthu, "the team members have a better understanding of the infrastructure and configuration options. This understanding, coupled with the new features the team created, will help us make Globus.org robust when it goes into production this coming March and will benefit all scientists who use it."

-Miriam Boon, iSGTW. With additional reporting from Linda Vu, LBL, and Anne Heavey, Fermilab.