Jetstream makes HPC a breeze

Speed read
  • NSF adds cloud-based interface to the US cyberinfrastructure.
  • On-demand HPC access given to more scientists than ever before.
  • Virtualization enhances scientific reproducibility.

Forming a boundary between two large air masses, a jet stream is a rapid, high-altitude current that diffuses hot and cold air, guiding the winds that affect our daily weather below.

Crowd in the cloud. Cloud-based virtualization means more scientists will be able to bring their research into the high-performance computing era. Courtesy UITS at Indiana University.

This is the analogy behind Jetstream, a self-service cloud-based interface to US National Science Foundation (NSF) high-performance computing resources. Like its namesake, Jetstream will occupy the border between current and new users of NSF supercomputers.

“We're particularly focused on the ‘long tail of science’, the very large amount of data collected by lots of scientists doing field research, lab research, and scientific experiments,” says Craig Stewart, Jetstream principal investigator. “We aim to accelerate their research and improve their research education activities by several thousand people.”

Leader of the pack. Craig Stewart is principal investigator of the Jetstream project, an NSF-funded project providing cloud-based, on-demand access to the nation's strongest computers. Courtesy Indiana University.

Funded by the NSF, Jetstream results from a partnership between Indiana University’s Pervasive Technology Institute, the University of Texas at Austin’s Texas Advanced Computing Center (TACC), the Computation Institute at the University of Chicago, the iPlant Collaborative at the University of Arizona, and the University of Texas, San Antonio.

To broaden access to HPC resources, Jetstream will run much like a traditional cloud facility, geographically distributed across the US and available 24/7/365. Web-based access means researchers who have reached the point of needing supercomputing but lack technical sophistication will find an easy route to the massive compute power that scientific discovery now demands.

Managed through the XSEDE Resource Allocation System, the computing environment will consist of one cluster at Indiana University and one cluster at TACC, with a test environment at the University of Arizona. The system will provide over half a petaFLOPS of computational capacity and 2 petabytes of block and object storage. Each node will contain two Intel ‘Haswell’ processors, 128 GB of RAM, 2 terabytes of local storage, and 10 gigabit Ethernet networking. The system will use 40 gigabit Ethernet for network aggregation, and each of the production clusters will connect to Internet2 at 100 Gbps. Geographic distribution provides redundancy and resilience if one of the sites becomes inoperable.

Covering the field. Jetstream builds in resilience through geographic distribution. The computing environment consists of one cluster at Indiana University and one cluster at the Texas Advanced Computing Center (TACC) at the University of Texas, with a test environment at the University of Arizona. Courtesy Indiana University.
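To put the quoted hardware figures in perspective, here is a rough back-of-the-envelope sketch in Python. It is an illustration only, not part of the Jetstream project: it assumes decimal units, treats the 2 petabytes as a single pool, and assumes a link running at its full rated speed with no protocol overhead. The function name `transfer_hours` is made up for this example.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the article.
# Assumptions (not from the article): decimal units (1 PB = 10**15 bytes,
# 1 Gbps = 10**9 bits/s), the 2 PB treated as one pool, and links running
# at full line rate with no protocol overhead.

STORAGE_PETABYTES = 2      # block and object storage
INTERNET2_GBPS = 100       # uplink from each production cluster to Internet2
NODE_LOCAL_TERABYTES = 2   # local storage per node
NODE_LINK_GBPS = 10        # Ethernet link per node


def transfer_hours(size_bytes: float, link_gbps: float) -> float:
    """Ideal time in hours to move size_bytes over a link of link_gbps."""
    bits = size_bytes * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


full_store_hours = transfer_hours(STORAGE_PETABYTES * 1e15, INTERNET2_GBPS)
node_disk_hours = transfer_hours(NODE_LOCAL_TERABYTES * 1e12, NODE_LINK_GBPS)

print(f"2 PB over 100 Gbps: about {full_store_hours:.1f} hours")
print(f"2 TB over 10 Gbps: about {node_disk_hours:.2f} hours")
```

Under these idealized assumptions, moving the full 2 petabytes between sites would take a little under two days, and emptying one node's local disk would take under half an hour. Real transfers would be slower.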

“The most important part of Jetstream is its usability,” says Matt Vaughn, Jetstream co-principal investigator and director of Life Sciences Computing at TACC. “Researchers will be able to log in to the system, choose a virtual machine (VM) image that has the software and the environment they need, launch that virtual machine and be up and productive, analyzing and interpreting their data in a matter of minutes.”

Jetstream’s VM capability will also allow for the creation of an archive of analyses, enabling a level of reproducibility previously underdeveloped in HPC-powered science. “With Jetstream, now you can capture the exact environment necessary and reproduce the analysis that may support a publication,” says James Taylor, Ralph O’Connor associate professor of biology and computer science at Johns Hopkins University. “We’ll be able to give a digital object identification (DOI) number to VMs so that they can become citable. This will take reproducibility of computationally intensive science to a whole new level.”

A new wind is streaming across the US scientific landscape. As the cloud-based component in the US cyberinfrastructure, Jetstream promises to democratize access to NSF-managed compute resources. By bringing supercomputing to more scientists than ever, the NSF aims to accelerate the pace of discovery.


Copyright © 2017 Science Node ™
