- Researchers in computer science and education want to use the cloud, but may find barriers to entry
- CloudBank portal combines cloud time with account management, outreach, and training
- First NSF-funded public-private partnership for cloud use is pilot project for future collaborations
Maybe you’ve heard that the cloud is a convenient way to boost your computing resources and shorten time to results. But what if you’ve never used the cloud before?
You might wonder how exactly to go about it. How to choose between different services. How to write cloud resources into a grant proposal. And once you’re in the cloud, how do you track usage and control costs?
“The cloud is often touted as ‘easy’ to use, but there’s a lot of complexity,” says Shawn Strande, deputy director of the San Diego Supercomputer Center (SDSC). “There are a lot of options and choices and toolsets. So while it’s easy, it’s also sometimes difficult to navigate.”
CloudBank, a new public-private partnership funded by the National Science Foundation (NSF), hopes to smooth that navigation. Accessible through a portal that provides a suite of services to simplify cloud access for research and education, CloudBank will help researchers take full advantage of cloud resources. The project focuses initially on computer science and education; as it progresses, CloudBank will explore expanding to the domain sciences.
Easing the pain
“The portal provides an on-ramp to help researchers overcome challenges such as managing cost, translating and upgrading computing environments to the cloud, and learning about cloud-based technologies,” says Michael Norman, SDSC director and PI for the project.
Researchers will initially have access to Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure through CloudBank, though others may be added later. But the bulk of the NSF’s $5 million grant supports developing the portal and providing outreach and training to help researchers get the most out of the cloud.
The public cloud offers an array of different compute resources, data, software, and networking options, with a myriad of charges for each option. Multiply that by the number of cloud providers in the market, and it’s just not easy to make sense of, says Strande.
Once a project is running in the cloud, a researcher still needs to understand what they’re being charged for, how to optimize workflows to avoid overcharges, and how to keep track of it all.
That’s really different from your typical campus HPC or data center that has a fairly narrow range of things you can do. Now multiply that by hundreds for the complexity of the cloud and there are just so many ways for PIs to get into difficulty. ~ Shawn Strande, SDSC
CloudBank is a collaboration among the University of California San Diego (UC San Diego), the University of Washington’s (UW) eScience Institute, and UC Berkeley’s Division of Data Science and Information. Each institution is responsible for an area that plays to its strengths.
SDSC is building the online portal, and UC San Diego will handle management of the cloud accounts.
“We’re handling all of the minutiae of how money moves around between NSF, the cloud providers, and the PIs,” says Shava Smallen, an SDSC programmer/analyst and CloudBank co-PI. “Once their accounts are set up, a research team can use the CloudBank portal to look at their burn rates, track and monitor their use of the resource, avoid unexpected charges, and ultimately ensure the most efficient use of their cloud allocation.”
By leveraging the University of California’s existing discounts with cloud providers and combining that with bulk buys of cloud time through Strategic Blue, a company that optimizes procurement of cloud resources, CloudBank expects to realize significant cost savings.
Those savings result in ‘free energy’ that can come back to CloudBank and be used for more research, and more education, outreach, and training, according to Strande. “If through these purchase agreements we get more resources than the dollar value on the open market, it’s going to come back to benefit CloudBank and the researchers that are using it.”
But before those benefits can be realized, researchers have to understand their options. Ed Lazowska, founding director of the UW eScience Institute, is concerned that despite the suitability of the cloud for research computing, adoption is still behind where it should be.
Increasingly, the public cloud is the right place for research computing: it’s infinitely scalable, it evolves continually in both hardware and software capabilities, collaboration and sharing are easy, and there are no capital, space, power, cooling, or operating costs. But adoption has lagged. ~ Ed Lazowska, University of Washington
That’s why UW is taking on the task of training and outreach. They will be explaining the benefits of the cloud for different kinds of research and helping to clarify what resources are available.
One way to do that is to get tomorrow’s researchers into the cloud today. David Culler of the Berkeley Institute for Data Science will lead the development of tools for education using the cloud.
All of the participants are excited about the possibilities of this new way of working and doing research. Interest in cloud services is accelerating, and CloudBank is the first project using the public cloud for academic research in a coordinated way.
“We’ve certainly seen a significant increase in researchers using cloud; it’s been doubling over the last couple of years,” says Vince Kellen, CIO for UC San Diego. “And that’s still early stages. We expect continued growth for at least the next 5-6 years.”
CloudBank is a 5-year project that kicked off on August 1, 2019. The first year is all about development of the portal and the workflows associated with processing proposals, allocating awards, working with the cloud providers, streamlining the accounting, and developing the tools and outreach materials to help PIs.
In a little less than one year, the portal will be open to early users. The project also offers an opportunity to study a new model of public-private partnership in scientific enterprise.
“We’re trying to lower the barrier to entry,” says Smallen. “There’s just tons of software and hardware available on the cloud, and CloudBank will make it easier to get on there and start using it in their research. That’s about broadening impact.”