Shawn Brown joined the Pittsburgh Supercomputing Center (PSC) as its new director at the end of 2019. He has a background in computational chemistry and previously worked in public health, modeling responses during the H1N1 outbreak in 2009. Additional experience with biostatistics, neuroinformatics, and large-scale data sharing put him in a unique position for the current moment.
We caught up with him last week to find out more about his vision for the PSC and how its resources are being redirected to contribute to the global coronavirus response.
You’ve only had a few months to get settled in your new role at the PSC before a global pandemic hit and everything changed—and we’ll get to that in a minute. But what was your original vision for PSC?
We're not a center that's necessarily focused on putting down the biggest and the fastest supercomputer. Instead, we're trying to field technologies and platforms that bring together collaborators. We want to be a place that gets science done, that brings together scientists from all walks of life to be able to work together in a common data and computational environment to solve really grand challenge problems.
My vision is that the PSC is the place to come to collaborate in computational science. That's been informed by my previous public health work, where we were building models to look at disease spread. It required us to talk not just to computer programmers, but to clinicians, virologists, immunologists, computer scientists, geographers, and logisticians. A whole bevy of different people have to come together to build these kinds of models. I want the PSC to be that place where people come together to solve the really challenging problems that are facing us today.
It’s not always easy to get people to work together smoothly. How are you encouraging that?
From the technical side, it's about building platforms that remove the barriers to using high-performance and advanced research computing by multiple communities. Instead of focusing only on the traditional communities that know how to use the command line and run code, it's about focusing on the other communities out there and the new ways of doing computational science, like the burgeoning artificial intelligence community. We're bringing in new modalities like interactive computing notebooks and web-based platforms that make sense out of the complexity of computational data. We're also bringing data to the scientists by putting it in common repositories that make it very accessible.
On the sociological side, it's about finding those groups of scientists that want to collaborate, but also making it a much more open environment. That means working on data sharing, making data findable, accessible, interoperable, and reusable, and building platforms that help people discover and share new science and new analytic techniques.
At PSC we’ve always focused on the human support component of high-performance computing. We've never believed that if you just put down the hardware, they will come. We've always had a very strong consultation staff that has been working really closely with researchers to accomplish their science, and that's really, really important.
Now, speaking of “really challenging problems” the PSC wants to help solve, we seem to find ourselves in the middle of one right now. What does high-performance computing (HPC) bring to the solution?
HPC is a unique resource that allows us to do detailed modeling in real time that we've never been able to do before. Combine that with the rise of AI, machine learning, and the availability of data, and we have new techniques for forecasting and understanding where the disease is and where it's going.
Back when I was doing this during H1N1, getting access to a GPU was not something you did. Now, there are GPUs all over the place, and they're accelerating the AI work, as well as things like molecular dynamics and drug screening. The ability for researchers to get access to large amounts of computing can help rapidly target these drugs and help us find therapeutics much, much, much more quickly than we could before.
What are some specific COVID-19 projects PSC is working on? Have you had to make changes to your normal workflows to accommodate them?
We were asked to help a group led by Madhav Marathe at the University of Virginia. They’re doing large-scale agent-based modeling of all the states in the US every night, informing federal agencies about the spread of COVID-19, healthcare resource requirements, and the effectiveness of social distancing policies. That came through an interaction with some federal organizations, then funneled through XSEDE. We’ve been working with them for about a month now, doing those simulations.
The big thing that we've done is establish reservations and optimize job submissions so that they could run every night, updating their data, getting their models running in an efficient way on the resources so that they could get a full suite of runs and sensitivity analysis. They have to complete a lot of computing overnight, from about 10:00 PM to about 10:00 AM. We need to fit that all in there so that we're not totally disrupting the rest of the user community.
We're also a part of the COVID-19 HPC Consortium. From that we got a project looking at drug screening techniques for antivirals. The researcher, Olexandr Isayev at Carnegie Mellon, has unique AI approaches that he's combining with computational chemistry to accelerate quantum mechanics calculations by six orders of magnitude. This raises the possibility of broadly applying quantum mechanical scoring to real-world drug discovery pipelines. He's using that to create design pipelines that will enable more accurate screening of compounds that may be potential candidates for antivirals.
We're also working on putting together datasets and data sharing. We've made available a dataset from Wuhan, China, of genomic and proteomic sequences from coronavirus patients. We're trying to make sure that we can position datasets that people might use for research so that they're there and available on Bridges and other platforms in our center.
All of these researchers have a lot going on, and our job is to make sure they can get their work done with less fuss and muss. We want to make sure they can get their research done so that they can go out and make sure that the government and organizations are doing what they need to do to combat the disease.
HPC is often the invisible workhorse in the background, crunching numbers and running simulations but often without getting a lot of public recognition. Has the coronavirus response changed that?
I would say so. It's always been important, but the scale at which we're now working and the rise of AI algorithms and machine and deep learning present new frontiers for us to be able to help combat the disease in ways we've never been able to before.
But what's really catalyzed that, too, is the availability of data. There are many different data-sharing initiatives out there that are getting it into the open for all sorts of researchers to use. Everybody's looking for an answer, and these models are exactly what you need. Being able to explore the 'what if' is exactly why they were developed. I think that's really helped to make computing front and center and necessary for the international response to this disease.
One thing that I've noticed in this pandemic versus the last pandemic of H1N1 is that computational modeling is all over the news, and it is clear that decision makers all the way up to the President are using those results to inform policy and decisions.
Has anything else changed since H1N1 in 2009? Eleven years is a long time in the tech world.
HPC, or what I would call by the more appropriate term advanced research computing, is much more data-centric now. Back then, we were at the advent of data-centric computing; now it's in full force. The key to understanding where the disease is and what we need to do is all data-driven. So, we're in a much better position now than we were then to do large-scale, complex analysis that helps decision makers see where we are and what decisions need to be made.
Do you personally feel encouraged to have all these computational resources and consortiums throwing their compute power into the fray?
Absolutely, I'm very encouraged. Seeing the response of the community mobilizing and making sure that the resources are dedicated to doing this kind of work is very, very heartening. That does give me hope that we will have better solutions quicker than we would if everybody was trying to scramble and do this independently.
We have the attention of all the major players in high-performance computing, both on-premises and in the cloud. They're not just dedicating cycles or storage but dedicating people as well, to help researchers scale up their science as much as possible and get it done as quickly as possible to inform the response effort. So, yes, I think it's wonderful. I greatly applaud it.