Blocking COVID-19 infection with AI

Speed read
  • We’re still learning how SARS-CoV-2 interacts with cells in the body
  • Understanding how the virus enters our cells requires modeling and simulation
  • AI and machine learning are vital in efficiently creating these tools

Rick Stevens, associate laboratory director at Argonne National Laboratory, gave a presentation at ISC High Performance 2020 about how high-performance computing (HPC) is being used to fight back against the SARS-CoV-2 virus. In particular, he discussed how machine learning and artificial intelligence (AI) are instrumental to these studies.  

With over 25 million COVID-19 cases worldwide, and rising every day, scientists continue to search for molecules with the best chance of preventing infection. Courtesy MTA New York. (CC BY 2.0, https://creativecommons.org/licenses/by/2.0/)

At this point, we’ve heard just about every COVID-19 infection story there is. A woman in China took a single elevator ride that resulted in 71 infections. Some evidence suggests that as few as 10% of infected people cause 80% of new infections. There’s no doubt that this virus can spread quickly and efficiently, but we’re still figuring out how it works and how we can stop it.

One way to learn about the SARS-CoV-2 virus is through computational modeling and simulation. Understanding how the virus interacts with human host cells will allow us to discover viable treatments.

Working as part of the COVID-19 HPC Consortium, Stevens and other scientists around the world are trying to model the virus as accurately and as quickly as possible—and they’re counting on artificial intelligence to give them an edge.

A key in a lock

According to Stevens, one goal of both simulation and modeling is to understand how SARS-CoV-2 infects the human host. Our cells carry a surface receptor called angiotensin-converting enzyme 2 (ACE2), which normally helps lower blood pressure. But the ACE2 receptor also serves as an entry point for some coronaviruses, including SARS-CoV-2. Scientists want to model that entry.

Protein spikes on the SARS-CoV-2 virus attach to human cells via the ACE2 receptor. If scientists can model the whole virus and its interaction with ACE2, they can then develop therapies for blocking infection. Courtesy David Orr.

Picture what happens at the cellular level: SARS-CoV-2 enters your body and finds itself near an ACE2 receptor on one of your cells. A spike protein wriggling on the exterior of the virus finds the ACE2 receptor and connects with it like a key opening a lock.

If scientists can model the whole virus and its interaction with the ACE2 receptor, they can then develop therapies for blocking infection. In the terms of our analogy, they want to learn how to stick some gum on that wriggling spike protein key so it no longer fits in the ACE2 receptor lock.

“The questions you’re trying to answer with these methods are: Do these proteins have multiple states? Are the states stable? And are they good states for drugging?” says Stevens. “That is, can you get small molecules that can bind to these proteins in these different states and basically shut down their functions.”

Traditional modeling and simulation on even the fastest supercomputer in the world would take too much time. And with over 25.2 million COVID-19 cases worldwide and growing, time is something we don’t have. But AI and machine learning can help make giant leaps in our understanding in a much shorter timeframe.

Outsourcing the thinking

Imagine AI as the means of having a computer do your thinking for you. Stevens explains that there are about 4 billion molecules that could possibly be used to gum up the SARS-CoV-2 spike protein’s key. For each of those molecules, there are 100 possible binding sites on the protein to try. Simulating each of those 400 billion options individually would take too much time.
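The arithmetic behind that estimate is easy to check. The short Python sketch below multiplies the two figures Stevens quotes and then, purely for illustration, converts the total into wall-clock time. The one-second cost per simulation and the 10,000 parallel workers are assumed values, not numbers from the talk.

```python
SECONDS_PER_DAY = 86_400


def search_space(molecules: int, sites_per_molecule: int) -> int:
    """Number of molecule/binding-site pairs that would need evaluating."""
    return molecules * sites_per_molecule


def brute_force_days(pairs: int, seconds_per_pair: float, workers: int) -> float:
    """Wall-clock days to simulate every pair, assuming perfect parallel scaling."""
    return pairs * seconds_per_pair / workers / SECONDS_PER_DAY


# Figures quoted by Stevens.
total_pairs = search_space(molecules=4_000_000_000, sites_per_molecule=100)

# Hypothetical cost and machine size, chosen only to make the scale concrete.
days = brute_force_days(total_pairs, seconds_per_pair=1.0, workers=10_000)

print(f"{total_pairs:,} molecule/site pairs")
print(f"about {days:.0f} days under the assumed cost")
```

Under these assumptions the exhaustive approach would run for more than a year, which is the kind of delay a cheaper stand-in is meant to avoid.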

The more efficient alternative is to train an AI to do all that work for you.

“You take a sample of a million of these small molecules, and you do a molecular interaction-level simulation of how the protein might bind to that molecule. And then you use that data to train an AI,” says Stevens. “So we build something like 100 AIs, each trained on a subset of the larger data and how well the model predicts it docks, and then use those AI models to search that much larger space.”
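A toy version of that workflow can be sketched in Python. Everything here is a stand-in: `simulate_docking` is a made-up scoring function rather than a real docking code, molecules are random two-number feature vectors, and each "AI" is a simple nearest-neighbour predictor rather than whatever models the consortium trained. The sizes (200 simulated molecules, 5 models, a pool of 1,000) are likewise hypothetical and scaled far down from the figures in the quote.

```python
import random
from statistics import mean


def simulate_docking(molecule):
    """Hypothetical stand-in for an expensive docking run (lower score = better fit)."""
    x, y = molecule
    return (x - 0.3) ** 2 + (y - 0.7) ** 2


def train_member(labelled_subset, k=3):
    """Return a predictor that averages the scores of the k nearest simulated molecules."""
    def predict(molecule):
        nearest = sorted(
            labelled_subset,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], molecule)),
        )[:k]
        return mean(score for _, score in nearest)
    return predict


def build_ensemble(labelled, n_models, subset_size, rng):
    """Train each ensemble member on its own random subset of the simulated data."""
    return [train_member(rng.sample(labelled, subset_size)) for _ in range(n_models)]


def screen(pool, ensemble, top_n):
    """Rank an unsimulated pool by the ensemble's mean prediction and keep the best."""
    ranked = sorted(pool, key=lambda m: mean(member(m) for member in ensemble))
    return ranked[:top_n]


rng = random.Random(0)

# Step 1: run the expensive simulation on a small sample only.
sample = [(rng.random(), rng.random()) for _ in range(200)]
labelled = [(m, simulate_docking(m)) for m in sample]

# Step 2: train several cheap models, each on a subset of that sample.
ensemble = build_ensemble(labelled, n_models=5, subset_size=100, rng=rng)

# Step 3: use the models to search a much larger, unsimulated pool.
pool = [(rng.random(), rng.random()) for _ in range(1000)]
shortlist = screen(pool, ensemble, top_n=20)

print(f"shortlisted {len(shortlist)} of {len(pool)} candidates")
```

The point of the design is that the expensive function is called only on the small sample. The large pool is ranked by the cheap predictors, and only the shortlist would go on to full simulation.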

For the AI to learn to analyze this dataset effectively, it needs ways to represent the molecules. The consortium is currently working with three main types of representation: graph, descriptors, and image.

  • Graph: Sometimes known as a fingerprint, this is an abstract pattern that represents the structural formula of a molecule.
  • Descriptors: Numerical properties of a molecule. Each molecule has about 5,000 specific descriptors to work with.
  • Image: Computationally derived images of the molecule. Like an experienced chemist, computer vision methods are extremely good at recognizing aspects of a molecule that affect its binding.
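To make the idea of a representation concrete, here is a deliberately crude Python sketch of the first two types. It assumes molecules are written as SMILES strings, a common text notation that the article does not mention, and it works on the raw text rather than on real chemistry: the "fingerprint" hashes short substrings into bit positions, and the "descriptors" are simple character counts. The function names are hypothetical, the consortium's actual features are not described in the source, and the image representation is not covered.

```python
import zlib


def toy_fingerprint(smiles, n_bits=1024, max_len=3):
    """Hash every substring of up to max_len characters into a set of bit positions."""
    bits = set()
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            bits.add(zlib.crc32(fragment.encode("ascii")) % n_bits)
    return frozenset(bits)


def toy_descriptors(smiles):
    """A few numerical properties; character counts miscount two-letter elements such as Cl."""
    lowered = smiles.lower()
    return {
        "length": len(smiles),
        "carbon": lowered.count("c"),
        "nitrogen": lowered.count("n"),
        "oxygen": lowered.count("o"),
    }


def tanimoto(a, b):
    """Share of set bits that two fingerprints have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


ethanol, propanol, benzene = "CCO", "CCCO", "c1ccccc1"

print(toy_descriptors(ethanol))
print("ethanol vs propanol:", tanimoto(toy_fingerprint(ethanol), toy_fingerprint(propanol)))
print("ethanol vs benzene: ", tanimoto(toy_fingerprint(ethanol), toy_fingerprint(benzene)))
```

Even this crude encoding places the two alcohols closer to each other than to benzene, which is the property a learning model relies on: similar molecules should get similar numbers.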

Of course, using AI in this way is a lot easier said than done. Stevens and his colleagues have been working on this since the early spring of 2020. The database they’ve created already contains nearly 80 terabytes of data—and that’s not everything they wanted to include.

But they are making progress. Stevens reports that the AI programs can now search 4 billion molecules for a given target in just two days, clearing the way for the team to move on to the more laborious docking simulations. Once that step is complete, they will move on to free energy simulations at the atomic level, where they can get a much more accurate understanding of potential treatments.

When asked how he’s been dealing with researching a virus that has swallowed large portions of everyday life, Stevens is both realistic and optimistic.

“Well, it’s exhausting,” says Stevens. “But it’s also giving me a reason to get up every day. I’ve been probably working 12, 14, 15-hour days on this ever since the end of February. We’ve got hundreds of people working on this across the country and in Europe and other places, and that really makes it easier to bear. It’s good to be able to contribute to something that’s causing so much trouble.”

Copyright © 2020 Science Node ™
