- Genomic analysis of SARS-CoV-2 virus reveals timeline for Arizona’s infection
- First case was successfully contained, but other strains later broke out
- Many of Arizona's early cases can be traced back to Washington state, New York, and Europe
Can you remember January of this year? It was a time when large gatherings were fun, “social distancing” was an unfamiliar term, and seeing someone with a mask on in a bank was terror-inducing.
It was also when Arizona had its first confirmed case of SARS-CoV-2 infection.
That early case may not be a big surprise if you know how severely COVID-19 hit Arizona this summer, but you’d be wrong to connect the two events. That first case was contained, and virologist Dr. Efrem Lim says that it spawned no known infections.
An assistant professor at Arizona State University’s (ASU) Biodesign Center for Fundamental and Applied Microbiomics, Lim worked with a team of researchers to conduct a genomic study of the SARS-CoV-2 strains present in Arizona. They then created a timeline for the state’s infections and found that the explosion in COVID cases can be traced back to out-of-state incursions of the virus.
Early successes and early failures
Arizona played host to the fourth known case of COVID-19 in the United States. The anonymous patient was a member of the ASU community who had recently visited China’s Hubei province. They were aware of the potential for infection and knew to isolate themselves and inform authorities when they showed symptoms. Lim has found no evidence that this patient spread the virus to any other person.
Lim can be sure thanks to a genomic study of the different strains of SARS-CoV-2 that spread in Arizona. The researchers sequenced 79 nearly complete SARS-CoV-2 genomes collected from patients across the state over a 28-day period between March 5 and April 2, 2020.
By studying SARS-CoV-2’s RNA, the scientists were able to create a phylogenetic tree that traced different strains back to their region of origin. That let them nail down the timeframe for Arizona’s explosion of COVID cases to, at the earliest, sometime in early-to-mid February.
Their phylogenetic study examines genomic bases. The bases of any RNA strand are adenine, guanine, cytosine, and uracil, more commonly abbreviated as A, G, C, and U. Tracing differences in these bases from one genome to the next allows us to better understand what each unique strain looks like and, hopefully, where it comes from.
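To make the idea of tracing base differences concrete, here is a minimal sketch in Python. It is not the team’s pipeline: the sequences are short, made-up fragments, the names `reference`, `isolates`, and `base_differences` are hypothetical, and it assumes the sequences are already aligned to the same length.

```python
def base_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (position, base_a, base_b) for each site where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Positions are reported 1-based, as is customary for genome coordinates.
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Toy RNA fragments invented for illustration; real genomes are ~29,900 bases long.
reference = "AUGGCUACGUUA"
isolates = {
    "isolate_1": "AUGGCUACGUUA",  # identical to the reference
    "isolate_2": "AUGGCUACGUCA",  # one changed base
    "isolate_3": "AUGACUACGUCA",  # shares isolate_2's change, plus one more
}

for name, seq in isolates.items():
    diffs = base_differences(reference, seq)
    print(name, len(diffs), diffs)
```

In this toy data, isolate_3 carries the same change as isolate_2 plus one of its own, which hints that the two are related. Shared differences of this kind are the raw signal that a phylogenetic tree is built from.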
“We could trace them to three major areas,” says Lim. “One [strain] was from Washington state…one of the early outbreak sites in the United States. Another from New York, which was around March or so. A lot of Arizona isolates came from that, likely associated with travel from New York. A third source was from Europe.”
SARS-CoV-2 is a completely new virus, and there’s still a lot to learn about it. Something that puzzles scientists is why its genome is so large. Twenty-nine thousand bases is a big genome for an RNA virus, and we aren’t entirely sure how that happened.
Lim states that SARS-CoV-2 has a protein that acts as a proofreader that can “edit” out mutations during replication. This lack of “mistakes” might be the reason why SARS-CoV-2 can tolerate a larger genome, but it’s too soon to be certain.
One thing we do know is that this huge genome makes studying the virus difficult. As Lim points out, high-performance computing (HPC) was absolutely crucial in the team’s work.
“One virus has 29,900 base pairs of the genetic material,” says Lim. “Each base is an information datapoint. Now multiply that by the thousands of genomes of these viruses around the world. You can tell quickly that is a big computational problem.
“We absolutely had to go to our high-performance computing cluster here at ASU to give us the bandwidth to take all that information and figure out the evolution of where the virus has been coming from,” he says. “You could do that on your MacBook, I guess. But with that one computing chip it’s going to take you hundreds of thousands of hours.”
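The arithmetic behind that scale can be sketched in a few lines of Python. The genome length is the figure Lim quotes; the genome count of 5,000 is a hypothetical stand-in for “thousands of genomes,” and comparing every genome against every other is a deliberately naive illustration of how the work grows, not a description of the method the team used.

```python
from math import comb

BASES_PER_GENOME = 29_900  # figure quoted by Lim
n_genomes = 5_000          # assumption: stand-in for "thousands of genomes"

# Every base in every genome is one datapoint.
datapoints = BASES_PER_GENOME * n_genomes

# Naive all-against-all comparison: the number of genome pairs grows quadratically.
genome_pairs = comb(n_genomes, 2)
base_comparisons = genome_pairs * BASES_PER_GENOME

print(f"datapoints:       {datapoints:,}")
print(f"genome pairs:     {genome_pairs:,}")
print(f"base comparisons: {base_comparisons:,}")
```

Doubling the number of genomes roughly quadruples the number of pairs, which is one reason a problem like this outgrows a single laptop so quickly.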
This is an important project for Lim, who was in Singapore in 2003 when the first SARS outbreak hit the world. Back then, researchers didn’t have the sequencing technologies that we have today, and it took many months before experts even knew why people were getting sick.
“Personally, I've lived through the 2003 outbreak and I knew how terrifying and devastating that was,” says Lim. “That’s why I moved my lab towards this direction. We had the expertise, we've been sequencing viruses, discovering new viruses, and studying the evolution of viruses for a long time. Everyone had their own projects but they put them aside and stepped up. We said, ‘Here is an important problem, and we can contribute to this.’”