- Computational systems biologist Dan Jacobson thrives on interdisciplinary collaboration
- Explainable AI helps scientists understand the interactions captured by their algorithms
- Titan and Summit supercomputers contribute to research in bioenergy and the genetic factors in human health
Dan Jacobson is illuminating the workings of biological systems from the molecular scale up. He leverages Oak Ridge National Laboratory’s (ORNL) supercomputing resources to create deep-learning techniques that are more easily understood by humans, part of an evolving field called explainable artificial intelligence (AI).
Understanding how complex interactions between genes and proteins in cells influence an organism’s traits and behaviors drives Jacobson’s work as a computational systems biologist. His team integrates big data, supercomputing, mathematics, statistics, and biology to study questions in bioenergy, microbial systems, neuroscience, and precision medicine.
Jacobson’s work in bioenergy builds on a long-standing research initiative focused on Populus, a fast-growing perennial tree that shows promise as a low-cost, renewable feedstock for bioenergy. A large team of researchers collected data on 28 million genetic variations in Populus as part of the BioEnergy Science Center, with the work now continuing at the Center for Bioenergy Innovation at ORNL.
With this wealth of data and a grant from the Department of Energy’s (DOE) Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, Jacobson and collaborators are creating an unprecedented view of the 3D interactions among components of the cellular machinery in poplar.
“We’re taking these very large and disparate data sets and integrating them into a whole that nobody really projected,” said Jacobson.
Predicting how genes, proteins, and small molecules interact inside a cell requires new approaches to mining big data with high-performance computing. Using explainable AI, researchers are starting to uncover the high-order interactions that their algorithms capture, making the results understandable to humans.
Jacobson uses explainable AI to discover molecular interactions in biological systems that lead to the emergent properties of the organism (physical characteristics, diseases, etc.), but he notes that the method can be adapted and applied to other projects.
“No one complains when you poke a tree,” says Jacobson. “You can learn a lot developing algorithms for a better understanding of plants for bioenergy. We can pivot and apply that knowledge toward projects that are going to positively impact human health. The algorithm doesn’t care about the species.”
Big data, big science
Examining how genetics affects health is the aim of a collaborative project Jacobson is contributing to with support from the US Department of Veterans Affairs and DOE. The project combines genetic, clinical, and lifestyle data with the ultimate goal of predicting and diagnosing diseases and tailoring treatments for individuals.
As one of the largest clinical genomics projects in the world, the initiative challenges scientists to use high-performance computing and health data from millions of veterans to understand the complex genetic underpinnings that affect medical disorders, drug interactions, drug specificity, and individuals’ responses to pharmaceuticals.
Jacobson’s interest in the veterans project is both professional and personal: his grandfather, father, brother, and nephew have served in the US armed forces.
Collaborative networks sharing data are essential to modern, large-scale biology, and Jacobson pulls data from a wide variety of sources around the globe.
Jacobson has also been selected to lead an Early Science project on Summit, ORNL’s new supercomputer, focused on human systems biology and drug discovery.
It was the big data housed at ORNL, combined with some of the fastest supercomputers in the world, that drew Jacobson home to Oak Ridge, where he spent his childhood.
From vineyards to bioenergy feedstocks
Jacobson grew up in a scientific household with his father, Bruce Jacobson, a biochemical geneticist at ORNL, as his role model. They even had the opportunity to collaborate during the early days of the Human Genome Project.
When Jacobson and his father jointly presented their research at conferences, they often flew in from different parts of the world to do so.
As the leader of a computational biology group at the Institute for Wine Biotechnology at Stellenbosch University in South Africa, Jacobson studied the grapevine phytobiome, including the plants, soil, environment, and associated microbial communities.
He and his colleagues made fundamental discoveries that helped explain variations in wines made from the same cultivar grown in contiguous vineyards but using different farming practices (traditional, organic, and biodynamic).
“It was a really convenient excuse to do systems biology out in the field,” says Jacobson. “It is one of the few—if any—scientific environments where you can drink the results of some of your experiments.”
It turns out that research into wines and grapevines has many parallels in bioenergy applications. In both cases, scientists ask similar questions about the growth of biomass and the fermentation processes that follow.
This background has also enabled Jacobson to make significant contributions to the ORNL Plant Microbe Interfaces Project, which examines the fundamental relationships among plants and microbes to facilitate an understanding of sustainable systems and the use of renewable feedstocks for bioenergy.
Through all the work environments he has experienced and enjoyed, Jacobson had been subconsciously missing the interdisciplinary nature of a national lab, he admits.
“National labs think at larger scales,” said Jacobson. “My group and I are driven by big data, big science, and exciting opportunities. We’ve felt right at home here.”