Dr. Dalia A. Conde is quite a busy person. She’s an associate professor at the University of Southern Denmark, she’s worked in the field putting radio collars on jaguars, and she’s a member of Women in HPC.
She’s also the director of science at Species360, an organization dedicated to facilitating collaboration in the sharing of wildlife data. In her work, she’s found that we have huge gaps in our knowledge about the animals we share Earth with.
To help solve this, Conde and other collaborators are working on creating the Species Knowledge Index. This index is meant to show exactly what information we are lacking, as well as consolidate what we do know in areas like demography, genetics, and even relevant legislation data. The first of these indexes, called the Demographic Species Knowledge Index, revealed we’re lacking important data for 98 percent of known species.
Following a talk at the SC20 conference, we sat down with Conde to learn more about how filling these knowledge gaps could help us deal with an extinction crisis.
Can you briefly describe what's happening with the current mass extinction?
Extinction is part of evolution. You have what they call the arms race between predators and prey. Of course, there are species that become extinct sometimes not because they disappear, but because they diverged into two species. And of course, we have some branches of the tree of life that are totally eliminated.
It’s very important to stress that the problem is not extinction. The problem is the rate at which we're seeing extinction.
What scientists have measured is that what’s happening right now is equivalent to a mass extinction. The extinction rate for vertebrates is 100 times higher than what you would expect under normal circumstances. This has been discovered based on the fossil record.
In the fossil record, we have seen big drops in biodiversity. And these big drops in biodiversity are seen after changes in climate – elevation of oceans and changes in the atmosphere like high levels of methane. In the record, we have seen all these drops in biodiversity up to 50 percent or more. There was one drop that was around a 95 percent drop in biodiversity on the planet.
It didn't matter how well the species adapted. The changes were so fast in geological time for a species that it didn't matter. These mass extinctions were crises for the species that became extinct, but they also opened new opportunities for new species. For this current mass extinction we can say “well, it's not important for the planet. The planet already survived five mass extinctions.”
It’s important for us if we want to continue on this planet. We have co-evolved and we depend on all the species that we're killing. If you think about natural resources, the air we breathe and the water we drink are part of the biodiversity complex.
We need all these plants to capture that water in our aquifer. Think about the oxygen we breathe, that is another interaction with these plants. We think about pollinators of our crops. By removing these species, many countries are considered to be on the verge of ecosystem collapse. And that’s where we are, in the sixth mass extinction.
You mentioned in your talk, 60 percent of all human infectious diseases are transmitted from animals. Can you expand on that?
With domestic animals, there are these diseases that they transmit to us. We're used to those types of diseases more or less. We have vaccinations, and we have the knowledge because we have had domestic animals for many years. So that 60 percent includes domestic animals.
If we take that 60 percent of the diseases that come from animals, between 70 and 75 percent come from wildlife. Not from pigs and not from cows, but from wildlife – from bats and snakes and all these other animals. And we have not been in contact with these species for a long time. So of course, our immune system is totally naive to those new diseases.
For example, take the Hendra virus. The Hendra virus appeared in 1994 in Australia. It was transmitted from horses to a human. And this human suddenly died. What they found out after many years is the Hendra virus came from a fruit bat which went on to infect a horse. This horse transmitted it to a human.
Why was it not a kangaroo? Well, because kangaroos have co-evolved with these bats for a very long time. And horses had been in Australia for only a few hundred years. The immune system of this horse was totally naive to the virus from the fruit bat. And of course, this horse acted as the amplifier for the effects of the virus.
The emergence of these infectious diseases comes from very complex relationships. We have seen it with Ebola. Ebola was in primates, and it started spreading in Africa. Now it's quite well controlled. HIV as well, now we know where it comes from.
What worries me is that while it is very important to develop vaccines, we also need to do preventive work. It's like preventive medicine. You can try to do all these things with diabetes or obesity, but if we don't have a preventive treatment we're going to have health problems. It's the same with these pandemics. If we don't stop deforestation and habitat fragmentation, all these issues will continue.
How was the Species Knowledge Index started?
I’ll be honest, it started with a very big grant rejection. I was at Max Planck Institute for Demographic Research in Germany. There, one of the goals is to understand why humans have expanded our lifespan six hours per day since the 1800s. Are we going to continue expanding forever? Why have we pushed the boundaries? To find out, you go back to evolution and what's happening with other species.
We wanted to understand that, and I’ve always had a passion for conservation. So when I moved to Max Planck, my idea was to use all these evolutionary analyses on demography, which is birth rates and mortality rates. They were developing top-notch models for biodemographic analyses. I wanted to do that. And that is when I discovered the data in zoos and aquariums.
I'm director of Species360, a non-profit that has a huge real-time database of more than 1,245 zoos and aquariums all over the world that record data for every individual under their care: when they were born, when they died, their blood samples, their body weight. All this is shared across 100 countries. We have 21,000 species and 800,000,000 medical records. They're getting updated all the time, because we have 16,000 users all over the world.
I discovered this and I said “we can use this data to understand demography, birth rates, and mortality rates. Even if they're captive populations, they can bring a lot of information for wild populations.”
I'm a biologist. I used to work in the jungle, catching jaguars and giving them radio collars. And yes, this was really exciting, but I got very little data. So, a lot of the stuff that I used for my thesis was by talking with people who knew about wild jaguars, but also talking with people that manage jaguars in zoos.
I wrote this grant to say we can do analytics to understand the difference between the wild and captive animals and see for which species we can use these data. And the response was like what I was proposing was the most absurd thing because animals in the wild do not behave like animals in captivity. They thought this was totally stupid. There was already enough knowledge. And I was like, “no, we don't have any knowledge!”
One day I thought, what if we could map the knowledge for every species on the planet? We could have a visualization and a species knowledge index for every species. This is when I decided to do the Demographic Species Knowledge Index. Of course, that is not me alone. I gathered a bunch of collaborators.
We started with this knowledge index, but then we said “yes, we know the demography, but we need to know the genetics. We need to know about policy and what about this and what about that.” This is when we started to do all these layers of knowledge. That is what I used to do with jaguars.
You have the geology, you have the roads, you have the vegetation, you have the human development, and you collapse all those layers. Then you can understand how jaguars move or how jaguars interact with all these different knowledge areas. We decided to do the same approach for every layer of knowledge of a species.
We're expanding all these areas of knowledge. I'm very excited that the policymakers are already immediately seeing how well they can apply this knowledge. For me, that's the most exciting thing. Not to publish another paper. This could have an impact on changing global policies and change how we prioritize actions.
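The layered approach Conde describes can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the layer names are taken from the areas mentioned in this article, while the species names, their records, and the scoring rule (the fraction of layers with any data) are invented here and are not the index's actual method.

```python
# Hypothetical knowledge layers, named after areas mentioned in the article.
LAYERS = ("demography", "genetics", "legislation")


def knowledge_index(available_layers):
    """Return the fraction of layers for which a species has any data.

    This scoring rule is an assumption made for illustration only.
    """
    covered = sum(1 for layer in LAYERS if layer in available_layers)
    return covered / len(LAYERS)


def knowledge_gaps(available_layers):
    """List the layers for which a species has no data, in LAYERS order."""
    return [layer for layer in LAYERS if layer not in available_layers]


# Invented placeholder species and records, not real index data.
species_layers = {
    "Species A": {"demography", "genetics", "legislation"},
    "Species B": {"legislation"},
    "Species C": set(),
}

index = {name: knowledge_index(layers) for name, layers in species_layers.items()}
gaps = {name: knowledge_gaps(layers) for name, layers in species_layers.items()}

for name in species_layers:
    print(f"{name}: index={index[name]:.2f}, missing={gaps[name]}")
```

Stacking the layers per species makes the gaps visible in the same way that stacking map layers shows where a jaguar can move.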
With this Demographic Species Knowledge Index, how is the information gathered?
That process is very rudimentary right now. We want to write a grant to develop a formal digital infrastructure to do this. Currently, we don't gather data. We go and we find the data repositories from people that already gathered all this data and organized it. So, first we identify those.
In the case of amphibians and reptiles, where there was nothing, then we go to the literature and we gather information in the literature. And then we have experts. We have experts on demography of mammals, experts on demography of birds, reptiles, and amphibians.
With these knowledge experts, we try to define which are the best repositories to go to. With the knowledge experts, we had a workshop where we defined how we translate all this data we have and how we translate all the terminology into a united terminology. Each database uses different taxonomy, which means that the species name is often different. That's the biggest nightmare.
And they use different language. So we have to create these standards across databases, and that's something that you have to do with humans. You cannot put just an AI algorithm to do it. You need expert humans.
Once we do that, then we write the code to digest the data and integrate all these databases. And then we get an output in which we say “okay, these are the species. We have all this knowledge and this is how we measure it.”
Our process right now is a huge problem because everything changes. Once we have the index, in two years it’s obsolete because there are new publications, people find new data, et cetera. The dream is to retrieve this data on a yearly basis. These databases are not like our database in zoos and aquariums where every day there is new information. The data usually is updated every year.
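The integration step described in this answer, in which experts first agree on a common terminology and code then merges the databases, might look roughly like the Python sketch below. Everything in it is hypothetical: the synonym table stands in for the expert-curated standard, the species names are placeholders, and the database names and records are invented. Names the table does not cover are set aside for human review rather than guessed.

```python
from collections import defaultdict

# Hypothetical expert-curated table: normalized source name -> agreed name.
# "Aus bus" and friends are placeholder names, not real taxa.
SYNONYMS = {
    "aus bus": "Aus bus",
    "xus bus": "Aus bus",  # an older name that experts mapped to "Aus bus"
    "cus dus": "Cus dus",
}


def canonical_name(raw_name, synonyms):
    """Normalize case and whitespace, then look up the agreed name.

    Returns None when the experts have not yet ruled on this name.
    """
    key = " ".join(raw_name.strip().lower().split())
    return synonyms.get(key)


def integrate(databases, synonyms):
    """Merge (species name, measure) records from several databases.

    Returns the merged measures per agreed name, plus the records whose
    names could not be resolved and need expert attention.
    """
    merged = defaultdict(set)
    unresolved = []
    for db_name, records in databases.items():
        for raw_name, measure in records:
            name = canonical_name(raw_name, synonyms)
            if name is None:
                unresolved.append((db_name, raw_name))
            else:
                merged[name].add(measure)
    return dict(merged), unresolved


# Invented example repositories, each using its own spelling and taxonomy.
databases = {
    "db_one": [("Aus bus", "survival"), ("Cus dus", "fertility")],
    "db_two": [("  xus  BUS ", "fertility"), ("Eus fus", "survival")],
}

merged, unresolved = integrate(databases, SYNONYMS)
print(merged)
print(unresolved)
```

Keeping an explicit list of unresolved names reflects the point made in the interview: the mapping itself is a job for human experts, and the code only applies their decisions.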
How was HPC used in this work?
We are using HPC to derive all the information from the zoos to fill the knowledge gaps. For the databases I was talking about before, you don't really need HPC. But we need HPC when we extract all the data from zoos.
Then, we have to run all the analytics on survival across all these species. But we need to check the parametric space because every species has different survival and fertility trajectories. To model that, we need a lot of computer power. It's hundreds of species with millions of records and many different types of modeling.
Of course, we will need HPC to handle what is coming in the future. We need to analyze all these different data and relationships. We need to do analytics. And when you have so much data, you need to rely on HPC.
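To give a sense of why checking the parametric space for every species is computationally heavy, here is a minimal Python sketch of a grid search over mortality parameters. It rests on assumptions that do not come from the interview: the Gompertz mortality model, the parameter grids, and the tiny invented records are stand-ins chosen for illustration, not the team's actual models or data.

```python
import math
from itertools import product


def gompertz_loglik(a, b, ages, died):
    """Log-likelihood of a Gompertz model with hazard a * exp(b * age).

    `ages` are ages at death or at last observation; `died` holds 1 for an
    observed death and 0 for an individual still alive (right-censored).
    """
    total = 0.0
    for age, d in zip(ages, died):
        cum_hazard = (a / b) * (math.exp(b * age) - 1.0)
        total += d * (math.log(a) + b * age) - cum_hazard
    return total


def fit_species(ages, died, a_grid, b_grid):
    """Return the (a, b) grid point with the highest log-likelihood."""
    return max(
        product(a_grid, b_grid),
        key=lambda ab: gompertz_loglik(ab[0], ab[1], ages, died),
    )


# Assumed parameter grids: 8 baseline values times 10 ageing rates.
A_GRID = [0.001 * 2**k for k in range(8)]
B_GRID = [0.05 * k for k in range(1, 11)]

# Invented records: (ages in years, death indicators) per placeholder species.
records = {
    "species_a": ([2.0, 5.5, 7.0, 9.5, 12.0], [1, 1, 0, 1, 1]),
    "species_b": ([0.5, 1.0, 1.5, 2.0, 3.0], [1, 1, 1, 0, 1]),
}

fits = {
    name: fit_species(ages, died, A_GRID, B_GRID)
    for name, (ages, died) in records.items()
}
for name, (a, b) in fits.items():
    print(f"{name}: a={a:.3f}, b={b:.2f}")
```

Even this toy version evaluates 80 parameter combinations per species. Each species is fitted independently, so the work can be spread across many processors, and with finer grids, several candidate models, and millions of records the cost grows to the scale where HPC becomes necessary.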
What are your hopes for this project? What excites you?
Well, what excites me is that today we have the technology to take this to the next level. Right now, we're digging in all the open data repositories in the English language. But if we can go into the data repositories and get all the articles in Chinese and Russian and Spanish and French, we'll get a lot of information. A lot of this information is in small journals, and that data is super relevant for us. That's in the case of demography.
But in the case of other information on the illegal trade – there’s a lot of data coming from all these illegal internet dealings. We could harness a lot of the information from confiscations and seizures that are in these databases. Nobody's analyzing them.
I think this index will help not only policymakers, but will also help us do better research because we will be able to see the gaps in our knowledge. We will see the opportunities: where we need to invest in research, or whether we need to borrow data from one species and apply it to another. The extinction crisis is so bad that we can’t wait until we have all the data.