Seeing the invisible history of leaves

Speed read

Leaf fossils dominate the fossil record, but identification lags.
Machine learning algorithm teaches computers to read cleared leaves.
Computer vision heat maps teach scientists about botanical history.

If you find a new dinosaur the next time you stick a shovel in the dirt, you’ll be famous. But pity the paleobotanists — they find new leaf fossils every time they dig.

<strong>Feel the heat.</strong> Leaf images from members of the Sapindaceae (soapberry) family, which includes maples and horse-chestnuts. Computers learned to identify families of leaves through heat mapping, a technique that identifies important characteristics of each angiosperm family. Courtesy Peter Wilf, et al.

Lack of fame is the least of their problems, though. A central obstacle botanists face is the inability to identify all those fossils. Leaves are naturally complex, with an astounding variety of vein and shape patterns. Comprehensive knowledge and identification are virtually impossible.

This roadblock spells an incomplete evolutionary history of green life. Undaunted, a team of researchers looked to the tools of computer science to assist in the difficult task of decoding leaf information.

“We’re trying to put together a picture of the evolution of our green planet and the plants on which all our lives depend,” says Peter Wilf, a professor of geoscience at .

In tandem with computational neuroscientist Thomas Serre and other colleagues, Wilf has trained a computer to teach itself to identify leaves. Partially funded by the US , their breakthrough promises to open a new window into the last 135 million years since flowering plants (angiosperms) evolved and came to dominate the earth.

Botanical reality

Consider your experience in the woods on any given day in the summer — you see a green world. And just as leaves dominate the landscape, they also dominate the fossil record.

But despite the abundance, attempts to classify and identify leaves constitute a history of errors. It wasn’t until the 1970s that scientists realized 19th-century botanists had assigned fossilized leaves from millions of years ago to familiar, living groups (oak, magnolia, etc.), thus misidentifying many of the leaves entirely.

“Computer vision will change the way we do science.” ~Thomas Serre

With over 250,000 species and 400 families of flowering plants, and hundreds to thousands of vein intersections in each leaf, the difficulty of identification quickly becomes enormous.

“The evolutionary history of so many plants is hidden,” says Wilf. “It’s sitting right there in the museum drawers, but we can’t access it because we don’t have enough tools to identify most of the fossil leaves.”

Teaching a computer to see

One day in 2007, Wilf came across an article that would change the course of his science. Thomas Serre had just perfected a computer vision program that trained a machine to recognize animals from assorted images. Wilf wondered if a similar approach could recognize plants.

Center for Computation and Visualization

7000 cores
700 HPC nodes
128 GB memory nodes
NVIDIA GPU nodes
400+ TB file storage
40 Gbps Infiniband network
34 TB RAM
125 TeraFLOPS peak performance

“With computer vision and machine learning, we’re on the cusp of a technological revolution,” says Serre. “Computer vision is enabling the development of super-visual senses. I really believe that it will change the way we do science.”

The basic idea behind machine learning is that the computer learns to associate a sample to a category label — a leaf to a name in the angiosperm taxonomy, for instance. It teaches itself the diagnostic features of the category (oak family, maple family, etc.) and then uses those features to categorize unknown leaves.
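The learn-then-categorize idea can be sketched in a few lines. This is a minimal illustration, not the team's method: it assumes each leaf has already been reduced to two invented numeric features, and it uses a simple nearest-centroid rule as a stand-in for the real learning algorithm. The feature values and function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples):
    """Average the feature vectors of each labeled family.

    samples: iterable of (feature_vector, family_label) pairs.
    Returns {family_label: centroid_vector}.
    """
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def classify(centroids, features):
    """Return the family whose centroid is closest to the unknown leaf."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features, invented for illustration only:
# (vein intersections per square cm, leaf length / width ratio)
training = [
    ((120.0, 1.1), "maple family"),
    ((130.0, 1.0), "maple family"),
    ((60.0, 2.4), "oak family"),
    ((70.0, 2.6), "oak family"),
]

centroids = train_centroids(training)
prediction = classify(centroids, (115.0, 1.2))
print(prediction)
```

The training step summarizes what the labeled examples of each family have in common, and the classification step compares an unknown leaf against those summaries. A real system learns far richer features directly from the images.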

Using the computers at the , the team presented 7,597 cleared leaf images to the machine. Cleared leaves are leaves that have been specially prepared (bleached or stained) so that the veins and their intersections are easier to see. Through this training, the machine learned to associate new images with family labels and got it right 72% of the time, about 15 times better than chance.
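The "times better than chance" comparison is simple arithmetic, sketched below. The only inputs taken from the article are the 72% accuracy and the factor of 15. The helper names are hypothetical, and the `1 / num_classes` baseline assumes uniform guessing among equally represented families, which may not match how the researchers defined chance.

```python
def chance_accuracy(num_classes):
    """Accuracy of uniform random guessing among equally likely classes."""
    return 1.0 / num_classes


def lift_over_chance(accuracy, chance):
    """How many times better than chance a classifier performs."""
    return accuracy / chance


reported_accuracy = 0.72  # from the article
reported_lift = 15        # "about 15 times better than chance"

# Chance level implied by the two reported figures.
implied_chance = reported_accuracy / reported_lift
print(f"{implied_chance:.3f}")  # prints 0.048
```

In other words, the reported figures imply that random guessing would have been right a little under 5% of the time.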

What was most surprising to the botanists on the team was that the computer was able to identify leaf images to the family taxonomic level. Family identification is the traditional first step for fossil leaves, which usually represent extinct species. Variation among the hundreds to thousands of species in a family is immense, and computerized identification of leaves above the species level had never been attempted before.

<strong>Placemarker.</strong> Eight major <a href='http://www.wikiwand.com/en/Taxonomic_rank'>taxonomic ranks</a> used in biological classification. Courtesy Peter Halasz.

This initial success based on modern leaves gives the researchers great optimism for eventual computer-assisted identification of fossils.

“If you look at different species of plants within a family,” says Serre, “you end up with leaves that are so different and look to the untrained eye like arbitrary rearrangements. Computer vision is enabling us to see things that would be literally invisible to the naked eye.”

Heat maps

Not only have paleobotanists found a way to see the invisible, but the computer is also teaching the botanists relationships and connections they had overlooked.

The computer generates heat maps with red spots indicating the tiny parts of the leaf it thinks identify a certain family.

“This gets really exciting because when you start looking at those maps, as a botanist I do see patterns,” says Wilf. “The computer vision tool has become this wonderful assistant for evolutionary botany and paleobotany and now it's teaching us.”
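One generic way to build a map like this is occlusion sensitivity: blank out one small tile of the image at a time and record how much the classifier's confidence drops. The sketch below illustrates that technique only. The article does not say how the team's heat maps are computed, and `toy_score` is a hypothetical stand-in for a trained model.

```python
def occlusion_heat_map(image, score, patch=2, fill=0.0):
    """Confidence drop when each patch-by-patch tile is blanked out.

    image: 2D list of pixel intensities.
    score: function mapping an image to the classifier's confidence
           for one family (hypothetical stand-in for a trained model).
    Returns a 2D list with one heat value per tile.
    """
    baseline = score(image)
    rows, cols = len(image), len(image[0])
    heat = []
    for top in range(0, rows, patch):
        heat_row = []
        for left in range(0, cols, patch):
            masked = [row[:] for row in image]  # copy; original stays intact
            for r in range(top, min(top + patch, rows)):
                for c in range(left, min(left + patch, cols)):
                    masked[r][c] = fill
            heat_row.append(baseline - score(masked))
        heat.append(heat_row)
    return heat


def toy_score(image):
    """Toy classifier: confidence depends only on the top-left 2x2 corner."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


leaf = [[1.0] * 4 for _ in range(4)]  # uniform 4x4 test image
heat = occlusion_heat_map(leaf, toy_score)
print(heat)
```

Tiles with the largest drop are the "hot" regions: hiding them hurts the classification most, so they are the parts of the leaf the model relies on.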

Don’t worry: the team’s leaf reader will soon come to a smartphone near you. But more important to these scientists is the huge trove of data unlocked by computer vision and its application to fossil identification.

And because our entire existence is bound up with the fate of plants, the more we learn about what lives and what dies, the more we know about the world that we live in and how it could respond in the future.

“It’s going to help scientists and nonscientists understand the green world better. Our clothes, our medicines, a lot of our building materials, much of our oxygen, and almost everything that we eat — it comes from the flowering plants. This is the architecture of our world.”
