
Cracking the coffee code

Speed read
  • Sequencing the genome of Arabica coffee aids research leading to more robust plants
  • Genome browsers allow plant scientists to integrate and visualize agricultural knowledge with highly detailed genomic data
  • National Center for Genome Analysis Support (NCGAS) enables state-of-the-art genomic research collaborations among institutions

Imagine your morning cup of coffee: a rich, steaming elixir of energy, happiness, and the will to face the rest of your day.

<strong>Rusty cup.</strong> The plants that produce flavorful C. arabica beans are vulnerable to diseases such as coffee rust. Courtesy Keithanne Mockaitis.

Today the beans in your cup are most likely Coffea arabica. Compared with C. canephora, the ‘robusta’ species most commonly used in powdered products, C. arabica is generally considered the more flavorful coffee. But the plants are prone to devastating diseases, including coffee rust.

A form of fungus, coffee rust can attack suddenly and ruin a farm’s entire crop. And climate change only makes it worse. As night-time temperatures rise in coffee-growing countries, even high-altitude growing areas are vulnerable to crippling die-offs of this perennial plant that should be productive over spans of 10 to 20 years.

A world without coffee: it’s unthinkable. The millions of workers employed in the industry would lose their livelihoods, and international productivity would take a nosedive.

Which is why scientists like Keithanne Mockaitis are working to sequence and annotate the entire genome of C. arabica and share the findings with other researchers.

“Our hope is that this information will be used by plant breeders to quickly and efficiently select for traits that will help C. arabica survive,” says Mockaitis.

Cracking the coffee code

Thanks to ambitious ventures like the Human Genome Project (1990-2003), scientists’ ability to work with strings of DNA has come a long way in a few short years.

<strong>International collaboration.</strong> From left: Alvaro Gaitan, Director of Cenicafe; Keithanne Mockaitis; Carrie Ganote and Sheri Sanders, NCGAS; Hernando Duque, COO of Coffee Federation of Colombia; Aleksey Zimin, Johns Hopkins University.

“Technology advances in both data generation and in the software for assembly have just exploded in the last ten years, and now we’re able to tackle the sequencing and assembly of more complex things,” says Mockaitis.

Arabica coffee is one of those complexities. Unlike the simpler robusta plant, a version of which was sequenced in 2014, C. arabica is tetraploid, meaning it has four copies of each of its chromosomes. 

That adds up to a lot of data that must be extracted and then put together like the pieces of a puzzle. Since there’s no existing technology that can read a chromosome from A to Z, researchers must instead chop the DNA into pieces and then sequence the pieces.
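The puzzle analogy can be made concrete with a toy example. The sketch below is not the software used for the coffee genome; it is a minimal greedy overlap merger, written for illustration only, that joins short reads wherever the end of one matches the start of another. The function names, the `min_len` threshold, and the sample reads are all assumptions, and real assemblers must also cope with sequencing errors, repeats, and (for a tetraploid like C. arabica) near-identical chromosome copies.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that is a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Returns the remaining contigs once no pair overlaps by `min_len` or more.
    Toy simplification: reads fully contained in another read are not handled.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        # Join the two pieces, writing the shared overlap only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads chopped from one short made-up sequence.
example_reads = ["GTACGTTA", "ATGGCGTAC", "GTTAGC"]
assembled = greedy_assemble(example_reads)
```

With these three made-up reads the pieces chain together into a single contig. On real data a greedy strategy like this can mis-join repeated regions, which is one reason assembling large, complex genomes calls for specialized software.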

Mockaitis’s collaborators have written, and continually improve, software for assembling the puzzles of large, complex genomes. In this case, Aleksey Zimin of Johns Hopkins University is the DNA puzzle master of the coffee genome.

Mockaitis’s group at Indiana University (IU) puts together the RNA sequences to identify the ‘active’ parts of the genome and marks up the final reference with as much detail as possible. This allows plant researchers to learn the location of genes and how much they differ among coffee types. In addition, Coffea species other than the tetraploid arabica are also being sequenced for comparison studies.

<strong>Genome browser.</strong> Sheri Sanders and Carrie Ganote of NCGAS demonstrate the latest functions of the genome browser they developed to Claudia Flores, director of plant breeding, visiting from Cenicafe, Colombia. Courtesy Marcela Yepes, Cornell University.

Throughout the project, experts from the National Center for Genome Analysis Support (NCGAS) install advanced genomics software on IU’s high performance computing systems and expand their training for tackling the challenges of genomic research.

Thanks to a $1.5 million grant from the National Science Foundation (NSF), consulting expertise from NCGAS staff, computational time, storage of data and experimental results, and customization of open source software are made available at no cost to researchers with current NSF funding.

“The computing power, storage, file systems, and software support needed for projects like this are enormous,” says Mockaitis. “It's pretty rare that a biology department or a smaller university would be able to do any of this on their own—that’s who NCGAS was developed for.”

Mockaitis’s projects also use national resources provided by the Extreme Science and Engineering Discovery Environment (XSEDE).

“We have changed and developed our strategies for computing and storage over time, as new resources become available,” says Mockaitis. “The expertise of NCGAS has been absolutely essential to customizing and troubleshooting these resources to serve our projects.”

Out of the lab and onto the farm

When Mockaitis and her team have sequenced, assembled, and annotated a genome, they disseminate the information in a visualization known as a genome browser.

“This is the age of big data in biology, and that’s exactly why places like NCGAS exist,” says Mockaitis.

The browser is an interactive website, hosted by NCGAS, offering access to genome sequence data integrated with a large collection of aligned annotations. Plant breeders who have a knowledge of classical genetics can use the genome browser to relate the new information to known DNA markers and map them together.
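At its core, relating annotations to known markers is a coordinate lookup: given a region of the genome, find every annotated feature that overlaps it. The sketch below illustrates that idea only and says nothing about how the NCGAS browser is built. The `Feature` record, the `features_in_region` function, and every name and coordinate in the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotation placed on a reference sequence (1-based, inclusive)."""
    seqid: str   # chromosome or scaffold name
    start: int
    end: int
    kind: str    # e.g. "gene" or "marker"
    name: str


def features_in_region(features, seqid, start, end):
    """Return features on `seqid` that overlap [start, end], sorted by position."""
    hits = (
        f for f in features
        if f.seqid == seqid and f.start <= end and f.end >= start
    )
    return sorted(hits, key=lambda f: (f.start, f.end))


# Hypothetical annotations; none of these names or coordinates are real.
example_features = [
    Feature("chr1", 2000, 2600, "gene", "geneB"),
    Feature("chr1", 100, 900, "gene", "geneA"),
    Feature("chr1", 450, 450, "marker", "markerX"),
    Feature("chr2", 300, 800, "gene", "geneC"),
]

# "Zoom in" on chr1:400-1000 and list what falls inside the window.
for feature in features_in_region(example_features, "chr1", 400, 1000):
    print(feature.kind, feature.name, feature.start, feature.end)
```

A linear scan like this is fine for a handful of records. A browser serving a whole genome would more plausibly rely on an indexed structure, but the overlap test itself stays the same.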

<strong>Test bed.</strong> Plant breeders use the sequenced genome to experiment with new ideas to improve the resiliency of C. arabica plants. Courtesy Keithanne Mockaitis.

Mockaitis works with collaborators at Cornell University and with Cenicafe, a coffee research center in Colombia, a major coffee-growing country. Scientists there use the browser to zoom in on parts of the genome and propose experiments to test different ideas. The collaboration overall provides a wealth of data that allows scientists to design new research approaches.

“Providing a high quality sequence reference doesn't really answer any one question by itself,” says Mockaitis. “It kicks off a whole new wave of research into coffee.”


Copyright © 2018 Science Node ™
