BioVeL is a virtual e-laboratory that supports research on biodiversity issues using large amounts of data from cross-disciplinary sources.
Virtual e-laboratories are computing environments that bring together the techniques for managing, manipulating and viewing data with the software tools for analyzing that data in pursuit of some aim. BioVeL aims to be a general purpose virtual laboratory to support biodiversity science. Specializations of it support different aspects of the discipline, as this article explains.
BioVeL allows scientists to: (1) construct and execute scientific workflows to repeatedly and accurately perform data analytical tasks; (2) discover relevant, useful and reliable analytical functions ('web services') from multiple independent sources that can be combined in bespoke workflows; and (3) share, find and re-use workflows and build communities of interaction, collaboration and best practice.
"Combing biodiversity web services in workflows is also an excellent opportunity to assess the usability of the existing biodiversity informatics infrastructures developed over the last 20 years. BioVeL identifies technical shortcomings of service implementations and provides the ideal testing framework for improving their fitness for use in a real-world scientific context."
- Anton Güntsch, head of the biodiversity informatics research and development group at the Botanical Garden and Botanical Museum Berlin-Dahlem, Germany.
This ambitious European program has three clearly defined main objectives: (1) bring together biodiversity scientists and informatics engineers to respond to the practical technical needs of research on hot environmental topics such as the impact of climate change on species and ecosystem services, invasive species, and the preservation and sustainable use of resources; (2) create adequate and flexible informatics tools, with a user-friendly interface, to support the treatment of data in open access, making maximum use of existing software and adapting it to researchers' needs; (3) contribute to and facilitate the sharing and re-use of these tools.
Using myGrid tools, after only 18 months BioVeL offers computerized 'workflows' (series of data analysis steps) that address three areas of need in biodiversity research: data refinement, ecological niche modeling, and matrix population modeling.
Data refinement (a.k.a. 'data cleaning') is a well-known issue for scientists collecting and pooling data from various origins. Data collection is a messy, error-prone process, with problems ranging from mere typos to deeper taxonomic issues, such as synonymy and translation, as well as standardization of the data collection format. The BioVeL workflow deals with this issue in three steps: (1) pulling data sets from existing libraries (e.g. GBIF) and/or your own, (2) 'refining' the data for synonymy and spelling errors, and (3) selecting data through distribution maps. Our first benchmark study of the data refinement workflow shows that the tool runs almost 10 times faster than traditional methods.
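To make steps (2) and (3) more concrete, here is a minimal Python sketch of what name refinement and geographic selection can look like. It is not BioVeL's implementation: the synonym table, the species names, the record fields (name, lat, lon) and the bounding-box selection are all assumptions made for illustration, and a real workflow would call taxonomic name services rather than a hard-coded table.

```python
# Hypothetical synonym table mapping outdated names to accepted names.
# The names are invented placeholders, not real taxa.
SYNONYMS = {
    "Exampleus oldname": "Exampleus newname",
}


def normalize_name(raw):
    """Collapse extra whitespace and apply 'Genus epithet' capitalization."""
    parts = raw.strip().split()
    if not parts:
        return ""
    genus = parts[0].capitalize()
    rest = [p.lower() for p in parts[1:]]
    return " ".join([genus] + rest)


def resolve_name(raw, synonyms):
    """Normalize a name, then replace it by its accepted name if it is a known synonym."""
    name = normalize_name(raw)
    return synonyms.get(name, name)


def in_bounding_box(record, box):
    """Return True if the record lies inside (min_lat, max_lat, min_lon, max_lon)."""
    min_lat, max_lat, min_lon, max_lon = box
    return (min_lat <= record["lat"] <= max_lat
            and min_lon <= record["lon"] <= max_lon)


def refine(records, synonyms, box):
    """Clean the names of all records and keep only those inside the box."""
    cleaned = []
    for rec in records:
        name = resolve_name(rec["name"], synonyms)
        if not name:
            continue  # drop records without a usable name
        new_rec = dict(rec, name=name)
        if in_bounding_box(new_rec, box):
            cleaned.append(new_rec)
    return cleaned
```

Selecting records on an interactive distribution map is reduced here to a simple latitude/longitude box, which is the crudest possible stand-in for that step.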
"The workflow approach is very appropriate for this type of research [population modeling] because it makes it easy to standardize the analyses, share more complicated simulation modeling scenario's (e.g. incorporating environmental stochasticity or ecological drivers), and link to existing databases that contain projection matrices for hundreds of plant and animal species..."
- Gerard Oostermeijer, University of Amsterdam, The Netherlands.
An interesting feature of our data refinement workflow (DRW) is that it can also be used in historical studies. For example, Matthias Obst, of the University of Gothenburg, Sweden, explains his historical study: "We tested the data refinement workflow with a large-scale comparison of over 50,000 species observation records from two inventories of marine species from the Swedish west coast. These two sets of observations were recorded over 66 years apart (1921-1941 and 2007-2009). We used the DRW functions for data cleaning, refinement, and taxonomic name resolution to generate comparable data sets for statistical analysis of long term ecological changes. More specifically, we analyzed three things: species richness, species turnover, and geographical distribution of species richness. Using BioVeL's workflow, we could analyze changes in community structure of marine ecosystems over large temporal scales. With this tool, we could go beyond the scope of conventional ecosystem assessments based on monitoring programs."
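As a rough illustration of two of the quantities mentioned in this study, the Python sketch below computes species richness for two inventories and a simple turnover index between them. The turnover measure used here (one minus the Jaccard similarity of the two species lists) is an assumption chosen for simplicity; the source does not say which index the study used, and the species labels are placeholders.

```python
def species_richness(names):
    """Number of distinct species in an inventory (an iterable of cleaned names)."""
    return len(set(names))


def species_turnover(old_names, new_names):
    """Share of species found in only one of the two inventories.

    Computed as 1 - |shared| / |all species|, i.e. Jaccard dissimilarity.
    Returns 0.0 when both inventories are empty.
    """
    old, new = set(old_names), set(new_names)
    union = old | new
    if not union:
        return 0.0
    return 1.0 - len(old & new) / len(union)


# Placeholder inventories; real input would be the refined observation records.
historical = ["Species A", "Species B", "Species C", "Species B"]
recent = ["Species B", "Species C", "Species D"]
```

Such comparisons are only meaningful once both inventories use the same accepted names, which is why the name resolution step comes first.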
"Once we had clean data, we could start studying ecological niches of species and develop prediction models of species distribution. To this end, we developed an ecological niche modeling workflow". With this tool, it is easy to select different modeling algorithms and environmental layers. For example, one can model the predicted distribution of a species according to a chosen set of environmental parameters, such as temperature or salinity. Future predictions can also be made using future climate models, to study the effects of climate change on species distributions over broad geographical areas. This tool is currently tested on studies of marine invasive species.
In parallel, we developed workflows for constructing and analyzing matrix population models from demographic monitoring data. Matrix population models form the standard toolkit for integrating complex data on demographic vital rates (survival, growth, and reproduction) measured on individual plants or animals and simulating population dynamics. "The workflow approach is very appropriate for this type of research because it makes it easy to standardize the analyses, share more complicated simulation modeling scenarios (e.g. incorporating environmental stochasticity or ecological drivers), and link to existing databases that contain projection matrices for hundreds of plant and animal species (e.g. the Max Planck Institute for Demographic Research's COMPADRE and COMADRE databases), for evolutionary studies," says Gerard Oostermeijer from the University of Amsterdam, The Netherlands.
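For readers unfamiliar with matrix population models, the short Python sketch below shows the core calculation: a projection matrix of vital rates is multiplied by a vector of stage abundances to step the population forward, and the long-term growth rate is the dominant eigenvalue of that matrix. The two-stage matrix and its vital rates are invented for illustration, and the power-iteration routine is a generic numerical method rather than the code used in the BioVeL workflows (which, as noted in this article, build on R scripts).

```python
def project(matrix, population):
    """One time step of the model: n(t+1) = A * n(t)."""
    return [sum(a * n for a, n in zip(row, population)) for row in matrix]


def growth_rate(matrix, iterations=200):
    """Estimate the asymptotic growth rate (dominant eigenvalue) by power iteration.

    Assumes a primitive projection matrix, so that the stage distribution
    converges; periodic matrices would make the estimate oscillate.
    """
    n = [1.0] * len(matrix)
    lam = 0.0
    for _ in range(iterations):
        nxt = project(matrix, n)
        total = sum(nxt)
        if total == 0:
            return 0.0  # population went extinct
        lam = total / sum(n)             # growth of total population size
        n = [x / total for x in nxt]     # rescale to a stage distribution
    return lam


# Hypothetical two-stage life cycle (juveniles, adults):
# each adult produces 3 juveniles, half of the juveniles mature,
# and half of the adults survive each year.
A = [[0.0, 3.0],
     [0.5, 0.5]]

next_year = project(A, [10.0, 10.0])  # stage abundances one year later
lam = growth_rate(A)                  # a value above 1 means the population grows
```

A growth rate above 1 indicates an increasing population and a value below 1 a declining one, which is why this quantity is central to the viability and eradication analyses mentioned in this article.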
BioVeL is driven by a strong desire to share knowledge and skills to advance biodiversity research. To this end, it fosters an international community of researchers and partners. You are welcome to join our forum at www.biovel.eu. At the core of BioVeL is a consortium of fifteen partners from nine countries, as well as an outer circle of 'Friends of BioVeL.' BioVeL is funded through the European Community's 7th Framework Programme as part of its e-infrastructures program, and is free to use.
We are currently developing workflows for integral projection models (beta release: May 2013), which use a novel statistical approach to analyze and model population dynamics. Oostermeijer further explains that "relative to the R-scripts on which they are based, BioVeL workflows are much more user-friendly and do not require knowledge of R, and are therefore also accessible to other users than the small community of population biology scientists. Practical applications are found in population viability analyses (PVAs) for endangered species, the design of [cost-] effective eradication strategies for invasive species and improvements of [ecological niche] models projecting the response of species to climate change scenarios."
These products are our first achievements, and this year will see the release of further developments in each of these areas as well as others. Indeed, we are also moving forward with metagenomics and phylogenetics developments, in particular with the production of new workflows to contribute to the Ocean Sampling Day project. Ecosystem functioning research, in particular through studies on carbon sequestration, is also addressed.
Please note that these products are currently released in beta version, accessible either via a browser (Taverna Lite) or via the extended Taverna workbench, which requires several installations.