During America's westward expansion 150 years ago, surveyors used trees as plot markers, meticulously noting the dominant trees every mile. These records became part of what is known as the Public Land Survey - a gold mine of untapped data for researchers looking to inform policies on climate change. The Public Land Survey offers a way of comparing the types of vegetation that existed in certain areas of the country 150 years ago with what's there now.
The trouble is, these hand-written records are only available to researchers as scanned images. The old way of preparing this data involved researchers enlisting graduate students to type the records into a database. One such study took 20 years - and that was just for the state of Wisconsin.
Happily, there's a new way of mining this data and thousands of other collections like it: software called Brown Dog, which sniffs out data of all formats and in all scales of space and time, converting it to something easily usable by researchers. The brainchild of a team at the National Center for Supercomputing Applications, Brown Dog is being built in partnership with faculty at the University of Illinois at Urbana-Champaign, Boston University, and the University of Maryland, all in the US.
Supported by a $10.5m US National Science Foundation Data Infrastructure Building Blocks (DIBBs) award, Brown Dog currently provides two tools. The first is the Data Access Proxy (DAP), which transforms unreadable files into readable ones. Researchers need only configure their computer settings once for DAP, after which data requests over HTTP are first examined by the tool to see if the file format is readable. If it's not, DAP works behind the scenes to seamlessly convert the file to a usable format.
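The check-then-convert behavior described above can be pictured with a small sketch. This is not Brown Dog's actual code or API: the function name `fetch_readable`, the converter registry, and the example formats are all hypothetical, and the sketch assumes the format can be guessed from the file extension in the URL.

```python
from typing import Callable, Dict, Set, Tuple
from urllib.parse import urlparse

# Hypothetical registry: (source format, target format) -> converter function.
Converters = Dict[Tuple[str, str], Callable[[bytes], bytes]]


def fetch_readable(url: str, raw: bytes, readable: Set[str],
                   converters: Converters) -> Tuple[str, bytes]:
    """Return (format, data), converting only when the format is unreadable."""
    # Assumption: the file extension in the URL path identifies the format.
    path = urlparse(url).path
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""

    # Readable formats pass through untouched.
    if ext in readable:
        return ext, raw

    # Otherwise look for any converter that ends in a readable format.
    for (src, dst), convert in converters.items():
        if src == ext and dst in readable:
            return dst, convert(raw)

    raise ValueError(f"no conversion available for format {ext!r}")


# Illustrative use with a made-up legacy format "xyz".
demo_converters: Converters = {("xyz", "csv"): lambda raw: raw.upper()}
fmt, data = fetch_readable("http://example.org/data/plots.xyz",
                           b"oak,elm", {"csv"}, demo_converters)
print(fmt, data)
```

The point of the design is that the caller never asks for a conversion explicitly; it simply requests the file and receives something it can read.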
The second tool is the Data Tilling Service (DTS), which allows users to search data collections using an example file. It's as simple as drag and drop - after configuration, researchers just place the example in the DTS search field to find similar files on a given website. The tool can also perform general indexing, extract metadata from files, and add metadata to them, so users better understand the data they've uncovered. And, if it comes across an unreadable file, DTS uses DAP to convert the file to an accessible format.
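Search by example can be illustrated in miniature. The sketch below is an assumption-laden toy, not how DTS is implemented: it uses a byte-frequency histogram as a deliberately crude file "signature" and ranks files by cosine similarity to the example. The names `signature`, `cosine`, and `search_by_example` are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List


def signature(data: bytes) -> Counter:
    """Byte-frequency histogram: a crude stand-in for real feature extraction."""
    return Counter(data)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two histograms (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search_by_example(example: bytes, collection: Dict[str, bytes],
                      top_k: int = 3) -> List[str]:
    """Return the names of the top_k files most similar to the example."""
    query = signature(example)
    scored = [(cosine(query, signature(data)), name)
              for name, data in collection.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Illustrative collection with made-up file names and contents.
files = {
    "notes_a.txt": b"oak oak maple",
    "blob.bin": b"\x00\x01\x02\x03",
    "notes_c.txt": b"oak maple elm",
}
print(search_by_example(b"oak maple", files, top_k=2))
```

A real service would extract far richer features from images, documents, and other formats, but the flow is the same: compute a signature for the example, compare it against indexed signatures, and return the closest matches.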
Ultimately, Brown Dog is designed to grow into 'the super mutt of software', a robust, automated toolset that frees data access from the limits of outdated formats and of missing structure or curation. As digital data and technologies evolve, so will Brown Dog.
A team of researchers at Boston University is trying to use historical data from the Public Land Survey to shape their understanding of the land carbon cycle and gain insights about future climate change. Brown Dog is making accessible all those hand-written records from 150 years ago.
"Currently, the terrestrial carbon cycle is one of the largest sources of uncertainty in climate change projections, and so we're trying to reduce those uncertainties," says Michael Dietze, an assistant professor in the Department of Earth and Environment at Boston University. "Uncertainties are policy relevant. They're on the scale that they can potentially affect strategies for how government deals with climate change."
Dietze and colleagues are also working on PEcAn (Predictive Ecosystem Analyzer), an interface for running terrestrial ecosystem models. One of the bottlenecks is getting observational data into the system, so they're using Brown Dog to assemble a diverse collection of data sets that inform the models.
Brown Dog is also helping a group from the University of Illinois collect research on maximizing the benefits of green spaces, with a particular focus on green stormwater infrastructure such as rain gardens, green roofs, and trees in urban areas. Cities are spending more and more on green infrastructure, but they often fail to consider its positive effects on residents' health and wellbeing, such as reduced air pollution. Factoring these benefits into the initial design phase could effectively give cities more bang for their buck.
Cities don't necessarily know where green infrastructure has been installed, since many organizations and private citizens are creating these features. By combining high-resolution Lidar data with satellite and aerial imagery and social media postings, Brown Dog tools can help infer where green infrastructure is likely to be.
"Our work involves collecting many different types of data, in all formats, from a multitude of government agencies," says Barbara Minsker, a professor in the Department of Civil and Environmental Engineering at the University of Illinois at Urbana-Champaign. "Brown Dog brings the data to us, transforms it into the form we need, and allows us to use it without spending months and months just getting the data ready."