Tracking scientific output across the web

The ODIN project's first-year conference was held in CERN's Globe of Science and Innovation on 17 October 2013. Image courtesy CERN.

Scientific output is not just about academic papers. Datasets, along with other products of research, such as software and various forms of multimedia, also need to be made citable so that sharing and reuse can be facilitated, as well as tracked. In order to achieve this, these outputs must be given persistent identifiers similar to the digital object identifiers (DOIs) assigned to academic papers. Two weeks ago, the ODIN project held its first-year conference at CERN, with delegates discussing the barriers stakeholder groups encounter and how these could be resolved to enable an interoperable layer of persistent identification on both a European and a global scale.

ODIN stands for 'The ORCID and DataCite Interoperability Network'. It is a two-year project, which started in September 2012 and is funded by the European Commission under its FP7 scheme. DataCite works to assign persistent identifiers to datasets, while ORCID focuses on providing persistent identifiers to researchers themselves, so as to solve issues related to name ambiguity. Both organizations have the overarching goal of facilitating better scientific research.

These efforts are of particular relevance to large research organizations like CERN, where some high-level datasets have recently been made available to other scientists for re-use. Equally, the large number of collaborators working on the experiments at the Large Hadron Collider can make correct attribution of articles and datasets challenging. Sünje Dallmeier-Tiessen, part of CERN's Scientific Information Service group, also reported at the event on efforts in INSPIRE, the high-energy physics information hub, to accurately attribute articles, and now published data, to their authors.

"DataCite seeks to make research better by enabling people to find, share, use, and cite data more easily," explains Adam Farquhar, the organization's president. "We engage researchers, scholars, data centers, libraries, publishers, and funders through advocacy, guidance and services." Laurel Haak, executive director of ORCID, agrees with Farquhar on the importance of working with a wide range of stakeholders to improve the use of persistent identifiers for researchers: "Of course, we need to engage with researchers, but we also need to work closely with the institutions and organizations that support researchers."

Haak was one of several delegates at the conference who highlighted the difficulty of convincing time-pressured researchers to adopt persistent identifiers, such as those issued by ORCID. "The social challenges we face are definitely harder than the technical ones," she says. "We need to understand how to get people to use these tools and that's going to be a lot of hard work." Mark Hahnel, founder of FigShare, described these social challenges in blunt terms: "Shouting at researchers with white papers doesn't make them cite data in the reference list," he says.

Antonios Barbas, a project officer for data infrastructures within the European Commission's e-infrastructure unit, also spoke at the event. "Research cultures can be very diverse and difficult to change, but this is why we really need community-driven initiatives such as these," he says. In addition, Barbas spoke of the implications of persistent identifiers for Europe-wide drives towards open access and emphasized the role that funding agencies have to play. Haak believes that funding agencies have a fundamental part to play in driving the cultural change necessary for widespread adoption of persistent identifiers within the community. "ORCID IDs could save researchers a lot of time by reducing the amount of manual post-grant reporting which needs to be done," she says. "There are big advantages for researchers, and we need to make adoption as easy as possible by embedding identifiers within regular research workflows."

Salvatore Mele, ODIN project member and head of open access at CERN, says that more incentives are also needed to encourage researchers to make their datasets reusable. "Policies to encourage data sharing and acknowledge data re-use in research assessment are not yet widespread," he says. "Unique attribution and linking between researchers, their scholarly materials and funding is not possible without a collaborative adoption of global interoperable persistent-identifier systems. This, in turn, will generate the right incentives for more sharing and more open science."

