Feature - Collaboration matchmaking in VIVO
It's tough to find a good match, but that doesn't stop some from logging onto social networking sites in search of one. We all want to find that special someone who complements our strengths and shares our interests.
That's why Michael Conlon hopes scientists around the world will someday log into VIVO: to find that special someone - or someones - with whom they can collaborate. Meet VIVO, a distributed social network for scientists that will leverage the semantic web to bring an entirely new approach to an old problem - finding the right partner for your research.
Skeptical? The National Institutes of Health, via the American Recovery and Reinvestment Act, has about $12.2 million worth of confidence in the project, awarded to the seven participating research institutions in a two-year grant.
Although the seven funded institutions, led by the University of Florida - Conlon's home institution - are all American, the development of VIVO has been inclusive.
"I think two thirds of our technical advisory board is international," Conlon said. "It's pretty well known what we're doing - science is an international activity."
A social network, or a framework?
Although a great deal of the press VIVO has received describes the project as a social network, that descriptor is somewhat misleading. The truly innovative part of VIVO is in the underlying architecture; the applications that combine that architecture with social networking technology are just the logical progression.
"The VIVO approach is fundamentally new," said Conlon, the principal investigator for VIVO. "VIVO is distributed. The data are stored in institutional databases that are separate from the application. That means that the faculty don't have to put their data in a variety of different systems."
User profiles, stored at academic research institutions, will contain "see also" pointers to other VIVO servers containing information about the pertinent user. A trusted application or website could follow the trail of pointers from VIVO server to VIVO server, grabbing data along the way, and use the data to assemble a seamless profile on the fly. One interesting side effect is that the institutions that run VIVO servers are essentially providing their researchers with credentials that can be used in transactions, or to authenticate their identity.
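To make the pointer-following idea concrete, here is a minimal sketch in Python. It is not VIVO's implementation: the server URLs, the field names, and the `see_also` key are all invented for illustration, and an in-memory dictionary stands in for what would really be network requests to separate institutional servers.

```python
from collections import deque

# Hypothetical stand-in for VIVO servers: each URL maps to the partial
# profile that server would return. URLs and field names are invented.
SERVERS = {
    "https://vivo.univ-a.example/people/jdoe": {
        "name": "J. Doe",
        "position": "Associate Professor",
        "see_also": ["https://vivo.univ-b.example/people/jdoe"],
    },
    "https://vivo.univ-b.example/people/jdoe": {
        "grants": ["Grant 123"],
        "see_also": [
            "https://vivo.univ-a.example/people/jdoe",  # points back: a cycle
            "https://vivo.lab-c.example/people/jdoe",
        ],
    },
    "https://vivo.lab-c.example/people/jdoe": {
        "publications": ["Paper X"],
        "see_also": [],
    },
}


def fetch(url):
    """Return the partial profile held at `url` (empty if unknown)."""
    return SERVERS.get(url, {})


def assemble_profile(start_url):
    """Follow "see also" pointers breadth-first and merge what is found."""
    profile = {}
    visited = set()
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in visited:
            continue  # servers may point at each other; visit each once
        visited.add(url)
        for key, value in fetch(url).items():
            if key == "see_also":
                queue.extend(value)
            elif isinstance(value, list):
                profile.setdefault(key, []).extend(value)
            else:
                profile.setdefault(key, value)  # first value seen wins
    return profile


profile = assemble_profile("https://vivo.univ-a.example/people/jdoe")
```

The `visited` set matters because two institutions may well point at each other; without it, the walk would never finish.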
You call it ontology, I call it semantics
The sort of information your research institution stores is certainly useful for populating a website profile. If scientists are going to use these profiles to find their perfect match, however, they'll need a little something else - a list of their interests. But hold on. We've all seen what a mess it can be when websites let people use whatever interests they like. Some of the keywords people come up with are downright confusing, and the plentiful supply of synonyms available for use in the English language makes it nigh-impossible to find that ideal search term that will yield no false positives or negatives.
To keep that from happening, "they're going to express their interests using controlled vocabularies. If the keywords are in a controllable vocabulary, now we have meaning," Conlon said.
A vocabulary will have to be extremely fine-grained to be useful to scientists searching for people with whom to collaborate. After all, it's already trivial to find another computer scientist. Finding someone who is also working on computationally intensive applications for geologists is quite a bit more difficult.
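A rough sketch of why a controlled vocabulary helps with matching: if every free-text keyword is first resolved to a single preferred term, two researchers who typed different synonyms still end up with the same interest. The terms and synonyms below are made up for illustration and are not drawn from any actual VIVO vocabulary.

```python
# Hypothetical controlled vocabulary: each preferred term lists the
# free-text variants that should resolve to it.
VOCABULARY = {
    "computational geoscience": {"computational geology", "geo-computing"},
    "high-performance computing": {"hpc", "supercomputing"},
}

# Build a flat lookup from every accepted spelling to its preferred term.
LOOKUP = {}
for preferred, variants in VOCABULARY.items():
    LOOKUP[preferred] = preferred
    for variant in variants:
        LOOKUP[variant] = preferred


def normalize(keywords):
    """Split keywords into (preferred terms, keywords not in the vocabulary)."""
    accepted, rejected = set(), []
    for keyword in keywords:
        term = LOOKUP.get(keyword.strip().lower())
        if term is None:
            rejected.append(keyword)
        else:
            accepted.add(term)
    return accepted, rejected


def shared_interests(keywords_a, keywords_b):
    """Interests two people share once synonyms are collapsed."""
    return normalize(keywords_a)[0] & normalize(keywords_b)[0]


common = shared_interests(["HPC", "geo-computing"], ["Supercomputing", "knitting"])
```

Here "HPC" and "Supercomputing" collapse to the same term, so the two researchers match, while "knitting" is set aside as a keyword the vocabulary does not recognize.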
That's why VIVO will involve working with information science and ontology experts to create the controlled vocabulary Conlon mentioned. That may seem intimidating - or time consuming - to a lot of scientists. But Conlon has an answer for that too. "We're trying to create a facilitation model where academics are assisted by research librarians," Conlon said.
Doing a good job at listing research interests could have a number of benefits beyond the obvious 'matchmaking' motive. For example, research institutions could route information by interest instead of relying on large email lists. VIVO users could turn the 'volume' up or down.
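One way to picture the volume control is as a relevance threshold. The sketch below is an assumption, not a description of how VIVO routes messages: it scores each message by how much its topics overlap with a reader's interests (Jaccard overlap, chosen here only for simplicity) and lets a higher volume setting lower the bar for delivery.

```python
def relevance(message_topics, interests):
    """Jaccard overlap between a message's topics and a reader's interests."""
    topics, wanted = set(message_topics), set(interests)
    if not topics or not wanted:
        return 0.0
    return len(topics & wanted) / len(topics | wanted)


def should_deliver(message_topics, interests, volume):
    """Deliver when relevance clears the bar set by `volume` (0.0 to 1.0).

    Turning the volume up lowers the threshold, so more loosely related
    messages get through.
    """
    threshold = 1.0 - volume
    return relevance(message_topics, interests) >= threshold


interests = {"grid computing", "geology"}
on_topic = {"grid computing", "geology"}
tangential = {"geology", "chemistry", "hydrology"}

low_volume = [should_deliver(m, interests, 0.2) for m in (on_topic, tangential)]
high_volume = [should_deliver(m, interests, 0.8) for m in (on_topic, tangential)]
```

At the low setting only the on-topic message is delivered; at the high setting the tangential one arrives as well.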
"If you turn up the volume, you'll get some things that are tangentially interesting to you," Conlon explained. If you turn it down, you'll get fewer emails, but you might miss something you consider relevant.
Okay, so there is a social network involved
Reporters who likened VIVO to Facebook may be gratified to hear that some of the websites and applications built on the VIVO framework could be social networks.
But, "not every application is going to be created by the VIVO team," Conlon said. "VIVO is attempting to create a framework that will be used by a number of projects."
Katy Börner is the principal investigator and leader of the Indiana University team charged with creating VIVO's official social network.
Since the VIVO framework can track research interests, research groups, collaborations, and more, there's a lot of potential for automatically populating social network profiles. The hope is that if researchers can use their existing credentials to log into the network, the network will develop into a popular destination for scientists wishing to seek out colleagues with similar or complementary interests - that perfect match that will encourage truly inspired interdisciplinary inquiry into the mysteries of the universe.
In addition to creating a social networking site that is intended to become a hotbed of scientific collaboration, the Indiana University team will also provide a monitor of VIVO activity that will hopefully include real-time visualization.
Börner, as it happens, has an ulterior motive for working on VIVO. She is one of many scientists who hope to use VIVO to gain an understanding of how science and social networks shape each other.
"I think we can create better tools to understand how social networks exist in science," Börner said. "In order to do this in a systematic way you need good data." Unfortunately, most of that data is in proprietary databases, making it difficult or impossible to replicate the few studies that have been done. VIVO could change that.
"It's a very powerful and very precise method for building information about science," Conlon said.
Speaking of networking…
The VIVO team has been busy networking with other projects and groups in the hopes of forming partnerships. Connections with existing science portals such as HUBzero, or with existing cyberinfrastructures such as Open Science Grid and TeraGrid, are real possibilities, according to Conlon.
"We are involved with InCommon," Conlon said. "We do expect a federated identity and federated access is part of the proposal."
The trick will be to find a security protocol that is functional and still works for most participating institutions.
The question, according to Conlon, is: "What does it take to have a well-managed identity management system so that credentials produced at a university have some standing with others?" One solution may be what the federal government calls 'levels of assurance.' "InCommon has levels of assurance that match up with several definitions of level of assurance," Conlon said.
One group they are collaborating with is the Center for Dental Informatics at the University of Pittsburgh, which competed with VIVO for the NIH grant. The CDI's Digital Vita system has received funding via the university's Clinical and Translational Science Institute, which received a five-year NIH grant of $83.5 million beginning in 2007.
"This is a stand-alone project," said Titus Schleyer, the principal investigator for Digital Vita. Nonetheless, "we are working on getting our application to be compatible with the whole VIVO project."
Digital Vita is a social network based on the idea that we must all maintain our curriculum vitae. Users enter their CV-relevant information into the system, and Digital Vita builds a network from the collaborative relationships on papers, grants, presentations, and abstracts.
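The idea of deriving a network from CV entries can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Digital Vita's actual code: the record layout and the names are invented, and every pair of people listed on the same paper, grant, presentation, or abstract is simply counted as one collaboration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical CV-derived records: each item lists the people named on it.
CV_ITEMS = [
    {"type": "paper", "people": ["Ada", "Ben", "Chloe"]},
    {"type": "grant", "people": ["Ada", "Ben"]},
    {"type": "presentation", "people": ["Chloe", "Dev"]},
]


def build_network(items):
    """Count shared items for every pair of people who appear together."""
    edges = Counter()
    for item in items:
        people = sorted(set(item["people"]))  # sorted so each pair has one key
        for pair in combinations(people, 2):
            edges[pair] += 1
    return edges


network = build_network(CV_ITEMS)
```

Each key in `network` is a pair of collaborators and each value is the number of items they share, which could serve as the weight of the link between them.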
"VIVO is starting with sources of data available about individuals, both within and outside institutions. Digital Vita also uses PubMed and grants databases, but in essence is built on everything that's in a person's curriculum vitae," Schleyer explained. "Both systems have slightly different functions and different amounts of data in them… They're simply two ways of looking at data."
Getting the two data sets to match up is no easy task. "Currently the key issue is to make our data set align with the ontology that is inside of VIVO," Schleyer said. "It's about aligning how data is represented in the system, so that when we publish our data, it is with the exact same meaning and format as VIVO does."
Although Schleyer's background is in dental informatics, it's clear that he's very excited about the potential of projects like Digital Vita and VIVO.
"There is a lot of efficiency that can be generated from being smarter about information sharing, collaboration, and knowledge management. We are very much at the beginning of constructing useful systems for scientists to interact with each other. We're kind of on the verge of creating a new class of computing applications," Schleyer said. "Part of my interest in this project is contributing to that whole revolution - if there's going to be one."
The focus of Börner's excitement is a little more practical. "I can't wait to get it because I can press a button and get my CV out," she said, adding, "I think it will also be very interesting to see who else is doing relevant work at IU and how I am connected as an investigator to them."
-Miriam Boon, iSGTW