Data is an integral part of scientific research. With rapid growth in data collection and generation capability and the increasingly collaborative nature of research activities, data management and data sharing have become central to accomplishing research goals. Researchers today have a variety of solutions at their disposal, from local storage to cloud-based storage. However, these solutions rely solely on hierarchical file and folder organization. While such an organization is pervasive and quite useful, it relegates information about the context of the data, such as descriptions and associated collaborative notes, to external systems. Dispersing this vital information into different silos not only impedes the flow of research activities in the near term, but also affects mid- and long-term retention of knowledge about intermediate steps.
In this workshop, we will introduce and provide hands-on experience with tools designed to address this critical gap via the NSF-supported SeedMe2 platform. The SeedMe2 platform leverages the familiar hierarchical file and folder organization, but extends it with the ability to keep data, its description, and its discussion in one system. It also allows a folder to be shared either privately with collaborators or publicly for wider dissemination of information. Users may interact with the system via a web browser, a command line utility, or a REST API, using familiar concepts for each method. The platform will enable users to rapidly share and access transient data and preliminary results with collaborators in consumable form. The workshop aims to provide practical training to customize and utilize this infrastructure, enabling attendees to overcome existing gaps in collaboration and to realize several aspects of research data management.
Note: The SeedMe2 platform focuses on data that can be transferred easily over the web with standard tools such as stock web browsers. This limits uploads to 2 GB per file; however, any number of files may be uploaded. Moreover, derived products from large-scale raw data tend to be small, so this workshop and platform remain highly relevant to groups that produce large data. In the future, the platform will likely support larger uploads.
Aug 24, 2018 7:00:00 AM
San Diego Supercomputer Center
Amit Chourasia and David Nadeau
San Diego Supercomputer Center, UC San Diego