by Katherine Akers
Academic librarians motivated to preserve the scholarly record of their institutions may promote the deposit of research data into institutional repositories instead of relevant disciplinary repositories, resulting in competition between the two types of repositories. Rather than being a competition, however, the relationship between institutional and disciplinary data repositories could benefit both universities and larger research communities.
That is, institutional, national, and international data repositories could be considered as ascending tiers of a ‘Data Pyramid’, with institutional repositories collecting a large swath of datasets that might otherwise be discarded or lost, and national or international repositories committing to ensuring access to datasets with the highest value, thereby increasing the visibility of the data to relevant communities of interest. This positive outcome depends on active partnerships between academic institutions and disciplinary data repositories. For instance, by virtue of being close to the source of research data, academic librarians or other local data curators could work directly with researchers to process and review data, create metadata and provide contextual information, and ingest data into institutional repositories, after which ‘archive-ready’ data packages could be pushed into disciplinary repositories.
Here at the University of Michigan, we think that the most effective way to support data sharing and preservation is to provide an institutional data repository as well as to facilitate the use of disciplinary data repositories. At the upcoming International Digital Curation Conference (IDCC), Jen Green will be presenting a paper that she and I coauthored, entitled ‘Toward a Symbiotic Relationship between Academic Libraries and Disciplinary Data Repositories: A Dryad and University of Michigan Case Study’, in which we propose three concrete ways in which our library could partner with Dryad, a repository for data associated with articles published in science and medical journals.
First, the library could become a Dryad member organization, allowing us to nominate and elect members of the Dryad Board of Directors and vote on amendments to bylaws at the annual Dryad membership meeting. As the majority of current member organizations are journal publishers, the inclusion of academic libraries in the circle of Dryad members would widen the pool of stakeholder perspectives that guide the future of Dryad and shape the research data ecosystem.
Second, by becoming a Dryad member organization, the library could take advantage of a discounted pricing plan to financially assist researchers with submitting data to Dryad. That is, we could purchase vouchers that would cover the cost of future data submissions and establish a system for distributing those vouchers on our campus.
Third, the library could host a local Dryad curator. This curator could be trained in the Dryad data curation workflow, identify datasets that could be submitted to Dryad, and remotely ingest data into Dryad. Because this local curator could interact directly with campus researchers, he or she could also increase the likelihood that data can be used by others in the future by ensuring that data are adequately described in codebooks or ‘readme’ files, converting data files into non-proprietary formats, and verifying the completeness of data.
Researchers tend to align with their disciplinary communities more than their institutions, and disciplinary repositories may make data more visible to relevant communities of interest. Therefore, libraries should go beyond promoting the deposit of data into institutional repositories and actively seek to partner with disciplinary data repositories. Such partnerships could make a significant contribution to capturing and preserving important research datasets that otherwise might be discarded or lost.