Where should I put my research data?
15/09/2016
If you are mandated or able to share your research data, one immediate question is where to put it. This can be confusing, as there are a number of options available. As a general rule, if your funder provides a data repository, use that to ensure you meet your obligations (e.g. NERC research outputs should go to NERC data centres). If not, but there is a trusted subject repository that is the norm for your domain, use that, because that’s where interested parties will look. If there’s neither a funder nor a subject repository, your main option is CORD, Cranfield Online Research Data. However, there are other sites you may have heard about, so let’s take a quick look at the differences so you can choose the right one for your situation.
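For those who think more easily in code, the order of preference above can be sketched as a small Python function. This is only an illustration of the rule of thumb: the function name `choose_repository` and its parameters are made up for this example and are not part of CORD or any other system.

```python
from typing import Optional


def choose_repository(funder_repository: Optional[str] = None,
                      subject_repository: Optional[str] = None) -> str:
    """Suggest where to deposit data, following the rule of thumb above.

    Both arguments are hypothetical inputs: the name of a repository
    provided by your funder, and the name of a trusted subject
    repository for your domain, if either exists.
    """
    # 1. A funder-provided repository comes first, to meet obligations.
    if funder_repository:
        return funder_repository
    # 2. Otherwise use the trusted subject repository for the domain.
    if subject_repository:
        return subject_repository
    # 3. With neither available, the main option is CORD.
    return "CORD"


print(choose_repository(funder_repository="NERC data centre"))
print(choose_repository(subject_repository="trusted subject repository"))
print(choose_repository())
```

Real cases are rarely this tidy (contracts, personal data and commercial partners all matter), so treat this as a memory aid rather than a policy.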
CORD uses the figshare platform, and Cranfield University has an institutional contract with figshare guaranteeing elements such as the location of the data storage (European servers with the security required by law and funders) and that our data is freely accessible to us at any point (so we can move it all in bulk without charge if we need to). CORD is designed to meet funder expectations easily: it provides an online record describing the data, and access to the files can be open or restricted. To meet another funder requirement, we are also looking into long-term digital preservation of CORD items, so that they remain accessible and intelligible in decades to come; we might need to change file formats, for example, to remove software dependencies. figshare is also optimised for Google, is included in the popular re3data index of research data repositories, and can be indexed by tools such as Jisc’s upcoming research data discovery service. All this maximises the visibility of your very valuable research outputs.
Mendeley Data is an Elsevier service that we have no contract with, so you should check its terms and conditions before using it. Currently, for example, the terms show certain restrictions on retrieval of data from the site, so it may be risky to share data only on this platform. Whilst Mendeley Data currently stores data in Germany, we don’t have an agreement in place to ensure that no data would be moved outside the EC, so be careful, especially if you’re working with personal data, as this could put you in breach of contracts. Mendeley Data has very good flexibility in access restrictions, but note that, unlike CORD, it has no institutional check: our review step allows us, for example, to note if data has commercial input and verify contracts to ensure the access is appropriate for all partners. RCUK funders such as EPSRC also expect requests to access data to go to a central contact address, rather than to an individual researcher. If data is on Mendeley Data and Cranfield receives a request for access when the researcher is uncontactable or no longer at the University, we are in a sticky predicament, so publicly funded research data shouldn’t be held exclusively on Mendeley Data.
ResearchGate is similarly outside Cranfield’s control. It is a commercial service designed for social networking, not for data storage, although data storage is a secondary feature. The terms note that the service can change, stop being free, or cease at any point (with no simple export of data, so uploaded content may be lost), so long-term preservation is clearly not its mission, which makes it unsuitable for many data outputs. Indeed, you may be in breach of copyright by posting items there (and the indemnification in the terms means you would pay any legal costs). Due to its mission of social networking amongst academics, not everyone can register to use it, so the audience your data would reach is restricted. The University of California blog post “A social networking site is not an open access repository” is worth a read for a more detailed discussion. ResearchGate shouldn’t be used to preserve RCUK-funded data, but it may be valuable as a networking tool to promote your outputs.
GitHub is a code development site with excellent features in that area, such as detailed git version control. Data and code can be stored here, and this may be appropriate while you are actively working on them. However, it is again an external system that may cease at any point and does not guarantee the security or return of any data on it. It also does not provide a DOI for items; a DOI is an RCUK requirement and a good idea in any case, because it ensures tracking of usage and citation stats. Especially if you are developing code, you may find that it is best to use GitHub throughout your project and then, when you need to publish a final (or interim) version of your code or dataset, move this output to CORD so that it gets a DOI and can be cited in the paper or papers it underpins. Integration between CORD and GitHub is being developed to make this process more seamless.
All in all, these are different systems intended for slightly different purposes, and you may benefit from using each of them in different situations. When it comes to long-term storage of your research outputs, we strongly recommend CORD if you have RCUK funding; it was implemented so that you have a secure and compliant repository, designed for preservation and visibility, that ensures you meet obligations around data management and sharing.
CC0 image from pixabay.com.