If you are mandated or able to share your research data, one immediate question is where to put it. This can be confusing, as there are a number of options available. As a general rule, if your funder provides a data repository, use that to ensure you meet your obligations (e.g. NERC research outputs should go to the NERC data centres). If not, but there is a trusted subject repository that is the norm for your domain, use that, because that’s where interested parties will look. If there’s neither a funder nor a subject repository, your main option is CORD, Cranfield Online Research Data. However, there are other sites out there that you may have heard about, so let’s have a quick look at the differences so you can choose the right one for your situation.
CORD uses the figshare platform, and Cranfield University has an institutional contract with figshare guaranteeing elements such as the location of the data storage (European servers with the security required by law and funders) and that our data is freely accessible to us at any point (so we can move it all in bulk without charge if we need to). CORD is designed to meet funder expectations easily: it provides an online record describing the data, and access to the files can be open or restricted. To meet another funder requirement, we are also looking into long-term digital preservation of CORD items so that they remain accessible and intelligible in decades to come; this might mean, for example, changing file formats to remove software dependencies. figshare is also optimised for Google, is included in the popular re3data index of research data repositories, and can be indexed by tools such as Jisc’s upcoming research data discovery service. All this maximises the visibility of your very valuable research outputs.
Mendeley Data is an Elsevier service that we have no contract with, so you should check the terms and conditions before using it. Currently, for example, the terms show certain restrictions on retrieval of data from the site, so it may be risky to share data only on this platform. Whilst Mendeley Data currently stores data in Germany, we don’t have an agreement in place to ensure that no data would be moved outside the EU, so be careful, especially if you’re working with personal data, as this could put you in breach of contracts. Mendeley Data has very good flexibility in access restrictions, but note that, unlike CORD, there is no institutional check. Our review step allows us, for example, to note whether data has commercial input and to verify contracts to ensure the access is appropriate for all partners. RCUK funders such as EPSRC also expect a central contact address to be used for requests to access data, rather than an individual researcher’s. If data is on Mendeley Data and Cranfield receives a request for access when the researcher is uncontactable or no longer at the University, we are in a sticky predicament, so publicly funded research data shouldn’t be held exclusively on Mendeley Data.
ResearchGate is similarly outside Cranfield’s control and is a commercial service designed for social networking, not for data storage, although that is a secondary feature. The terms note that the service can change, stop being free, or cease at any point (with no simple export of data, so uploaded content may be lost), so long-term preservation is clearly not its mission, which makes it unsuitable for many data outputs. Indeed, you may be in breach of copyright by posting items there (and the indemnification in the terms means you will pay any legal costs). Because its mission is social networking amongst academics, not everyone can register to use it, so the audience your data would reach is restricted. The University of California blog post “A social networking site is not an open access repository” is worth a read for a more detailed discussion. ResearchGate shouldn’t be used to preserve RCUK-funded data, but it may be valuable as a networking tool to promote your outputs.
GitHub is a code development site with excellent features in that area, such as detailed git version control. Data and code can be stored here, and this may be appropriate while you are actively working on them. However, it is again an external system that may cease at any point and does not guarantee the security or return of any data on it. It does not provide a DOI for items either; a DOI is an RCUK requirement and a good idea in any case, because it allows usage and citation stats to be tracked. Especially if you are developing code, you may find it best to use GitHub throughout your project and then, when you need to publish a final (or interim) version of your code or dataset, move this output to CORD so that it gets a DOI and can be cited in the paper(s) it underpins. Integration between CORD and GitHub is being developed to make this process more seamless.
All in all, these are different systems intended for slightly different purposes which you may benefit from using in different situations. When it comes to long-term storage of your research outputs, we strongly recommend CORD if you have RCUK funding; it was implemented so you have a secure and compliant repository, designed for preservation and visibility, that ensures you meet obligations around data management and sharing.
CC0 image from pixabay.com.