Many funders and publishers now ask for research data to be shared at the end of projects, but researchers often, understandably, have concerns about making data open. Is it beneficial and worth the effort it takes? Is it even appropriate if the research involved human participants or commercial input?
What’s the benefit to sharing – for me?
Fundamentally, it’s good for your reputation. Making data open is a simple way to demonstrate scientific integrity, because it lets others validate your results, just as best practice already expects you to share your methodology. (Indeed, the Netherlands recently committed €8m to investigating the reproducibility crisis, a sign of growing concern over research conduct.)
Furthermore, your data is a valuable research output in its own right, so sharing it appropriately allows you to be fully credited for it (see our post on data citation). Several studies have also shown a robust citation benefit to open data*, so data sharing contributes to increased academic impact.
And finally, data sharing opens doors to collaboration within and across disciplines and institutions, with researchers around the globe. Combining datasets can enable valuable new research and discoveries (such as the Alzheimer's breakthrough).
What about data with commercial input or human participants?
Whilst funders and publishers often expect data to be shared, they want to maximise the public benefits of research, without damaging the research process. RCUK and others acknowledge that there are ethical, legal, and commercial constraints on the release of data, and these should be considered carefully. The best approach is to aim to make data open, then consider barriers and whether these can be eliminated; if not, what is the best access restriction to apply?
For example, perhaps only a subset of the data can be released, as the rest is highly sensitive; alternatively, EPSRC expectations would suggest that commercially confidential data might be made available subject to a suitable non-disclosure agreement. Or if some data can’t be open because commercialisation is in progress, then releasing data with an embargo may be an appropriate solution. You have the right to exploit your data first, and you must adhere to contracts with partners (which should be designed with data sharing expectations in mind).
More generally, restricting access to data is perfectly acceptable as long as the reasons are clearly justified, and those reasons should be set out from the start of the project in the data management plan. But barriers and restrictions can be confusing, so at Cranfield University we have a Research Data Manager who works with Contracts and the Research and Innovation Office, as well as with researchers, to ensure that a compliant level of access is used when publishing (or choosing not to publish) datasets. Just email firstname.lastname@example.org to start a conversation about what level of sharing is appropriate for your data.
*See other highlights from six studies on open data and citations in the Available Online blog post.
Image: Sharing by ryancr, CC-BY-NC 2.0.