Last week we ran our first webinar on ‘How to write a data management plan’ (a DMP). Writing a DMP is a relatively new requirement, though some research councils have been requesting them with funding proposals for several years. Recently Cranfield University made it mandatory for doctoral students (registered from 1 October 2016) to write a DMP for their work, to help them best plan an efficient project, and of course to learn the useful skill of writing one.
The webinar format made it quite different from the workshops: there were no hands-on exercises, but there was more time for questions and discussion of the content. You can see the session materials online or, of course, Cranfield researchers can sign up for a webinar or workshop in the DRCD diary for students or the L&D diary for staff. During the webinar, delegates asked some good questions, and two key discussion points emerged: choosing which data to share, and planning the timeframe.
It is often tricky to determine whether to share input and/or output data, raw and/or derived data. It can help to consider why you’re sharing data, and the points we discussed included:
- If it’s important that your research is reproducible and can be validated, the early input data could be the most important to share. However, if your research was really a modelling exercise, and the data is less important than the model/algorithm/software it was used to test, then sharing the data itself may not be essential.
- If the data has good re-use potential, it must of course be clearly understandable and accurate. If getting from raw to processed data involved significant transformation, where errors may occur, it may be more appropriate to share the raw data, or perhaps to clarify how the derived data was generated and what quality control procedures were in place. This is important anyway as a sign of transparent and robust research processes.
- Don’t forget about ethics – sometimes your decision is made simple by the fact that the raw data contains personal information and so cannot be shared, and your choice is made in line with data protection and consent agreements.
- You might also like to look at our data selection and appraisal checklist (pdf – internal only).
Regarding timeframes, how can you say for certain when you will publish data? You may want to publish articles first, and therefore not release your data too soon. This is a valid concern, and it highlights why DMPs are seen as living documents, which you may need to edit when reviewing them in project meetings. If you feel you can write multiple articles on your research, then you can plan to release the data in chunks, each one as it underpins an article being published. Similarly, if you realise you can commercialise the research, you may need to change your plans from releasing all data immediately on completion of the project to releasing it under an embargo, to allow for the exploration of commercialisation or patenting first.
Do you have advice from experience on either of these aspects? Are there any other areas you find particularly interesting in a data management plan?