In case you’ve not seen the social media discussions, this week has been Love Your Data Week 2017. It’s an international week to get people talking online about data management, sharing, preservation, and reuse; this year’s theme was data quality. See the full thread by searching #LYD17 on Twitter and feel free to keep the conversation going! Here’s a quick recap of the week’s discussions…
Monday: know your data quality. Top-quality data should be accurate, complete, current, fit-for-purpose, documented, and reproducible/verifiable. How do you address data quality in your discipline? What measures do you take to assure your data quality – do you have a checklist or set of criteria you can share? Of course, high-quality data can still appear poor if it is badly handled – what can we learn from Calling BS examples such as the musician mortality study?
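If you would like to turn one item of that checklist into something you can run, here is a minimal Python sketch of a "complete" check. It only reports what share of rows have a value in each column; the column names and sample rows are invented for illustration, and a real checklist would need far more than this.

```python
def completeness(rows, columns):
    """Return, for each column, the fraction of rows that have a non-empty value."""
    report = {}
    for col in columns:
        filled = 0
        for row in rows:
            value = row.get(col)
            # Treat None, empty strings, and whitespace-only strings as missing.
            if value is not None and str(value).strip() != "":
                filled += 1
        report[col] = filled / len(rows) if rows else 0.0
    return report


# Made-up example rows, as they might come out of csv.DictReader.
sample_rows = [
    {"site": "A", "temp_c": "12.5"},
    {"site": "B", "temp_c": ""},
    {"site": "C", "temp_c": "9.0"},
    {"site": "D", "temp_c": "11.1"},
]

report = completeness(sample_rows, ["site", "temp_c"])
for column, share in report.items():
    print(f"{column}: {share:.0%} complete")
```

A low percentage does not automatically mean poor data, but it is a prompt to document why values are missing.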
Tuesday: documenting, describing, defining. Good documentation tells people they can trust your data by enabling validation, replication, and reuse. It also makes the analysis and write-up stages of your project easier and less stressful, so do it as you go wherever possible. Some tips we looked at included a guide on using spreadsheets for scientific data – do you follow this advice? We also shared a video explaining data dictionaries, which are really useful for spreadsheet data or data containing lots of variables.
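To make the data dictionary idea a little more concrete, here is a rough Python sketch of one, plus a small helper that checks a spreadsheet row against it. Everything in it is hypothetical (the variable names, units, and limits belong to an imaginary field survey), so treat it as one possible layout rather than a standard format.

```python
# Hypothetical data dictionary: one entry per spreadsheet column.
DATA_DICTIONARY = {
    "site_id": {"description": "Sampling site code", "type": str, "units": None},
    "temp_c": {
        "description": "Water temperature",
        "type": float,
        "units": "degrees Celsius",
        "min": -5.0,
        "max": 40.0,
    },
    "species_count": {
        "description": "Number of species observed",
        "type": int,
        "units": "count",
        "min": 0,
    },
}


def check_row(row, dictionary=DATA_DICTIONARY):
    """Return a list of human-readable problems found in one row of string values."""
    problems = []
    # Columns that nobody has documented yet.
    for name in row:
        if name not in dictionary:
            problems.append(f"{name}: not described in the data dictionary")
    # Documented columns that are missing, mistyped, or out of range.
    for name, spec in dictionary.items():
        if name not in row or row[name] == "":
            problems.append(f"{name}: missing value")
            continue
        try:
            value = spec["type"](row[name])
        except (ValueError, TypeError):
            problems.append(f"{name}: expected {spec['type'].__name__}, got {row[name]!r}")
            continue
        if "min" in spec and value < spec["min"]:
            problems.append(f"{name}: below minimum {spec['min']}")
        if "max" in spec and value > spec["max"]:
            problems.append(f"{name}: above maximum {spec['max']}")
    return problems


good_row = {"site_id": "A1", "temp_c": "12.5", "species_count": "7"}
bad_row = {"site_id": "A1", "temp_c": "hot", "species_count": "-2"}

print(check_row(good_row))
print(check_row(bad_row))
```

Keeping the descriptions and units next to the checks means the same file documents the data for other people and catches slips while you are still collecting it.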
Wednesday: good data examples. We looked at the FAIR acronym: data should be Findable, Accessible, Interoperable, and Reusable. Data should absolutely be findable, to prevent it being lost, and this means keeping it in a trusted digital repository, not on a website or personal drive. This 2009 ScienceDirect article links to underlying data in section 2.6… to an empty folder. Oops! But don’t worry, this story has a happy ending – the author later deposited the data in a safe repository, Dryad. And if you need a repository for your own data, we have CORD for you to use.
Thursday: finding the right data. This theme looked at the joys of discovering data – have you been through the hassles described in The Patience of the Data Hunter? And what do you think of this guide to finding data? Do you have a preferred procedure or favourite sources?
Friday: rescuing unloved data. Securing legacy data takes time, resources, and expertise, but is worth the effort as it can enable new research. The steps tend to be: recover, inventory, organise, assess, describe, digitise, review, and deposit. We looked at a few stories of data rescue, such as the Data Refuge project in environmental studies, or this article on climate data rescue. Did you know that there is even an International Data Rescue Award in the Geosciences? Check out 2016’s winner! And of course there are many lessons to be learned from Rogue One – I certainly wouldn’t trust the Empire with my data!
We hope this week’s resources have been useful, but don’t think you have to wait until next February to chat about data issues – get in touch with us any time at firstname.lastname@example.org, tweet us @KNL_MIRC, or of course comment on the blog itself!
Images reused with permission from loveyourdata.wordpress.com