Data documentation – what and why

Just as a map becomes much easier to read when you have a legend explaining the symbols, data is much easier to comprehend when it comes with descriptive and contextual information. For example, while you’re using your research data, you’ll know what your variable names are short for, and what the different missing value codes mean. But if you want to reuse it in a few years’ time, will you still be certain? Or if another researcher uses your data, will they be able to understand it without needing to ask you questions? Maybe you’ve experienced this frustration yourself: you wanted to incorporate existing data into your research, but it wasn’t made available with sufficient information and you had to spend time contacting the creator.
Making documentation available alongside your data is therefore a key step to ensuring its long-term usability and value, both for you and others (which might lead to extra citations). Documentation can be as simple as one text file, in which you’ve noted all the key contextual information about the data and the project. Some elements you might want to include are:
- Study-level information
  - Data collection methods, such as instruments used, sampling methods, scales, geographic/temporal coverage, etc.
  - Data processing methods, such as validation, checking, cleaning, calibration procedures, etc.
- Data-level information
  - Names, labels, descriptions, and units for all variables and fields.
  - Explanation of codes, including reasons for missing values (not applicable/not provided/not recorded/error/etc).
With data-level information, you may be able to include enough information in the data files themselves (e.g. in table headings in a spreadsheet). Otherwise, you might want to store a separate explanatory file alongside your data file(s). You could use our template readme.txt file to note all the necessary elements you want to include, perhaps adding to it as you work through the project.
Or you might benefit from writing what’s called a “data dictionary”, basically a legend to your data. This great blog post on data dictionaries explains it in a nutshell, and highlights the benefits, whether sharing data with others (in or outside your project team) or just your future self. And remember that if you’ve entered this information into a tool such as SPSS, you should be able to export it to a file to store with your data (e.g. in SPSS v23, File > Display Data File Information > Working File > print, and print to pdf, or use the File > Export option in many screens to export to a more usable format such as txt).
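To make the idea of a data dictionary a little more concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the variable names (resp_id, temp_c, q1_sat), the missing value codes (-98, -99) and the four column headings are assumptions, not a standard, and not taken from our template. It simply shows one way you could keep the names, labels, units and codes together in a small table, save that table as a CSV file next to your data, and check that no column has been left undocumented.

```python
import csv
import io

# Hypothetical data dictionary: one entry per variable in the dataset.
# The variable names and missing value codes are invented for this sketch.
DATA_DICTIONARY = [
    {"variable": "resp_id", "label": "Respondent identifier",
     "units": "", "codes": ""},
    {"variable": "temp_c", "label": "Air temperature at site",
     "units": "degrees Celsius", "codes": "-99 = not recorded"},
    {"variable": "q1_sat", "label": "Overall satisfaction",
     "units": "scale 1-5",
     "codes": "1 = very dissatisfied; 5 = very satisfied; "
              "-98 = not applicable; -99 = not provided"},
]

# Column headings for the dictionary file (an assumed layout, not a standard).
FIELDS = ["variable", "label", "units", "codes"]


def dictionary_to_csv(entries):
    """Return the data dictionary as CSV text, ready to save next to the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(data_columns, entries):
    """List the columns in the data file that have no dictionary entry."""
    documented = {entry["variable"] for entry in entries}
    return [name for name in data_columns if name not in documented]


if __name__ == "__main__":
    # Print the dictionary; in practice you would write this text to a file.
    print(dictionary_to_csv(DATA_DICTIONARY))
    # Compare against the column headings actually present in the data file.
    print(undocumented_columns(
        ["resp_id", "temp_c", "q1_sat", "notes"], DATA_DICTIONARY))
```

A plain CSV is used here only because it opens in almost any tool; a spreadsheet or a section of your readme.txt would serve the same purpose. The check at the end is optional, but it is a cheap way to spot variables you added late in the project and forgot to describe.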
So, when depositing your data in a repository, remember to include sufficient documentation for others (or your future self!) to understand it without having to ask questions.
Image: City map by Mathieu Pellerin, CC-BY-NC-SA 2.0.