Anonymising your research data
17/03/2017

In our last RDM blog post we discussed data protection in general; now let’s take a further look at anonymisation. The main message is to think about it early, planning well to ensure that data is safe, secure, and handled in line with the Data Protection Act and any funder expectations. Here are a few questions to ask yourself…
Is your consent right? One aspect is making sure your consent form states which data will be shared and which will be destroyed, so be careful with the wording. “Anonymised” data means it is impossible to identify an individual from that dataset, or from that dataset in conjunction with other available data. This is hard to achieve, so be wary of using terms such as “anonymous” or “de-identified”, and of making guarantees. The ICPSR recommendations suggest stating exactly which data elements will be removed before sharing.
How will you collect and store personal data? Make sure you’re only collecting what is absolutely necessary, and that only authorised users have access to it. This may mean storing it on a device with a login (e.g. on your University network drive). Remember that if you make a backup on an external hard drive, for example, that device should also have a login.
If you’re collecting any sensitive personal data, you must also use encryption, in line with ICO recommendations; contact the IT Service Desk for installation and instructions on our encryption software. The definition of “sensitive personal data” is personal data relating to someone’s race/ethnicity, political or religious beliefs, trade union membership, physical/mental health, sexual life, or offences (committed or alleged).
Personal data, especially sensitive data, should not be stored on unsecured devices – mistakes can harm individuals and also be expensive (a police force was fined £120,000 for saving personal data on a memory stick which was then stolen!).
Which data needs sharing? Does your research have any funding from bodies that require you to preserve/share your data at the end of the project? For example, RCUK councils expect data to be preserved, usually for 10 years after its last use, and shared as openly as possible, with any restrictions outlined in the early stages of the project and fully justified. Often, personal data can be shared as long as it is sufficiently anonymised; the UK Data Archive (usually the repository for ESRC projects) also offers a Secure Lab option, for controlled access to data that cannot be shared openly.
Are you deleting data securely? As soon as the research no longer requires the identifiable portions of the data, these should be removed, and future research should use de-identified or anonymised data. The version containing personal information (the key) may need either destroying or retaining securely. If you are destroying it, contact the IT Service Desk for secure erasure, as just deleting a file will not prevent it being accessible on the device. Whilst the responsibility to ensure deletion of personal research data ultimately lies with the head of department, the nominated data manager for the project may be assigned the task of arranging deletion, depending on responsibilities laid out in your data management plan.
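To make the idea of separating the identifiable portion (the key) from the working data more concrete, here is a minimal Python sketch. The field names in `DIRECT_IDENTIFIERS`, the `participant_code` column and the `split_identifiers` function are assumptions made up for this illustration, not part of any University tool. It only shows the split itself: the key would still need storing securely (or erasing securely), and the remaining fields may still contain indirect identifiers.

```python
import secrets

# Assumed field names for direct identifiers; adjust to your own dataset.
DIRECT_IDENTIFIERS = ("name", "address", "email")


def split_identifiers(records):
    """Split records into a key table and a de-identified table.

    Returns (key, deidentified): `key` maps a random participant code to the
    direct identifiers; `deidentified` holds the remaining fields plus the code.
    """
    key = {}
    deidentified = []
    for record in records:
        # Random code, not derived from anything about the person.
        code = secrets.token_hex(8)
        while code in key:
            code = secrets.token_hex(8)
        key[code] = {f: record[f] for f in DIRECT_IDENTIFIERS if f in record}
        rest = {f: v for f, v in record.items() if f not in DIRECT_IDENTIFIERS}
        deidentified.append({"participant_code": code, **rest})
    return key, deidentified
```

Keeping the two tables in separate, access-controlled locations means that day-to-day analysis only ever touches the de-identified version.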
What anonymisation steps will you take? The UK Anonymisation Network has an excellent resource if you’re new to anonymisation: the anonymisation decision-making framework (pdf). You’ll probably need to think about removing direct identifiers (name, address, photos…) as well as removing/editing indirect identifiers, which can identify individuals when combined with other data (e.g. occupation and workplace). Some methods are:
- Reducing precision: e.g. instead of birth dates, just give birth years.
- Aggregating data: e.g. instead of job titles, use occupational categories (the ONS have a Standard Occupational Classification which can help); or instead of city, use geographical area.
- Hiding outliers: e.g. use income or age bands that do not highlight highest or lowest values, as these exceptional values can often identify individuals.
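As a rough illustration of these three methods, here is a short Python sketch. The record fields, the job-title mapping and the income band edges are all invented for the example (a real project might base the categories on the ONS Standard Occupational Classification), and applying these steps does not by itself guarantee that a dataset is anonymised.

```python
from datetime import date

# Hypothetical mapping from job titles to broader occupational categories.
OCCUPATION_GROUPS = {
    "staff nurse": "Health professionals",
    "gp": "Health professionals",
    "primary school teacher": "Teaching professionals",
}

# Illustrative band edges (upper bound, label), chosen for this example only.
INCOME_BANDS = [
    (20_000, "under 20,000"),
    (40_000, "20,000-39,999"),
    (60_000, "40,000-59,999"),
]
TOP_INCOME_BAND = "60,000 and over"


def birth_year(birth_date: date) -> int:
    """Reduce precision: keep only the year of a birth date."""
    return birth_date.year


def occupation_group(job_title: str) -> str:
    """Aggregate: replace a job title with a broader category."""
    return OCCUPATION_GROUPS.get(job_title.strip().lower(), "Other")


def income_band(income: float) -> str:
    """Hide outliers: open-ended bottom and top bands mask extreme values."""
    for upper, label in INCOME_BANDS:
        if income < upper:
            return label
    return TOP_INCOME_BAND


def anonymise(record: dict) -> dict:
    """Build a new record with only coarsened fields; direct identifiers are dropped."""
    return {
        "birth_year": birth_year(record["birth_date"]),
        "occupation_group": occupation_group(record["job_title"]),
        "income_band": income_band(record["income"]),
    }
```

For example, a record with a birth date of 17 May 1984, the job title "Staff Nurse" and an income of 250,000 would come out with birth year 1984, the group "Health professionals" and the band "60,000 and over", so the unusually high income no longer stands out.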
Audio-visual anonymisation is harder, especially if you try to automate it, as Google demonstrated recently when blurring out a cow’s face! It is preferable to obtain informed consent for reuse of audio-visual material rather than blurring images or altering audio. However, you also need to be clear on how long the data will be kept, and on the process for withdrawing consent for the data to be reused.
Finally, is it worth anonymising your data? Let’s not forget that sometimes it isn’t! If the anonymisation work would be expensive, time-consuming, and greatly reduce the usefulness of the data, it is probably not worth doing. You would then need to consider whether the unanonymised version can be retained or reused under stricter access conditions. The UKAN framework compares data security to house security: if you make a fully secure house, with no doors or windows, its usefulness is reduced to nothing – there is always a balance to be found.