Anonymising your research data
17/03/2017
In our last RDM blog post we discussed data protection in general; now let’s take a further look at anonymisation. The main message is to think about it early, planning well to ensure that data is safe, secure, and handled in line with the Data Protection Act and any funder expectations. Here are a few questions to ask yourself…
Is your consent right? Of course, one aspect is making sure your consent form covers which data will be shared and which destroyed; you must be careful with the wording. “Anonymised” data means it is impossible to identify an individual from that dataset, or from that dataset in conjunction with other available data. This is not easy to achieve, so we must be careful about using terms such as “anonymous” or “de-identified”, and about making guarantees. The ICPSR recommendations suggest stating exactly which data elements will be removed before sharing.
How will you collect and store personal data? Make sure you’re only collecting what is absolutely necessary, and that only authorised users have access to it. This may mean storing it on a device with a login (e.g. on your University network drive). Remember that if you make a backup on an external hard drive, for example, that device should also have a login.
If you’re collecting any sensitive personal data, you must also use encryption, in line with ICO recommendations; contact the IT Service Desk for installation and instructions on our encryption software. The definition of “sensitive personal data” is personal data relating to someone’s race/ethnicity, political or religious beliefs, trade union membership, physical/mental health, sexual life, or offences (committed or alleged).
Personal data, especially sensitive data, should not be stored on unsecured devices – mistakes can harm individuals and also be expensive (a police force was fined £120,000 for saving personal data on a memory stick which was then stolen!).
Which data needs sharing? Does your research have any funding from bodies that require you to preserve/share your data at the end of the project? For example, RCUK councils usually expect data to be preserved for 10 years after its last use, and shared as openly as possible, with any restrictions outlined in the early stages of the project and fully justified. Often, personal data can be shared as long as it is sufficiently anonymised; the UK Data Archive (usually the repository for ESRC projects) also offers a Secure Lab option, for controlled access to data that cannot be shared openly.
Are you deleting data securely? As soon as research no longer requires the identifiable portions of data, these should be removed and future research should use de-identified or anonymised data. The version containing personal information (the key) may need either destroying or retaining securely. If you are destroying it, contact the IT Service Desk for secure erasure, as simply deleting a file does not prevent it from being accessible on a device. Whilst the responsibility to ensure deletion of personal research data ultimately lies with the head of department, the nominated data manager for the project may be assigned the task of arranging deletion, depending on the responsibilities laid out in your data management plan.
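As a rough illustration of separating the identifiable portion from the data used for analysis, here is a minimal Python sketch. The field names, the `split_identifiers` function and the use of a random hexadecimal ID are all assumptions made for this example, not a University procedure.

```python
import secrets

# Assumed names of the directly identifying fields in a made-up dataset.
DIRECT_IDENTIFIERS = ("name", "address", "email")


def split_identifiers(records):
    """Return (working_rows, key_rows), linked only by a random participant ID."""
    working_rows, key_rows = [], []
    for record in records:
        # Random ID, not derived from anything about the person.
        participant_id = secrets.token_hex(8)
        key_row = {"participant_id": participant_id}
        working_row = {"participant_id": participant_id}
        for field, value in record.items():
            # Direct identifiers go to the key; everything else stays in the working data.
            target = key_row if field in DIRECT_IDENTIFIERS else working_row
            target[field] = value
        working_rows.append(working_row)
        key_rows.append(key_row)
    return working_rows, key_rows
```

In this sketch the key rows are the part that would need secure storage or secure erasure. While the key exists, the working rows are de-identified rather than anonymised, and any indirect identifiers left in them may still need attention.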
What anonymisation steps will you take? The UK Anonymisation Network has an excellent resource if you’re new to anonymisation: the anonymisation decision-making framework (pdf). You’ll probably need to think about removing direct identifiers (name, address, photos…) as well as removing/editing indirect identifiers, which can identify individuals when combined with other data (e.g. occupation and workplace). Some methods are:
- Reducing precision: e.g. instead of birth dates, just give birth years.
- Aggregating data: e.g. instead of job titles, use occupational categories (the ONS have a Standard Occupational Classification which can help); or instead of city, use geographical area.
- Hiding outliers: e.g. use income or age bands that do not highlight highest or lowest values, as these exceptional values can often identify individuals.
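The three methods above can be sketched in a few lines of Python. Everything here is illustrative: the field names, the income band edges and the tiny job-title lookup are invented for the example, and a real project would use a published scheme such as the Standard Occupational Classification rather than this made-up mapping.

```python
from datetime import date

# Hypothetical lookup from job title to a broad occupational category.
OCCUPATION_GROUPS = {
    "staff nurse": "Health professionals",
    "midwife": "Health professionals",
    "primary teacher": "Teaching professionals",
}

# Illustrative income bands as (upper limit, label); values at or above
# the last limit fall into an open-ended top band, which hides very high incomes.
INCOME_BANDS = [
    (20000, "under 20,000"),
    (40000, "20,000-39,999"),
    (60000, "40,000-59,999"),
]
TOP_BAND = "60,000 or more"


def occupation_group(job_title: str) -> str:
    """Aggregate a job title into a broad category ('Other' if unknown)."""
    return OCCUPATION_GROUPS.get(job_title.strip().lower(), "Other")


def income_band(income: float) -> str:
    """Replace an exact income with a band that does not expose extremes."""
    for upper, label in INCOME_BANDS:
        if income < upper:
            return label
    return TOP_BAND


def generalise_record(record: dict) -> dict:
    """Copy only the generalised fields; anything not listed here is dropped."""
    return {
        "birth_year": record["date_of_birth"].year,  # reduced precision
        "occupation_group": occupation_group(record["job_title"]),
        "income_band": income_band(record["income"]),
    }
```

Because `generalise_record` copies only the fields it names, direct identifiers such as a name never reach the output. Steps like these reduce the risk of identification but do not guarantee anonymity on their own; the combination of remaining fields still needs checking.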
Audio-visual anonymisation is harder, especially if trying to automate it, as Google demonstrated recently when blurring out a cow’s face! It is preferable to obtain informed consent for reuse of audio-visual material than to blur images or alter audio. However, you also need to be clear on how long the data will be kept, and on the process for withdrawing consent for the data to be reused.
Finally, is it worth anonymising your data? Let’s not forget that sometimes it isn’t! If the anonymisation work would be expensive, time-consuming, and greatly reduce the usefulness of the data, it is probably not worth doing. You would then need to consider whether the unanonymised version can be retained or reused under stricter access conditions. The UKAN framework compares data security to house security: if you make a fully secure house, with no doors or windows, its usefulness is reduced to nothing – there is always a balance to be found.