STEPs to working ethically with social media data
11/04/2017

Social media posts can be an appealing source of data for researchers to analyse, for example in studies that use location-based posts to map urban activity or Twitter data to detect epidemic outbreaks. However, the fact that data is in the public domain doesn’t make it fair game to capture and republish. This was demonstrated last year by the debate that followed the publication of unanonymised OkCupid data, which was subsequently retracted. Researchers must research not only legally but also ethically, asking themselves what impact their research has on the participants involved.
Recently the STEP (Sensitivity, Transparency, Expectation of privacy, Platform) Framework was developed for curating and sharing social media data. It is intended to facilitate open publication of social media data and offers structured guidance around how to work ethically with such data, in order to improve practice and manage risk. Let’s look at an overview.
- Sensitivity:
  - Is the information being studied of a sensitive nature? (For example, research may be threatening to subjects if it concerns deeply personal experiences, social control, the interests of powerful persons, or subjects sacred to participants.)
  - Are the research subjects from vulnerable populations, whether because of developmental problems, social status, age, or their neighbourhood or environment? If so, the data should also be considered sensitive.
- Transparency:
  - Is there sufficient documentation to make the data reusable and the collection methods transparent? It should cover the data collection methodology, anonymisation processes, and ethical considerations, and provide readme file(s) to ensure the data is understandable. It is important for researchers to be transparent about their processes, not only to facilitate data reuse and openness, but also to help foster ‘privacy literacy’ so users can make informed decisions about participating.
- Expectation of privacy:
  - Did subjects have an expectation of privacy? While social media posts are in public domains, users (especially private citizens as opposed to politicians, celebrities, or organisations) may not expect their posts to be seen beyond their perceived online community, especially when the posts are, for example, @-mentions on Twitter. You might also consider the names being used: some sites allow pseudonyms where others require real names.
  - Was consent obtained for research and/or data sharing? There is no universally agreed rule on the level of consent required for social media research, so it is worth keeping up to date with developments to inform decisions about publishing data. If consent isn’t obtained, the data may need to be treated as more sensitive and access to it perhaps controlled.
  - Are the data (or can the data be) properly anonymised? Most data repositories require submitted data to be de-identified. However, this may be difficult, and you may be able to argue it is unnecessary if the subjects would not have had an expectation of privacy.
- Platform:
  - Are the data in keeping with the policies of the social media platform? Some social media sites’ terms of service limit what can be published. For example, Twitter’s Developer Agreement and Policy states that API users “will only distribute or allow download of Tweet IDs and/or User IDs”; this also ensures that data become inaccessible should the Twitter user change their privacy settings or delete a tweet. However, if tweet content is the evidence for your findings and is required for reproducibility, you may want to open a dialogue with the platform provider to enable certain data to be shared.
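To make the Platform point a little more concrete, here is a minimal sketch of reducing a collected dataset to Tweet IDs and User IDs before sharing it. It is an illustration only: the record layout (`id`, `user`, `text`), the sample records, and the `dehydrate` helper are all assumptions made up for this example, not part of the STEP framework or of any Twitter library.

```python
import json


def dehydrate(tweets):
    """Keep only the tweet ID and user ID from each collected record.

    Assumes each record is a dict with an "id" key and a nested
    "user" dict that also has an "id" key (a hypothetical layout).
    """
    return [
        {"tweet_id": str(t["id"]), "user_id": str(t["user"]["id"])}
        for t in tweets
    ]


# Hypothetical sample records, invented purely for illustration.
collected = [
    {"id": 101, "user": {"id": 7, "screen_name": "example_a"}, "text": "First example post"},
    {"id": 102, "user": {"id": 8, "screen_name": "example_b"}, "text": "Second example post"},
]

# The shareable version drops post text and screen names entirely.
shareable = dehydrate(collected)
print(json.dumps(shareable, indent=2))
```

Anyone reusing the shared file would need to retrieve the full posts from the platform themselves, so posts that have since been deleted or made private would no longer be available to them.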
I hope this has given you some food for thought. You can read the full practice paper, “Sharing selves”, on the MSU repository for more detail, including two useful case studies that put the STEP considerations into action. Do you work with social media data and have any other guidance or tips to share?