How do you determine data authorship?
05/12/2016

When a dataset is published online, the depositor must enter some information about it, including details of its authors. It seems an innocuous field to fill in, but determining data authorship can be complex – how do you choose which team members to list?
Well, what are the implications of data authorship? The data authors are the people who will be credited when the dataset is cited in publications or other further works, so it is important that all the people who deserve such credit are included. The authors might also be contacted if inaccuracies are found or there are queries about the data that need answering, so they should have the necessary familiarity with the work.
So who should be listed as data authors if there is a large research team? Everyone? Just the person who collected the data? What about the person who, for example, designed the questionnaire or experiment? Or did the quality checks on the data? Or spent time formatting and documenting it so that it could be shared for reuse? Or a senior colleague whose contribution to project conception, oversight and steering was crucial, but who never did any day-to-day data work?
Unfortunately, this post is not going to answer those questions, as there are no hard and fast rules. Generally, it is important that anyone who made a significant intellectual or practical contribution to the data’s creation is credited, but how this principle is applied may vary. Two questions come up frequently: whether to credit people who did the legwork of data collection but made no intellectual contribution to it, and whether to include those who steered the project but never worked with the data. It can be useful to consider what is standard in your domain, and any guidance from publishers or funders. Your data management plan should already set out which datasets will be published and where, so it is also a good place to clarify who will be listed as the authors. Agreeing this early avoids delays to publication caused by queries over who to name.
For further discussion of the issues, the Research Data Alliance and CODATA have just published a report on the Principles and Implementation Guidelines for Legal Interoperability of Research Data (pdf), with Principle Six (starting p.25) particularly relevant to the discussion on authorship. Do you think these are useful guidelines? Are there standard practices in your research community around data authorship, or are there questions that need discussing?
Image: Rock, Paper, Scissors by Jesse Kruger, CC-BY-NC 2.0, at https://www.flickr.com/photos/jessekruger/464375923/