Event report: Why we should all be data quality detectives - workshop held at the RSS 2025 Conference

 

Abstract

This workshop was held on 4 September 2025 at the Royal Statistical Society (RSS) annual conference in
Edinburgh, and was attended by approximately 40 participants. Its aim was to discuss strategies for elevating
the profile of data quality work among statisticians and data scientists. The structured, round-table discussion
acknowledged that data quality is frequently overlooked due to competing interests, miscommunication, lack of
incentives, and confusion. Key consensus points included the necessity for transparency, a no-blame culture, and prioritising education. Participants were invited to provide contact details to participate in planning next steps.

Introduction

The workshop was held at the 2025 RSS conference in Edinburgh, during an 80-minute session entitled “Why
we should all be data quality detectives”. The workshop was preceded by five short talks which focused on
the often-overlooked yet crucial realm of data quality, and demonstrated how (contrary to common perception)
methods for checking data quality can be both empowering and fun.

The main aim of the workshop was to discuss strategies for elevating the profile of data quality work and
spreading the word about its vital importance in the statistical community and beyond.

Methods

The approximately 40 attendees of the workshop were split into six groups; each group was given a large sheet of paper and allocated 20 minutes to answer these questions:

  • Why does data quality get overlooked?

  • What strategies can be used to raise the profile of data quality?

A representative from each table then summarised and discussed the responses.

Results and Findings

The questions elicited a lively discussion, with the second question (seeking solutions) generating the most input. The answers gathered are detailed in Appendix 1.

Summary of the suggestions

Why does data quality get overlooked?
The main theme running through the answers was the lack of an organisational data quality strategy and training, compounded by poor communication and sometimes confusion about responsibilities. Added to this were time pressures and the low priority given to data quality tasks, which are often approached with a lack of curiosity.
 
What strategies can be used to raise the profile of data quality?
The discussion yielded two primary recommendations: increased training and greater transparency regarding data quality errors. A suggestion was made for the RSS to showcase cases of data disasters averted (or incurred) – perhaps through a column in Significance. Analysts should be rewarded for quality work and error detection, encouraged to dedicate time to understanding the data’s narrative and to challenging the data, and supported in embedding best practices from the start.

Conclusions

Key takeaways from the workshop were:
  1. Data quality is given attention but not always the right attention, and it may not be given high enough priority due to the lack of a data quality strategy and competing pressures.
  2. More education and training are needed, as is much greater transparency when things go wrong. It is evident that a culture shift is required.

Appendix 1. List of answers supplied by the workshop participants

Why does data quality get overlooked?

  • Data quality is given attention but not always the right attention

  • For important outcomes, it is sometimes hard to know what is actually right

  • Data collectors don’t always know what the data will be used for

  • Consultancy work – who owns the data problems?

  • There may have been challenges accessing the data in the first place

  • Reproducibility of outputs can also mean reproduction of errors!

  • If data passes through multiple hands, more steps/chances to go wrong. Need to check at every stage.

  • Analysts are often secondary users of the data and so assume it has already been checked etc.

  • There are often misconceptions about the time and resource required for quality

  • Publication pressures and competing priorities for resources

  • Lack of curiosity

Strategies to improve and raise the profile of data quality

The suggestions are grouped for clarity.

Improving processes and training

  • Need to start this understanding early (i.e. at school level)

  • Use skills from across organisation and/or get external training to upskill

  • Training would help people understand why their role is so important in research

  • A need for data entry guidance/instructions – this seems trivial and can be tedious, so it is often overlooked

  • Continuous process improvement

Governance, Strategy, and Culture

  • Need for transparency and a willingness to admit errors

  • Data amnesty – a no-blame culture

  • RSS could show good examples of disasters averted

  • Acceptance that mistakes are normal practice

  • Emphasise reputational risk – for example, from post-publication corrections

  • Publishing with openness and showing limitations

  • Need to stay up to date and use more modern systems

  • Ringfence time to do data quality work as part of the project

  • We need to properly factor in the time to investigate quality

  • When commissioning new work, build data sharing into contract

  • Feedback to data collectors should be positive and corrective

  • Feedback loops/shared ownership

  • Could there be financial incentives?

  • Can we create some useful metrics?

  • Aim for good quality as a minimum.

  • Consider the narrative from the data collected.

  • Automatic queries are useful in achieving good quality.

  • Use of AI – more eyes needed on real-world data

  • Systems that can actually talk to one another!

  • Important to understand the background of the data.

  • Reward for doing quality work and finding errors.

  • Encourage people to challenge the data and learn best practice from the get-go.

  • Need to connect to the lifecycle of the data.

  • Incorporate peer reviews as much as possible – this can take many forms and levels of formality.

  • Keeping metadata and documentation up to date helps new people in the team.

Report written by A. Rosemary Tate, Roger Halliday
