Case study B: The Charity Consultant
[This case study has been anonymised to avoid disclosing identifiable details about the specific projects referenced here and to protect the privacy of the relevant organisations.]
Researcher B is a project manager for a small charity that offers data consultancy services to other organisations in the third sector. They are responsible for providing end-to-end support to clients on diverse data-driven analytical projects, typically over six-month timelines. That support usually involves linking client charities with volunteers sourced from the private sector to provide practical answers to operational or strategic questions.
Clients of Researcher B often need to use data to inform their service targeting or enable their impact evaluations. A recurrent challenge affects many of Researcher B’s projects: publicly available data tends to lack the combinations of variables necessary to answer relatively straightforward questions about their target groups.
In one case, a London-based social mobility charity supporting disadvantaged young people from a particular ethnic background wanted to understand how their beneficiaries compared to similar young people across London. The aim was to construct a basic comparator group defined by ethnicity, location, age, and deprivation status, then compare educational and occupational outcomes for that group against the client’s service users’ own outcomes.
Researcher B identified that the Longitudinal Education Outcomes (LEO) dataset could have enabled this analysis. However, although they are an accredited researcher, access was not feasible in this case: LEO is held in a Trusted Research Environment (TRE) with strict access controls, and gaining access would have involved a lengthy application process. Waiting months at a time for access to be granted was unworkable in this case (and in many of Researcher B’s other projects) for all of the following reasons:
- Most of the project timeline would have elapsed before approval could have been given;
- Researcher B’s organisation specialises in working with charities that lack technical expertise and confidence in data or that are less familiar with the ways that data can add value to their work. It is difficult to persuade executives at such organisations that a lengthy process will pay off, and this is exacerbated by inherent uncertainty about whether applications by Researcher B’s organisation would even be approved;
- Researcher B’s team themselves did not necessarily have the capacity to act as the intermediary for such a complicated and resource-intensive process.
As a workaround, Researcher B considered the possibility that client charities asking similar questions might be able to “pool” their requests such that multiple projects could be serviced by a single application for the data. Ultimately, however, they were deterred from taking such an approach by clients’ timeline constraints (as outlined above) and the observation that charity funders are unlikely to support speculative, open-ended analytical proposals that may not materialise.
Instead of using LEO data, the project relied on publicly available Department for Education and labour market data. However, these sources did not allow the necessary cross-tabulations of the client’s desired characteristics. Ethnicity categories were often aggregated into a “non-white” category, and Researcher B found that the analysis needed “just one more variable” beyond what was publicly available. In the absence of linked, granular data, the comparator analysis was scaled back to simpler descriptive breakdowns that were less precise, informative, and robust than would otherwise have been possible. For the client charity, this meant their assessment of the impact of their interventions was severely limited.
This sort of challenge is common in Researcher B’s experience. A similar pattern emerged in work with a youth homelessness charity seeking to identify individuals at risk of homelessness in specific areas and among particular subgroups (e.g. by ethnicity, disability, and deprivation). Researcher B described government homelessness data as patchy, with limited coverage and insufficient intersectional breakdowns. As a result, the project relied heavily on other charities’ administrative data rather than official statistics, constraining the robustness and comparability of findings.
In summary, these examples reflect a broader structural mismatch between civil society needs and current data access arrangements. The analytical questions posed by charities are often not methodologically complex. However, they typically require the ability to combine multiple characteristics at a granular geographic level. Public datasets may contain relevant variables in isolation, but not in the combinations required for targeted intervention design or meaningful benchmarking.
Accredited access routes to linked administrative data exist, but they are time and resource intensive, uncertain to succeed, and poorly aligned with the pace and funding models of frontline organisations. Even when an intermediary has accredited researchers on staff (as in the case of Researcher B), the costs and delays associated with secure access can render projects unviable. Smaller organisations are therefore dependent on either scaling back their ambitions or proceeding with less robust analyses.
The result is a persistent evidence gap: operational charities, even when supported by volunteer data professionals, are often unable to benchmark their work against comparable populations or understand how outcomes differ across specific subgroups in their area. This limits their ability to target services effectively, demonstrate impact credibly, and design interventions informed by a full picture of local need.