"Using linked administrative data to aid the handling of non-response and restore sample representativeness in cohort studies"

Date: Friday 21 April 2023, 2.00PM
Location: Online via Zoom
https://plymouth.zoom.us/j/99739909702
Local Group Meeting


There is growing interest in whether linked administrative data can aid analyses subject to missing data in cohort studies. Using linked 1958 National Child Development Study (NCDS) and Hospital Episode Statistics (HES) data, we applied a LASSO variable selection approach to identify HES variables that are predictive of non-response at the age 55 sweep of NCDS. We then included these as auxiliary variables in multiple imputation (MI) analyses to explore the extent to which they helped restore the sample representativeness of the respondents, together with the imputed non-respondents, both in terms of early life variables (mother’s husband’s social class at birth, cognitive ability at age 7) and relative to external population benchmarks (educational qualifications at age 55, marital status at age 55). We identified 10 HES variables that were predictive of non-response at age 55 in NCDS. For example, cohort members who had been treated for adult mental illness were more than 70% more likely to be non-respondents (risk ratio 1.73; 95% confidence interval 1.17, 2.51). Including these HES variables in MI analyses helped to restore sample representativeness only to a limited extent. Furthermore, there was essentially no additional gain in sample representativeness relative to analyses using only previously identified survey predictors of non-response (i.e. NCDS rather than HES variables). Since we are some of the first people to use this linked data resource, I will also take a detour into examining the linkage quality and sample representativeness.
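
For readers unfamiliar with how a figure such as "risk ratio 1.73; 95% confidence interval 1.17, 2.51" is read, the following is a minimal sketch of a risk ratio with a Wald interval computed on the log scale from a 2x2 table. It is an illustration only: the function name `risk_ratio_ci` and the counts are hypothetical and are not taken from NCDS or HES, and the abstract does not state how the reported estimate was obtained (it may well come from a regression model rather than a simple table).

```python
import math


def risk_ratio_ci(exposed_events, exposed_total,
                  unexposed_events, unexposed_total, z=1.96):
    """Risk ratio and Wald confidence interval computed on the log scale.

    Hypothetical helper for illustration; not the method used in the talk.
    """
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / unexposed_events - 1 / unexposed_total
    )

    # The interval is symmetric around log(RR), then mapped back with exp.
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts (NOT study data): 30 of 100 "exposed" cohort members
# are non-respondents, versus 200 of 1000 "unexposed" cohort members.
rr, lower, upper = risk_ratio_ci(30, 100, 200, 1000)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

A risk ratio above 1 whose interval excludes 1, as in the abstract's example, indicates that the group in question was more likely to be non-respondents than the comparison group.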

Venue: https://plymouth.zoom.us/j/99739909702

 
Speaker: Dr Richard Silverwood (University College London)
 
Contact: Malgorzata Wojtys