Event report: Estimating incidence of co-infection

The West Midlands Local Group was pleased to host an event with Dr Simon Spencer, a reader in statistics at the University of Warwick, on estimating incidence of co-infection from emergency department serosurvey data. 
Simon introduced his colleagues, including Oliver Baedeker (OL), the subject area expert who answered many questions about the origin and quality of data. 
The viruses OL is interested in are hepatitis C (HCV), which is blood borne; HIV, which is both sexually transmitted and blood borne; and herpes simplex virus type 2 (HSV), which is sexually transmitted. The aim is to use understanding of HCV and HSV to understand changes in behaviour, and how co-infection affects transmission dynamics. This requires estimates of incidence from prevalence, of how incidence changes over time and by age, and of why the risk of HIV is changing.
The data come from patients at the emergency department of Johns Hopkins Hospital in Baltimore, Maryland, which serves a marginalised inner-city population. Sex, age and ethnicity are recorded, and blood samples provide disease status in 2003, 2007, 2013 and 2016. Disease prevalences in this population are high, at 20% to 30%. Prevalence shifts towards older ages across the surveys, which might reflect ageing or a change in the risk of infection.
The co-infection model is represented by a cube, with the combinations of infections at the corners and the transitions between them along the edges. People start with no disease and can eventually progress to all three. Infection with one virus is associated with the routes of transmission and with other covariates, so, given one disease, the transition probabilities for the others are modified accordingly. The differential equations for change also take into account the death rates from the conditions.
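To make the cube structure concrete, here is a minimal Python sketch of an eight-state model of this kind. It is not the model presented at the event: the hazard values, the multiplier matrix, the excess death rates and the forward Euler integrator are all illustrative assumptions, as are the function names.

```python
from itertools import product

# Corners of the cube: (hcv, hiv, hsv), each 0 (uninfected) or 1 (infected).
STATES = list(product((0, 1), repeat=3))


def edges():
    """Return the cube edges as (from_state, to_state, virus_index)."""
    out = []
    for s in STATES:
        for k in range(3):
            if s[k] == 0:
                t = s[:k] + (1,) + s[k + 1:]
                out.append((s, t, k))
    return out


def derivatives(p, base_rate, multiplier, death_rate):
    """Right-hand side of the eight ODEs (all rates are hypothetical).

    p[s]             proportion of the cohort in state s
    base_rate[k]     hazard of acquiring virus k when otherwise uninfected
    multiplier[j][k] factor applied to that hazard when already infected with j
    death_rate[k]    excess mortality contributed by virus k
    """
    dp = {s: 0.0 for s in STATES}
    for s, t, k in edges():
        rate = base_rate[k]
        for j in range(3):
            if s[j] == 1:
                rate *= multiplier[j][k]
        flow = rate * p[s]
        dp[s] -= flow
        dp[t] += flow
    for s in STATES:
        mu = sum(death_rate[k] for k in range(3) if s[k] == 1)
        dp[s] -= mu * p[s]
    return dp


def integrate(p0, years, base_rate, multiplier, death_rate, dt=0.01):
    """Forward Euler integration; a simple stand-in for a proper ODE solver."""
    p = dict(p0)
    for _ in range(int(round(years / dt))):
        dp = derivatives(p, base_rate, multiplier, death_rate)
        p = {s: p[s] + dt * dp[s] for s in STATES}
    return p
```

Starting a cohort entirely in state `(0, 0, 0)` and calling `integrate` gives the proportion at each corner after a given number of years. With non-zero death rates the proportions no longer sum to one, because people leave the cube through death.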
The mathematical models need to connect to the data and to the epidemiology, which means accounting for age, sex, ethnicity and the different survey times. A multinomial model was used, with numbers in age cohorts by year. The data structure means that the model has to be fitted separately to each combination of age, sex and ethnicity (black and white). Although there are about a thousand patients per survey, there are not enough data to give each year of age (54 ages) a separate parameter, so the estimates were smoothed by age and time using Bayesian methods. The model is also unidentifiable, as one cannot identify both the infection rates and the death rates. Death rates were therefore taken from the literature, which is not easy. Comparisons of incidence between groups will be fairly robust, but the absolute values are less certain.
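As an illustration of how a multinomial model links state proportions to survey counts, the following Python sketch computes a multinomial log-likelihood for one age cohort in one survey year. The function names are hypothetical, and normalising the proportions among survivors is an assumption made here because only living patients can be sampled; the talk did not give this level of detail.

```python
import math


def normalise(proportions):
    """Rescale state proportions to sum to one (assumed: condition on survival)."""
    total = sum(proportions)
    return [p / total for p in proportions]


def multinomial_loglik(counts, probs):
    """Log-likelihood of observed counts per infection state.

    counts[i] number of sampled patients observed in state i
    probs[i]  model probability of state i (must sum to one)
    """
    n = sum(counts)
    ll = math.lgamma(n + 1)
    for c, p in zip(counts, probs):
        ll -= math.lgamma(c + 1)
        if c > 0:
            ll += c * math.log(p)
    return ll
```

Summing this quantity over cohorts and survey years would give the log-likelihood for one sex and ethnicity group, to which a smoothing prior over age and time could then be added.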
There are eight ordinary differential equations and 1455 parameters for each ethnicity and sex combination, so the computing time was very substantial. The results provide quite nice incidence surfaces. Year of birth is important, but year of study is not, so behaviour needs to be addressed early. The various interpretations of several interesting graphs were discussed.