Discussion meetings are events where articles ('papers for reading') appearing in the Journal of the RSS are presented and discussed. The discussion and authors' replies are then published in the relevant Journal series.
Read more about our discussion meetings, including guidelines for papers for discussion.
Contact Judith Shorten if you would like to make a written contribution to a discussion meeting or join our mailing list for an early invitation to future meetings.
Next Discussion meeting
Statistical Aspects of the Covid-19 Pandemic
2nd Multi-paper Discussion Meeting
Thursday, June 16, 2022, 4-6pm BST
Taking place at the Royal Statistical Society, Errol Street, London EC1Y 8LX
Organised by the RSS Discussion Meetings Committee
Paper 1: ‘Bayesian semi-mechanistic modelling of COVID-19: identifiability, sensitivity, and policy implications’
Download the preprint.
Authors: Samir Bhatt, Neil Ferguson, Seth Flaxman, Axel Gandy, Swapnil Mishra, James Scott
We propose a general Bayesian approach to modeling epidemics such as COVID-19. The approach grew out of specific analyses conducted during the pandemic, in particular an analysis concerning the effects of non-pharmaceutical interventions (NPIs) in reducing COVID-19 transmission in 11 European countries. The model parameterizes the time varying reproduction number Rt through a regression framework in which covariates can e.g. be governmental interventions or changes in mobility patterns. This allows a joint fit across regions and partial pooling to share strength. This innovation was critical to our timely estimates of the impact of lockdown and other NPIs in the European epidemics, whose validity was borne out by the subsequent course of the epidemic. Our framework provides a fully generative model for latent infections and observations deriving from them, including deaths, cases, hospitalizations, ICU admissions and seroprevalence surveys. One issue surrounding our model’s use during the COVID-19 pandemic is the confounded nature of NPIs and mobility. We use our framework to explore this issue. We have open sourced an R package epidemia implementing our approach in Stan. Versions of the model are used by New York State, Tennessee and Scotland to estimate the current situation and make policy decisions.
Paper 2: ‘A sequential Monte Carlo approach for estimation of time-varying reproduction numbers for Covid-19’
Download the preprint.
Authors: Geir Storvik, Alfonso Diz-Lois Palomares, Solveig Engebretsen, Gunnar Rø, Kenth Engo-Monsen, Anja Kristoffersen, Birgitte De Blasio, Arnoldo Frigessi
The Covid-19 pandemic has required most countries to implement complex sequences of non-pharmaceutical interventions, with the aim of controlling the transmission of the virus in the population. To be able to take rapid decisions, a detailed understanding of the current situation is necessary. Estimates of time-varying, instantaneous reproduction numbers represent a way to quantify the viral transmission in real time. They are often defined through a mathematical compartmental model of the epidemic, like a stochastic SEIR model, whose parameters must be estimated from multiple time series of epidemiological data. Because of very high dimensional parameter spaces (partly due to the stochasticity in the spread models) and incomplete and delayed data, inference is very challenging. We propose a state space formalisation of the model and a sequential Monte Carlo approach that allow us to estimate a daily-varying reproduction number for the Covid-19 epidemic in Norway with sufficient precision, on the basis of daily hospitalisation and positive test incidences. The method was in regular use in Norway during the pandemic and appears to be a powerful instrument for epidemic monitoring and management.
Past Discussion Meetings
Paper: ‘Vintage Factor Analysis with Varimax Performs Statistical Inference’
Authors: Karl Rohe and Muzhe Zeng, University of Wisconsin-Madison, USA
Took place on Wednesday, 11 May 2022 3-5pm (BST)
Psychologists developed Multiple Factor Analysis to decompose multivariate data into a small number of interpretable factors without any a priori knowledge about those factors [Thurstone, 1935]. In this form of factor analysis, the Varimax factor rotation redraws the axes through the multidimensional factors to make them sparse and thus make them more interpretable [Kaiser, 1958].
Charles Spearman and many others objected to factor rotations because the factors seem to be rotationally invariant [Thurstone, 1947, Anderson and Rubin, 1956]. These objections are still reported in all contemporary multivariate statistics textbooks. However, this vintage form of factor analysis has survived and is widely popular because, empirically, the factor rotation often makes the factors easier to interpret. We argue that the rotation makes the factors easier to interpret because, in fact, the Varimax factor rotation performs statistical inference.
We show that Principal Components Analysis (PCA) with the Varimax axes provides a unified spectral estimation strategy for a broad class of semi-parametric factor models, including the Stochastic Blockmodel and a natural variation of Latent Dirichlet Allocation (ie 'topic modeling'). In addition, we show that Thurstone’s widely employed sparsity diagnostics implicitly assess a key leptokurtic condition that makes the axes statistically identifiable in these models. Taken together, this shows that the know-how of Vintage Factor Analysis performs statistical inference, reversing nearly a century of statistical thinking on the topic. We illustrate the use of these techniques on two large bibliometric examples (a citation network and a text corpus). With a sparse eigensolver, PCA with Varimax is both fast and stable. Combined with Thurstone’s straightforward diagnostics, this vintage approach is suitable for a wide array of modern applications.
Download the preprint
The paper will be published in the Journal of the Royal Statistical Society, Series B.
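As a rough illustration of the Varimax rotation discussed in the abstract above, the following sketch applies Kaiser's classical pairwise (planar) rotations to a loading matrix. It is not the authors' implementation: the paper combines PCA with Varimax and a sparse eigensolver, whereas this example only shows the rotation step on a small dense matrix. The function name `varimax` and its parameters `max_sweeps` and `tol` are hypothetical choices made for this example.

```python
import numpy as np

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Rotate a (p x k) loading matrix towards sparsity with pairwise rotations.

    Hypothetical helper for illustration; returns the rotated loadings and
    the accumulated orthogonal rotation matrix.
    """
    rotated = np.array(loadings, dtype=float)
    p, k = rotated.shape
    rotation = np.eye(k)
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = rotated[:, i], rotated[:, j]
                u = x**2 - y**2
                v = 2.0 * x * y
                a, b = u.sum(), v.sum()
                c = (u**2 - v**2).sum()
                d = 2.0 * (u * v).sum()
                # Closed-form angle maximising the varimax criterion
                # within the plane spanned by columns i and j.
                angle = 0.25 * np.arctan2(d - 2.0 * a * b / p,
                                          c - (a**2 - b**2) / p)
                cos, sin = np.cos(angle), np.sin(angle)
                plane = np.array([[cos, -sin], [sin, cos]])
                rotated[:, [i, j]] = rotated[:, [i, j]] @ plane
                rotation[:, [i, j]] = rotation[:, [i, j]] @ plane
                largest_angle = max(largest_angle, abs(angle))
        if largest_angle < tol:
            break  # no pair of columns moved during this sweep
    return rotated, rotation

# Toy example: sparse loadings hidden behind an arbitrary rotation.
sparse = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
theta = 0.3
mix = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
recovered, rotation = varimax(sparse @ mix)
```

In this toy case the rotation undoes the mixing, so each row of `recovered` again has a single non-zero entry, up to column order and sign.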
Paper: 'Experimental Evaluation of Algorithm-Assisted Human Decision-Making: Application to Pretrial Public Safety Assessment'
Authors: Imai et al.
Tuesday 8 February, 2022
To be published in JRSSA.
Despite an increasing reliance on fully-automated algorithmic decision-making in our lives, human beings still make consequential decisions. We develop a statistical methodology for experimentally evaluating the causal impacts of algorithmic recommendations on human decisions. We also show how to examine whether algorithmic recommendations improve the fairness of human decisions and derive the optimal decision rules under various settings. We apply the proposed methodology to preliminary data from the first-ever randomized controlled trial that evaluates the pretrial Public Safety Assessment (PSA) in the criminal justice system. A goal of the PSA is to help judges decide which arrested individuals should be released. We find that providing the PSA to the judge has little overall impact on the judge's decisions and subsequent arrestee behavior. However, we find that the PSA may help avoid unnecessarily harsh decisions for female arrestees while it encourages the judge to make stricter decisions for male arrestees who are deemed to be risky. For fairness, the PSA appears to increase the gender bias against males while having little effect on any existing racial differences in judges' decisions. Finally, we find that the PSA's recommendations might be unnecessarily severe unless the cost of a new crime is sufficiently high.
Download the preprint
Statistical aspects of the Covid-19 pandemic
A multi-paper meeting featuring three discussion papers
Took place at Manchester Central (at the RSS 2021 Conference) and online
8 September 2021, 5.15-7.15pm (BST)
Chair: RSS President Sylvia Richardson
Paper 1: 'Modeling the Covid-19 infection trajectory: a piecewise linear quantile regression'
Authors: F Jiang (Fudan University, Shanghai, China); Z Zhao, (University of Notre Dame, Indiana, USA); X Shao (University of Illinois Urbana-Champaign, USA)
We propose a piecewise linear quantile trend model to analyse the trajectory of the Covid-19 daily new cases (ie the infection curve) simultaneously across multiple quantiles. The model is intuitive, interpretable and naturally captures the phase transitions of the epidemic growth rate via change-points. Unlike the mean trend model and least squares estimation, our quantile-based approach is robust to outliers, captures heteroscedasticity (commonly exhibited by Covid-19 infection curves) and automatically delivers both point and interval forecasts with minimal assumptions. Building on a self-normalized (SN) test statistic, this paper proposes a novel segmentation algorithm for multiple change-point estimation. Theoretical guarantees such as segmentation consistency are established under mild and verifiable assumptions. Using the proposed method, we analyse the Covid-19 infection curves in 35 major countries and discover patterns with potentially relevant implications for the effectiveness of the pandemic responses by different countries. A simple change-adaptive two-stage forecasting scheme is further designed to generate short-term predictions of Covid-19 cumulative new cases and is shown to deliver accurate forecasts valuable to public health decision-making.
Download the preprint
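To illustrate why a quantile-based fit is robust to outliers, as the abstract above notes, the sketch below uses the standard check ("pinball") loss that underlies quantile regression. It fits only a constant rather than the paper's piecewise linear trend, and the helper names `pinball_loss` and `best_constant` are hypothetical.

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Check loss at quantile level tau for residuals (observed minus fitted)."""
    residuals = np.asarray(residuals, dtype=float)
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return np.sum(residuals * (tau - (residuals < 0)))

def best_constant(values, tau):
    """Return the observed value that minimises the check loss at level tau."""
    values = np.asarray(values, dtype=float)
    losses = [pinball_loss(values - q, tau) for q in values]
    return values[int(np.argmin(losses))]

# A single extreme observation moves the mean a long way
# but leaves the fitted median unchanged.
counts = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])
fitted_median = best_constant(counts, tau=0.5)
sample_mean = counts.mean()
```

Here `fitted_median` stays at the middle observation while `sample_mean` is pulled towards the outlier, which is the contrast with least squares estimation that the abstract draws.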
Paper 2: 'Quantifying the economic response to Covid-19 mitigations and death rates via forecasting Purchasing Managers' Indices using Generalised Network Autoregressive models with exogenous variables'
Authors: G Nason & J Wei, Imperial College London, UK
Knowledge of the current state of economies, how they respond to Covid-19 mitigations and indicators, and what the future might hold for them is important. We use recently-developed generalised network autoregressive (GNAR) models, using trade determined networks, to model and forecast the Purchasing Managers’ Indices for a number of countries. We use networks that link countries where the links themselves, or their weights, are determined by the degree of export trade between the countries. We extend these models to include node-specific time series exogenous variables (GNARX models), using this to incorporate Covid-19 mitigation stringency indices and Covid-19 death rates into our analysis. The highly parsimonious GNAR models considerably outperform vector autoregressive models in terms of mean-squared forecasting error and our GNARX models themselves outperform GNAR ones. Further mixed frequency modelling predicts the extent to which the UK economy will be affected by harsher, weaker or no interventions.
Download the preprint
Paper 3: 'Small Data, Big Time - A retrospect of the first weeks of Covid-19'
Author: Q Zhao, Statistical Laboratory, University of Cambridge, UK
This article reviews some early investigations and research studies in the first weeks of the coronavirus disease 2019 (Covid-19) pandemic from a statistician’s perspective. These investigations were based on very small datasets but were momentous in the initial global reactions to the pandemic. The article discusses the initial evidence of high infectiousness of Covid-19 and why that conclusion was not reached sooner. Further reanalyses of some published Covid-19 studies show that the epidemic growth was dramatically underestimated by compartmental models, and the lack of fit could have been clearly identified by simple data visualization. Finally, some lessons for statisticians are discussed.
Download the preprint
View our playlist of recent Discussion Meetings
Read past Discussion Papers
Assumption-lean inference for generalised linear model parameters
Tuesday 6 July 2021
The Discussion Paper was presented by the authors, Stijn Vansteelandt and Oliver Dukes, Ghent University, Belgium, and chaired by Guy Nason.
To be published in Series B; for more information go to the Wiley Online Library.
Download the preprint (PDF).