Discussion meetings

Discussion meetings are events where articles ('papers for reading') appearing in the Journal of the RSS are presented and discussed. The discussion and authors' replies are then published in the relevant Journal series. 

Read more about our discussion meetings, including guidelines for papers for discussion.

Contact Judith Shorten if you would like to make a written contribution to a discussion meeting or join our mailing list for an early invitation to future meetings.

Next Discussion meeting

Statistical Aspects of Climate Change

Wednesday, September 14, 2022, 5-7pm BST
Taking place at the RSS Conference, Aberdeen

A multi-paper meeting featuring two discussion papers and organised by the RSS Discussion Meetings Committee and RSS Environmental Statistics Section

Book here to attend online 
Note: Registration details for in-person attendance to follow soon.

Paper 1: ‘Assessing present and future risk of water damage using building attributes, meteorology and topography’
Download the preprint.

Authors: Claudio Heinrich-Mertsching*, Jens Christian Wahl*, Alba Ordonez*, Marita Stien#, John Elvsborg#, Ola Haug*, Thordis L. Thorarinsdottir*
* Norwegian Computing Center, Oslo, Norway
# Gjensidige Forsikring ASA, Oslo, Norway

Weather-related risk makes the insurance industry inevitably concerned with climate and climate change. Water damage to buildings from pluvial flooding is a key manifestation of this risk, giving rise to compensation for physical damage and business interruption. In this work, we establish a nationwide, building-specific risk score for water damage associated with pluvial flooding in Norway. We fit a generalized additive model that relates the number of water damages to a wide range of explanatory variables that can be categorized into building attributes, climatological variables and topographical characteristics. The model assigns a risk score to every location in Norway, based on local topography and climate, which is useful not only for insurance companies but also for city planning. Combining our model with an ensemble of climate projections allows us to project the (spatially varying) impacts of climate change on the risk of pluvial flooding towards the middle and end of the 21st century.
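The modelling pipeline described in the abstract can be sketched in miniature. The snippet below is not the authors' model: it fits a Poisson regression (the count-model backbone of a GAM) by iteratively reweighted least squares, with a crude hinge basis standing in for proper spline smooths, on synthetic data. The covariate names and effect sizes are invented for illustration.

```python
import numpy as np

def hinge_basis(x, knots=4):
    """Crude piecewise-linear basis standing in for GAM spline smooths."""
    qs = np.quantile(x, np.linspace(0.1, 0.9, knots))
    return np.maximum(0.0, x[:, None] - qs[None, :])

def fit_poisson(X, y, iters=50):
    """Poisson GLM (log link) fitted by iteratively reweighted least squares."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(iters):
        mu = np.exp(X1 @ beta)                 # expected damage counts
        z = X1 @ beta + (y - mu) / mu          # working response
        beta = np.linalg.solve(X1.T @ (mu[:, None] * X1), X1.T @ (mu * z))
    return X1, beta

rng = np.random.default_rng(0)
rain = rng.gamma(2.0, 10.0, 500)               # invented precipitation covariate
slope = rng.uniform(0.0, 30.0, 500)            # invented terrain-slope covariate
lam = np.exp(-2.0 + 0.03 * rain + 0.02 * slope)
damages = rng.poisson(lam)                     # simulated damage counts

X = np.column_stack([hinge_basis(rain), slope])
X1, beta = fit_poisson(X, damages)
risk_score = np.exp(X1 @ beta)                 # per-building expected damage rate
```

The fitted rate plays the role of the paper's building-specific risk score; the real model also includes building attributes and climate-projection inputs.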

Paper 2: 'The importance of context in extreme value analysis with application to extreme temperatures in the USA and Greenland'
Download the preprint.

Authors: Daniel Clarkson, Emma Eastoe and Amber Leeson, University of Lancaster, UK

Statistical extreme value models allow estimation of the frequency, magnitude and spatio-temporal extent of extreme temperature events in the presence of climate change. Unfortunately, the assumptions of many standard methods are not valid for complex environmental data sets, with a realistic statistical model requiring appropriate incorporation of scientific context. We examine two case studies in which the application of routine extreme value methods results in inappropriate models and inaccurate predictions. In the first scenario, record-breaking temperatures experienced in the US in the summer of 2021 are found to exceed the maximum feasible temperature predicted from a standard extreme value analysis of pre-2021 data. Incorporating random effects into the standard methods accounts for additional variability in the model parameters, reflecting shifts in unobserved climatic drivers and permitting greater accuracy in return period prediction. The second scenario examines ice surface temperatures in Greenland. The temperature distribution is found to have a poorly defined upper tail, with a spike in observations just below 0°C and an unexpectedly large number of measurements above this value. A Gaussian mixture model fit to the full range of measurements is found to improve fit and predictive abilities in the upper tail when compared to traditional extreme value methods.
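As a toy illustration of the Greenland scenario, the sketch below fits a two-component Gaussian mixture by expectation-maximisation to synthetic temperatures with a cold bulk plus a spike just below 0°C. The data, component count and starting values are assumptions, not the paper's.

```python
import numpy as np

def fit_gmm(x, iters=300):
    """Two-component 1-D Gaussian mixture fitted by EM."""
    mu = np.quantile(x, [0.25, 0.9])           # deterministic starting means
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities (the 1/sqrt(2*pi) constant cancels here)
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / sd
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means and standard deviations
        n = r.sum(axis=0)
        w = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n)
    return w, mu, sd

rng = np.random.default_rng(1)
# synthetic ice-surface temperatures: a cold bulk plus a spike just below 0 °C
temps = np.concatenate([rng.normal(-15.0, 5.0, 4000),
                        rng.normal(-0.5, 0.8, 1000)])
w, mu, sd = fit_gmm(temps)
```

One component absorbs the bulk and the other the near-zero spike, which is the feature a single extreme value tail model struggles to capture.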

The papers will be published in the Journal of the Royal Statistical Society: Series C (Applied Statistics).

Past Discussion Meetings

Statistical Aspects of the Covid-19 Pandemic 
2nd Multi-paper Discussion Meeting

Took place Thursday, June 16, 2022, 4-6pm BST

Paper 1: ‘Bayesian semi-mechanistic modelling of COVID-19: identifiability, sensitivity, and policy implications’
Download the preprint.

Authors: Samir Bhatt, Neil Ferguson, Seth Flaxman, Axel Gandy, Swapnil Mishra, James Scott

We propose a general Bayesian approach to modeling epidemics such as COVID-19. The approach grew out of specific analyses conducted during the pandemic, in particular an analysis concerning the effects of non-pharmaceutical interventions (NPIs) in reducing COVID-19 transmission in 11 European countries. The model parameterizes the time-varying reproduction number Rt through a regression framework in which covariates can be, for example, governmental interventions or changes in mobility patterns. This allows a joint fit across regions and partial pooling to share strength. This innovation was critical to our timely estimates of the impact of lockdown and other NPIs in the European epidemics, whose validity was borne out by the subsequent course of the epidemic. Our framework provides a fully generative model for latent infections and observations deriving from them, including deaths, cases, hospitalizations, ICU admissions and seroprevalence surveys. One issue surrounding our model's use during the COVID-19 pandemic is the confounded nature of NPIs and mobility. We use our framework to explore this issue. We have open-sourced an R package, epidemia, implementing our approach in Stan. Versions of the model are used by New York State, Tennessee and Scotland to estimate the current situation and make policy decisions.
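The generative structure described, a covariate-driven Rt feeding a renewal equation for latent infections with deaths as a delayed, thinned observation, can be sketched as follows. This is a minimal stand-in for the authors' epidemia package, with invented coefficients and a made-up generation-interval shape.

```python
import numpy as np

T = 100
lockdown = (np.arange(T) >= 40).astype(float)   # invented NPI covariate
beta0, beta1 = np.log(3.0), -1.2                # assumed regression coefficients
Rt = np.exp(beta0 + beta1 * lockdown)           # time-varying reproduction number

# discretised generation-interval weights (assumed gamma-like shape, mean ~5 days)
days = np.arange(1, 15)
g = days ** 1.5 * np.exp(-days / 2.0)
g /= g.sum()

# renewal equation for latent infections
infections = np.zeros(T)
infections[0] = 10.0
for t in range(1, T):
    recent = infections[max(0, t - 14):t][::-1]      # most recent day first
    infections[t] = Rt[t] * (recent * g[:len(recent)]).sum()

# deaths as a delayed, thinned observation of infections
ifr = 0.01                                           # assumed fatality ratio
expected_deaths = ifr * np.convolve(infections, g)[:T]
```

With these assumed values, infections grow while Rt is above one and decay after the covariate switches on, mirroring the lockdown-effect analyses the abstract describes.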

Paper 2: ‘A sequential Monte Carlo approach for estimation of time-varying reproduction numbers for Covid-19’
Download the preprint.

Authors: Geir Storvik, Alfonso Diz-Lois Palomares, Solveig Engebretsen, Gunnar Rø, Kenth Engo-Monsen, Anja Kristoffersen, Birgitte De Blasio, Arnoldo Frigessi

The Covid-19 pandemic has required most countries to implement complex sequences of non-pharmaceutical interventions, with the aim of controlling the transmission of the virus in the population. To be able to take rapid decisions, a detailed understanding of the current situation is necessary. Estimates of time-varying, instantaneous reproduction numbers represent a way to quantify the viral transmission in real time. They are often defined through a mathematical compartmental model of the epidemic, like a stochastic SEIR model, whose parameters must be estimated from multiple time series of epidemiological data. Because of very high dimensional parameter spaces (partly due to the stochasticity in the spread models) and incomplete and delayed data, inference is very challenging. We propose a state space formalisation of the model and a sequential Monte Carlo approach which allow us to estimate a daily-varying reproduction number for the Covid-19 epidemic in Norway with sufficient precision, on the basis of daily hospitalisation and positive test incidences. The method was in regular use in Norway during the pandemic and appears to be a powerful instrument for epidemic monitoring and management.
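A heavily simplified version of the sequential Monte Carlo idea: a bootstrap particle filter tracking a random-walk log-reproduction number against Poisson-distributed case counts. The paper's actual model is a stochastic SEIR with multiple data streams; everything below (dynamics, priors, data) is an illustrative assumption.

```python
import numpy as np

def particle_filter_rt(cases, n_part=2000, sigma=0.1, seed=0):
    """Bootstrap particle filter for a random-walk log R(t) with Poisson counts.
    A toy stand-in for the paper's SEIR-based state-space model."""
    rng = np.random.default_rng(seed)
    log_r = rng.normal(0.0, 0.5, n_part)              # initial particles
    estimates = []
    prev = max(int(cases[0]), 1)
    for y in cases[1:]:
        log_r = log_r + rng.normal(0.0, sigma, n_part)    # propagate state
        lam = np.exp(log_r) * prev                         # expected new cases
        logw = y * np.log(lam) - lam                       # Poisson log-likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()
        log_r = log_r[rng.choice(n_part, n_part, p=w)]     # resample
        estimates.append(np.exp(log_r).mean())
        prev = max(int(y), 1)
    return np.array(estimates)

rng = np.random.default_rng(1)
true_r = np.concatenate([np.full(30, 1.3), np.full(30, 0.8)])
cases = [50]
for r in true_r:                       # simulate an epidemic with a change in R
    cases.append(rng.poisson(r * max(cases[-1], 1)))
rt_hat = particle_filter_rt(np.array(cases, dtype=float))
```

The filter's daily posterior mean tracks the drop in the simulated reproduction number, which is the kind of real-time signal the abstract describes.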

Paper: ‘Vintage Factor Analysis with Varimax Performs Statistical Inference’
Authors: Karl Rohe and Muzhe Zeng, University of Wisconsin-Madison, USA

Took place on Wednesday, 11 May 2022 3-5pm (BST)

Psychologists developed Multiple Factor Analysis to decompose multivariate data into a small number of interpretable factors without any a priori knowledge about those factors [Thurstone, 1935]. In this form of factor analysis, the Varimax factor rotation redraws the axes through the multidimensional factors to make them sparse and thus make them more interpretable [Kaiser, 1958].

Charles Spearman and many others objected to factor rotations because the factors seem to be rotationally invariant [Thurstone, 1947, Anderson and Rubin, 1956]. These objections are still reported in all contemporary multivariate statistics textbooks. However, this vintage form of factor analysis has survived and is widely popular because, empirically, the factor rotation often makes the factors easier to interpret. We argue that the rotation makes the factors easier to interpret because, in fact, the Varimax factor rotation performs statistical inference.

We show that Principal Components Analysis (PCA) with the Varimax axes provides a unified spectral estimation strategy for a broad class of semi-parametric factor models, including the Stochastic Blockmodel and a natural variation of Latent Dirichlet Allocation (ie 'topic modeling'). In addition, we show that Thurstone’s widely employed sparsity diagnostics implicitly assess a key leptokurtic condition that makes the axes statistically identifiable in these models. Taken together, this shows that the know-how of Vintage Factor Analysis performs statistical inference, reversing nearly a century of statistical thinking on the topic. We illustrate the use of these techniques on two large bibliometric examples (a citation network and a text corpus). With a sparse eigensolver, PCA with Varimax is both fast and stable. Combined with Thurstone’s straightforward diagnostics, this vintage approach is suitable for a wide array of modern applications.

Download the preprint
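The PCA-plus-Varimax pipeline is easy to sketch. Below, the classic Varimax fixed-point iteration (Kaiser, 1958) is applied to the leading principal-component loadings of synthetic data generated from sparse, leptokurtic factors; the block structure of the loading matrix is an assumption made purely for illustration.

```python
import numpy as np

def varimax(Phi, iters=100, tol=1e-8):
    """Varimax rotation of a loadings matrix via the standard SVD iteration."""
    p, k = Phi.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(iters):
        L = Phi @ R
        u, s, vh = np.linalg.svd(
            Phi.T @ (L ** 3 - L @ np.diag(np.diag(L.T @ L)) / p))
        R = u @ vh
        if s.sum() < d * (1 + tol):
            break
        d = s.sum()
    return Phi @ R

rng = np.random.default_rng(0)
# sparse, leptokurtic factors -- the condition the paper argues makes the
# axes statistically identifiable
Z = rng.normal(size=(1000, 2)) * (rng.random((1000, 2)) < 0.2)
A = np.array([[2.0, 0], [1.5, 0], [1.0, 0],      # block-1 variables
              [0, 1.8], [0, 1.3], [0, 0.9]])     # block-2 variables
X = Z @ A.T + 0.1 * rng.normal(size=(1000, 6))

Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_loadings = Vt[:2].T * S[:2] / np.sqrt(len(X))  # 6 x 2 loadings
rotated = varimax(pca_loadings)
```

After rotation, each column's large loadings concentrate on one block of variables, which is the sparsity that makes the factors interpretable.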

The paper will be published in the Journal of the Royal Statistical Society, Series B.

Paper: 'Experimental Evaluation of Algorithm-Assisted Human Decision-Making: Application to Pretrial Public Safety Assessment'
Authors: Imai et al.
Tuesday, 8 February 2022
To be published in JRSSA.

Despite an increasing reliance on fully-automated algorithmic decision-making in our lives, human beings still make consequential decisions. We develop a statistical methodology for experimentally evaluating the causal impacts of algorithmic recommendations on human decisions. We also show how to examine whether algorithmic recommendations improve the fairness of human decisions and derive the optimal decision rules under various settings. We apply the proposed methodology to preliminary data from the first-ever randomized controlled trial that evaluates the pretrial Public Safety Assessment (PSA) in the criminal justice system. A goal of the PSA is to help judges decide which arrested individuals should be released. We find that providing the PSA to the judge has little overall impact on the judge's decisions and subsequent arrestee behavior. However, we find that the PSA may help avoid unnecessarily harsh decisions for female arrestees while it encourages the judge to make stricter decisions for male arrestees who are deemed to be risky. For fairness, the PSA appears to increase the gender bias against males while having little effect on any existing racial differences in judges' decisions. Finally, we find that the PSA's recommendations might be unnecessarily severe unless the cost of a new crime is sufficiently high.
Download the preprint
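The experimental comparison at the heart of the design can be sketched as a difference in means between randomised arms, here on entirely synthetic data whose effect directions merely mimic the findings described in the abstract; the decision model and effect sizes are made up, and the paper's actual methodology is considerably richer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000
female = rng.random(n) < 0.2
treated = rng.random(n) < 0.5              # judge shown the PSA (randomised)

# hypothetical decision model: made-up effect sizes whose signs mimic the
# abstract's findings (softer for women, harsher for men)
p_detain = 0.4 + treated * np.where(female, -0.10, 0.05)
detained = rng.random(n) < p_detain

def ate(mask):
    """Difference-in-means estimate of the PSA's causal effect on detention."""
    return detained[mask & treated].mean() - detained[mask & ~treated].mean()

ate_female, ate_male = ate(female), ate(~female)
```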

Statistical aspects of the Covid-19 pandemic
A multi-paper meeting featuring three discussion papers
Took place at Manchester Central (at the RSS 2021 Conference) and online
8 September 2021, 5.15-7.15pm (BST)

Chair: RSS President Sylvia Richardson

Paper 1:  'Modeling the Covid-19 infection trajectory: a piecewise linear quantile regression'
Authors:  F Jiang (Fudan University, Shanghai, China); Z Zhao, (University of Notre Dame, Indiana, USA); X Shao (University of Illinois Urbana-Champaign, USA)

We propose a piecewise linear quantile trend model to analyse the trajectory of the Covid-19 daily new cases (ie the infection curve) simultaneously across multiple quantiles. The model is intuitive, interpretable and naturally captures the phase transitions of the epidemic growth rate via change-points. Unlike the mean trend model and least squares estimation, our quantile-based approach is robust to outliers, captures heteroscedasticity (commonly exhibited by Covid-19 infection curves) and automatically delivers both point and interval forecasts with minimal assumptions. Building on a self-normalized (SN) test statistic, this paper proposes a novel segmentation algorithm for multiple change-point estimation. Theoretical guarantees such as segmentation consistency are established under mild and verifiable assumptions. Using the proposed method, we analyse the Covid-19 infection curves in 35 major countries and discover patterns with potentially relevant implications for the effectiveness of the pandemic responses by different countries. A simple change-adaptive two-stage forecasting scheme is further designed to generate short-term predictions of Covid-19 cumulative new cases and is shown to deliver accurate forecasts valuable to public health decision-making.
Download the preprint
Supplementary material
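The piecewise linear quantile trend can be illustrated with a toy fit: a grid search over a single change-point combined with an iteratively reweighted least-squares approximation to the pinball (check) loss. The paper's SN-based segmentation handles multiple change-points with theoretical guarantees; this sketch, on synthetic data, shows only the basic objective.

```python
import numpy as np

def pinball(res, tau):
    """Check (pinball) loss at quantile level tau."""
    return np.where(res >= 0, tau * res, (tau - 1) * res).sum()

def quantile_fit(Xd, y, tau, iters=50, eps=1e-6):
    """Approximate linear quantile regression via iteratively reweighted LS."""
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]
    for _ in range(iters):
        res = y - Xd @ beta
        w = np.where(res >= 0, tau, 1 - tau) / (np.abs(res) + eps)
        beta = np.linalg.solve(Xd.T @ (w[:, None] * Xd), Xd.T @ (w * y))
    return beta

def fit_piecewise(t, y, tau=0.5):
    """Grid search over one change-point; fit each candidate, keep the best."""
    best_loss, best_cp, best_beta = np.inf, None, None
    for cp in t[10:-10]:
        Xd = np.column_stack([np.ones_like(t), t, np.maximum(0.0, t - cp)])
        beta = quantile_fit(Xd, y, tau)
        loss = pinball(y - Xd @ beta, tau)
        if loss < best_loss:
            best_loss, best_cp, best_beta = loss, cp, beta
    return best_cp, best_beta

rng = np.random.default_rng(0)
t = np.arange(80.0)
# synthetic infection-curve trend whose growth rate drops at day 40
y = np.where(t < 40, 2.0 * t, 80.0 + 0.5 * (t - 40)) + rng.normal(0, 3.0, 80)
cp, beta = fit_piecewise(t, y)
```

The recovered change-point marks the phase transition in the growth rate, and the two segment slopes give the quantile trend before and after it.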
Paper 2:  'Quantifying the economic response to Covid-19 mitigations and death rates via forecasting Purchasing Managers' Indices using Generalised Network Autoregressive models with exogenous variables' 
Authors: G Nason & J Wei, Imperial College London, UK

Knowledge of the current state of economies, how they respond to Covid-19 mitigations and indicators, and what the future might hold for them is important. We use recently-developed generalised network autoregressive (GNAR) models, using trade determined networks, to model and forecast the Purchasing Managers’ Indices for a number of countries. We use networks that link countries where the links themselves, or their weights, are determined by the degree of export trade between the countries. We extend these models to include node-specific time series exogenous variables (GNARX models), using this to incorporate Covid-19 mitigation stringency indices and Covid-19 death rates into our analysis. The highly parsimonious GNAR models considerably outperform vector autoregressive models in terms of mean-squared forecasting error and our GNARX models themselves outperform GNAR ones. Further mixed frequency modelling predicts the extent to which the UK economy will be affected by harsher, weaker or no interventions.
Download the preprint
Supplementary material
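The core GNAR regression, each node's series regressed on its own lag and a network-weighted average of its neighbours' lags, can be sketched with least squares on simulated data. The network, weights and coefficients below are all invented, and the real GNARX models additionally carry exogenous covariates.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, T = 5, 300
# invented trade network: random weights, rows normalised to sum to one
W = rng.random((n_nodes, n_nodes)) * (rng.random((n_nodes, n_nodes)) < 0.5)
np.fill_diagonal(W, 0.0)
W = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)

alpha_true, beta_true = 0.5, 0.3            # own-lag and neighbour-lag effects
X = np.zeros((T, n_nodes))
for t in range(1, T):
    X[t] = (alpha_true * X[t - 1] + beta_true * W @ X[t - 1]
            + rng.normal(0.0, 0.1, n_nodes))

# stacked least-squares fit of the two GNAR coefficients
y = X[1:].ravel()
own = X[:-1].ravel()                        # each node's own lag
nbr = (X[:-1] @ W.T).ravel()                # network-weighted neighbour lag
alpha_hat, beta_hat = np.linalg.lstsq(
    np.column_stack([own, nbr]), y, rcond=None)[0]
```

Sharing the two coefficients across all nodes is what makes the model so parsimonious relative to a full vector autoregression.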
Paper 3: 'Small Data, Big Time - A retrospect of the first weeks of Covid-19'
Author: Q Zhao, Statistical Laboratory, University of Cambridge, UK

This article reviews some early investigations and research studies in the first weeks of the coronavirus disease 2019 (Covid-19) pandemic from a statistician’s perspective. These investigations were based on very small datasets but were momentous in the initial global reactions to the pandemic. The article discusses the initial evidence of high infectiousness of Covid-19 and why that conclusion was not reached sooner. Further reanalyses of some published Covid-19 studies show that the epidemic growth was dramatically underestimated by compartmental models, and the lack of fit could have been clearly identified by simple data visualization. Finally, some lessons for statisticians are discussed.
Download the preprint

View our playlist of recent Discussion Meetings
Read past Discussion Papers