26 April 2016
A scalable bootstrap for massive data (RSS Series B, Volume 76, Issue 4, 2014) Download slides (PowerPoint), watch video (YouTube)
Author: Michael I. Jordan is the Pehong Chen Distinguished Professor in the Department of Electrical Engineering and Computer Science and the Department of Statistics at the University of California, Berkeley. He received his Masters in Mathematics from Arizona State University, and earned his PhD in Cognitive Science in 1985 from the University of California, San Diego. He was a professor at MIT from 1988 to 1998. His research interests bridge the computational, statistical, cognitive and biological sciences, and have focused in recent years on Bayesian nonparametric analysis, probabilistic graphical models, spectral methods, kernel machines and applications to problems in distributed computing systems, natural language processing, signal processing and statistical genetics. Prof. Jordan is a member of the National Academy of Sciences, a member of the National Academy of Engineering and a member of the American Academy of Arts and Sciences. He is a Fellow of the American Association for the Advancement of Science. He has been named a Neyman Lecturer and a Medallion Lecturer by the Institute of Mathematical Statistics. He received the IJCAI Research Excellence Award in 2016, the David E. Rumelhart Prize in 2015 and the ACM/AAAI Allen Newell Award in 2009. He is a Fellow of the AAAI, ACM, ASA, CSS, IEEE, IMS, ISBA and SIAM.
Co-authors: Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar
Chair: Richard Samworth, Cambridge University
The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large data sets—which are increasingly prevalent—the calculation of bootstrap-based quantities can be prohibitively demanding computationally. Although variants such as subsampling and the m out of n bootstrap can be used in principle to reduce the cost of bootstrap computations, these methods are generally not robust to specification of tuning parameters (such as the number of subsampled data points), and they often require knowledge of the estimator's convergence rate, in contrast with the bootstrap. As an alternative, we introduce the ‘bag of little bootstraps’ (BLB), which is a new procedure which incorporates features of both the bootstrap and subsampling to yield a robust, computationally efficient means of assessing the quality of estimators. The BLB is well suited to modern parallel and distributed computing architectures and furthermore retains the generic applicability and statistical efficiency of the bootstrap. We demonstrate the BLB's favourable statistical performance via a theoretical analysis elucidating the procedure's properties, as well as a simulation study comparing the BLB with the bootstrap, the m out of n bootstrap and subsampling. In addition, we present results from a large-scale distributed implementation of the BLB demonstrating its computational superiority on massive data, a method for adaptively selecting the BLB's tuning parameters, an empirical study applying the BLB to several real data sets and an extension of the BLB to time series data.
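As a rough illustration of how the BLB combines subsampling with bootstrap resampling, the following minimal sketch estimates a standard error. It is not the authors' implementation: the function names, the default values of s, r and gamma, and the choice of the weighted mean as the estimator are all assumptions made for this example.

```python
import random
import statistics
from collections import Counter


def weighted_mean(values, weights):
    """Example estimator: the mean of `values` with integer resample weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def blb_standard_error(data, estimator, s=5, r=30, gamma=0.7, seed=0):
    """Sketch of a bag-of-little-bootstraps standard error (illustrative only).

    data      : list of n observations
    estimator : function (values, weights) -> float
    s         : number of small subsets (assumed default)
    r         : resamples per subset (assumed default)
    gamma     : subset size is b = n ** gamma (assumed default)
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))
    per_subset = []
    for _ in range(s):
        # Subsampling step: b distinct points drawn without replacement.
        subset = rng.sample(data, b)
        estimates = []
        for _ in range(r):
            # Bootstrap step: a resample of full size n, stored as counts
            # on the b subset points, so only b values are ever touched.
            counts = Counter(rng.choices(range(b), k=n))
            weights = [counts.get(i, 0) for i in range(b)]
            estimates.append(estimator(subset, weights))
        # Quality measure for this subset: spread of the r estimates.
        per_subset.append(statistics.stdev(estimates))
    # Average the per-subset quality measures.
    return statistics.mean(per_subset)
```

Because each subset is processed independently, the outer loop could in principle be distributed across workers, which is the property the abstract highlights for parallel architectures.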
10 November 2016
‘Is the Carli Index flawed? Assessing the case for the RPIJ’ published in JRSS-A in 2015 (Vol 178:2) Download slides (PDF)
Author: Peter Levell is an economics researcher at the Institute for Fiscal Studies (IFS) and a part-time PhD student at University College London. His work at the IFS has so far covered a diverse set of subjects: from measurement issues in expenditure surveys, to issues around taxation, the distributional impact of inflation and behavioural economics. To date his academic work has focused on trying to answer questions concerning the appropriate measurement of consumer price inflation and on achieving a better understanding of household decisions over consumption and labour supply.
Chair: Paul Smith, Associate Professor in Official Statistics, University of Southampton
Discussant: Andrew Baldwin, a former employee of Statistics Canada
This paper discusses the decision in March 2013 of the UK's Office for National Statistics to replace the controversial Carli index with the Jevons index in a new version of the Retail Prices Index - the RPIJ. In doing so we make three contributions to the way price indices should be selected for measures of consumer price inflation when quantity information is not available (i.e. at the 'elementary' level). Firstly, we introduce a new price bouncing test under the test approach for choosing index numbers. Secondly, we provide empirical evidence on the performance of the Carli and Jevons indices in different contexts under the statistical approach. Thirdly, applying something analogous to the principle of insufficient reason, we argue, contrary to received wisdom in the literature, that the economic approach can be used to choose indices at the elementary level, and moreover that it favours the use of the Jevons index. Overall, we conclude that there is a case against the Carli index and that the Jevons index is to be preferred.
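For readers unfamiliar with the two formulas, the sketch below computes both elementary indices using their standard definitions: the Carli index is the arithmetic mean of price relatives and the Jevons index is their geometric mean. The function names and the toy prices are assumptions for illustration, and the example is not the paper's formal price bouncing test; it only shows the kind of behaviour such a test is concerned with, where prices swap between items and then return to their starting values.

```python
import math


def carli(p0, p1):
    """Arithmetic mean of the price relatives p1[i] / p0[i]."""
    return sum(b / a for a, b in zip(p0, p1)) / len(p0)


def jevons(p0, p1):
    """Geometric mean of the price relatives p1[i] / p0[i]."""
    return math.exp(sum(math.log(b / a) for a, b in zip(p0, p1)) / len(p0))


def chained(index, price_series):
    """Chain a period-on-period index across consecutive price vectors."""
    level = 1.0
    for prev, cur in zip(price_series, price_series[1:]):
        level *= index(prev, cur)
    return level


# Hypothetical prices for two items that swap and then return to the start.
bouncing_prices = [[1.0, 2.0], [2.0, 1.0], [1.0, 2.0]]
```

With these toy prices, chaining the Carli index gives 1.25 × 1.25 = 1.5625 even though every price ends where it began, whereas the chained Jevons index returns to 1.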
6 July 2016
Estimands
Presented by Alan Philipps, Vice President Biostatistics, ICON Clinical Research, and Peter Diggle, President of the RSS - webcast (YouTube), slides1, slides2
Chair: Andrew Garrett
Estimands (what is to be estimated) is a new and hot topic in clinical development, in particular with the regulatory authorities who are responsible for approving new treatments. New regulatory guidance will be developed that will also address the subject of sensitivity analyses. This webinar will explore the current thinking on the topic and revisit some of the earlier work to consider how the topic has evolved over 10 years and how it might look in the future.
1 March 2016
Optimal design: getting more out of experiments with hard-to-change factors
Presented by Professor Peter Goos - download slides (PDF), watch video (YouTube)
Discussant: Maria Lanzerath
Chair: Steven Gilmour
We introduce a new method for generating optimal split-plot designs. These designs are optimal in the sense that they are efficient for estimating the fixed effects of the statistical model that is appropriate given the split-plot design structure. One advantage of the method is that it does not require the prior specification of a candidate set. This makes the production of split-plot designs computationally feasible in situations where the candidate set is too large to be tractable. The method allows for flexible choice of the sample size and supports inclusion of both continuous and categorical factors. The model can be any linear regression model and may include arbitrary polynomial terms in the continuous factors and interaction terms of any order. We demonstrate the usefulness of this flexibility with a 100-run polypropylene experiment involving 11 factors where we found a design that is substantially more efficient than designs that are produced by using other approaches.
Peter Goos is a professor at the Faculty of Bio-Science Engineering of the University of Leuven and at the Faculty of Applied Economics of the University of Antwerp, where he teaches various introductory and advanced courses on statistics and probability. His main research area is the statistical design and analysis of experiments. He has published books on 'The Optimal Design of Blocked and Split-Plot Experiments', 'Optimal Experimental Design: A Case-Study Approach', 'Statistics with JMP: Graphs, Descriptive Statistics and Probability' and 'Statistics with JMP: Hypothesis Tests, ANOVA and Regression'.
To date, Peter Goos has received the Shewell Award and the Lloyd S Nelson Award of the American Society for Quality, the Ziegel Award and the Statistics in Chemistry Award from the American Statistical Association, and the Young Statistician Award of the European Network for Business and Industrial Statistics.
21 October 2015
Frequentist accuracy of Bayesian estimates
Presented by Bradley Efron, Max H Stein Professor of Humanities and Sciences, Professor of Statistics at Stanford University - download slides
Discussant: Andrew Gelman of Columbia University.
Chair: Peter Diggle.
Bradley's paper 'Frequentist accuracy of Bayesian estimates' was recently published in the Royal Statistical Society's Series B Journal (Volume 77 (2015), part 3). The abstract is as follows:
In the absence of relevant prior experience, popular Bayesian estimation techniques usually begin with some form of 'uninformative' prior distribution intended to have minimal inferential influence. Bayes' rule will still produce nice-looking estimates and credible intervals, but these lack the logical force attached to experience-based priors and require further justification. This paper concerns the frequentist assessment of Bayes estimates. A simple formula is shown to give the frequentist standard deviation of a Bayesian point estimate. The same simulations required for the point estimate also produce the standard deviation. Exponential family models make the calculations particularly simple, and bring in a connection to the parametric bootstrap.
Bradley Efron is Max H Stein professor of humanities and sciences, professor of statistics at Stanford University, and professor of biostatistics with the Department of Health Research and Policy in the School of Medicine. He is a former president of both the American Statistical Association and the Institute of Mathematical Statistics. He is a recipient of the Ford Prize of the Mathematical Association of America and of both the Wilks Medal and the Noether Prize from the American Statistical Association (ASA). In 2003 Bradley was given the inaugural Rao Prize for outstanding research in statistics by Pennsylvania State University, and in 2005 he received the National Medal of Science. In 2014, Bradley was awarded the Guy Medal in Gold by the Royal Statistical Society for his 'seminal contributions to many areas of statistics'.
Webcast (YouTube)
21 May 2015
Adaptive Designs
Speakers: Dr Guosheng Yin from the Department of Statistics and Actuarial Science at the University of Hong Kong - download slides (PDF)
Professor Franz Koenig from the Center for Medical Statistics, Informatics and Intelligent Systems at the Medical University of Vienna - download slides (PDF)
Chair: Robert Cuffe of ViiV Healthcare.
Paper 1: Two-stage adaptive randomization for delayed response in clinical trials
Summary
Despite the widespread use of equal randomisation in clinical trials, response adaptive randomisation has attracted considerable attention. There is typically a prerun of equal randomisation before the implementation of response-adaptive randomisation, while it is often not clear how many subjects are needed in this prephase. Real-time response-adaptive randomisation often requires patients’ responses to be immediately available after the treatment, whereas clinical responses may take a relatively long period of time to exhibit. We propose a two-stage procedure to achieve a balance between power and response, which is equipped with a likelihood ratio test before skewing the allocation probability towards a better treatment. Furthermore, we develop a non-parametric fractional model and a parametric survival design with an optimal allocation scheme to tackle the common problem caused by delayed response. We evaluate the operating characteristics of the two-stage designs through simulation studies and show that the methods proposed satisfactorily resolve the arbitrary size of the equal randomisation phase and the delayed response problem in response-adaptive randomisation.
Dr Guosheng Yin, currently a professor at the University of Hong Kong, is also an adjunct professor at the University of Texas MD Anderson Cancer Center. He received a PhD in Biostatistics from the University of North Carolina and worked in the Department of Biostatistics at MD Anderson Cancer Center, before becoming associate professor in the Department of Statistics and Actuarial Science at the University of Hong Kong in 2009. Dr Yin was elected as a Fellow of the American Statistical Association in 2013, and a Member of the International Statistical Institute in 2012. He is Associate Editor for the Journal of the American Statistical Association, Bayesian Analysis, and Contemporary Clinical Trials. His main research areas include Bayesian adaptive design in clinical trials and survival analysis. He has published over 100 peer-reviewed papers and a book on ‘Clinical Trial Design: Bayesian and Frequentist Adaptive Methods’ in the John Wiley Series.
Paper 2: Adaptive graph-based multiple testing procedures
Summary
Multiple testing procedures defined by directed, weighted graphs have recently been proposed as an intuitive visual tool for constructing multiple testing strategies that reflect the often complex contextual relations between hypotheses in clinical trials. Many well-known sequentially rejective tests, such as (parallel) gatekeeping tests or hierarchical testing procedures, are special cases of the graph-based tests. We generalise these graph-based multiple testing procedures to adaptive trial designs with an interim analysis. These designs permit mid-trial design modifications based on unblinded interim data as well as external information, while providing strong family wise error rate control. Because the adaptive test does not require knowledge of the multivariate distribution of test statistics, it is applicable in a wide range of scenarios including trials with multiple treatment comparisons, endpoints or subgroups, or combinations thereof.
Franz Koenig is currently associate professor at the Section of Medical Statistics at the Medical University of Vienna, Austria. He is currently a member of the ethics committee of the Medical University of Vienna and also of the ethics committee of the community of Vienna. From 2008 to 2010 he was seconded to the European Medicines Agency (EMA) in London as statistical expert in the Unit Human Medicines Development and Evaluation, where he also held the Scientific Secretariat of the Biostatistics Working Party (BSWP). He was involved in the development of guidelines and assessment of statistical methods and clinical trial protocols. His main research interests are multiple testing, adaptive/flexible designs, interim analyses and data safety monitoring boards (DSMB). Professor Koenig has served as guest editor for special issues in Biometrical Journal and Statistics in Medicine. He is currently leading the work package ‘adaptive designs’ in the EU funded research project IDEAL and is deputy coordinator of an EU Horizon 2020 funded Marie Curie ITN network IDEAS on early drug development studies.
Webcast (YouTube)
24 February 2015
Doubly robust estimation of the local average treatment effect curve
Speaker: Elizabeth Ogburn, Assistant Professor of Biostatistics at Johns Hopkins University.
Chair: Dr Dylan Small, The Wharton School, University of Pennsylvania.
Co-authors: Andrea Rotnitzky and Jamie Robins
Summary
This paper is about estimation of the causal effect of a binary treatment on an outcome, conditional on covariates, from observational studies or natural experiments in which there may be unmeasured confounding of the treatment-outcome relationship but there is a binary instrument for treatment.
The paper describes a doubly robust, locally efficient estimator of the parameters indexing a model for the local average treatment effect, conditional on covariates V, when randomisation of the instrument is only true conditional on a high dimensional vector of covariates X, possibly bigger than V. (The local average treatment effect is the treatment effect among compliers, or those subjects whose treatment value would agree with their instrument value, whether that value were 0 or 1.) It also discusses the surprising result that inference is identical to inference for the parameters of a model for an additive treatment effect on the treated conditional on V that assumes no treatment-instrument interaction.
Elizabeth Ogburn (Betsy) has been an Assistant Professor of Biostatistics at Johns Hopkins University since August 2013. She received her PhD in biostatistics from Harvard University, where she worked with Andrea Rotnitzky and Jamie Robins, followed by a postdoctoral fellowship with Tyler VanderWeele at the Harvard School of Public Health Program on Causal Inference. She works on developing statistical methodology for causal inference, with a focus on novel data sources and structures; for example, using electronic medical records to inform individual-level healthcare decisions and using social network and other data that evince complex dependence among observations. The paper is published in the Journal of the Royal Statistical Society: Series B (Statistical Methodology) and is available online to subscribers of the journal.
Webcast (YouTube), Slides (PDF)