RSS Business & Industrial Section: From a Berkshire Farm to a Fishing Expedition via Random Points

Date: Wednesday 29 March 2023, 4.00PM
Location: RSS, London
Royal Statistical Society, 12 Errol Street, London EC1Y 8LX
Section Group Meeting



Being accused of going on a fishing expedition to find small p-values is surely one of the greatest fears for a professional/academic statistician. This talk reveals how, when faced with a need to conduct numerous, potentially speculative tests, the speaker found a way to undertake the analyses with appropriate scientific rigour and maintain his professional integrity.
 
The multiple testing problem is well known to statisticians, who are often trained to account for it by using techniques such as the Bonferroni correction, Tukey’s range test, control of the False Discovery Rate, or simply a significance level stricter than might usually be adopted. Another solution is to avoid the problem in the first place by reducing the number of hypothesis tests undertaken, through careful experimental design and by considering only pre-specified hypotheses supported by pre-existing external evidence or theory.
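
As a rough illustration of two of the corrections named above, the following Python sketch applies a Bonferroni correction and the Benjamini-Hochberg False Discovery Rate procedure to a small set of p-values. The p-values and the function names are made up for illustration and are not taken from the talk.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from six tests (illustrative only).
p_values = [0.001, 0.008, 0.012, 0.041, 0.20, 0.74]

print(bonferroni(p_values))          # rejects the first two tests
print(benjamini_hochberg(p_values))  # rejects the first three tests
```

With these numbers the Bonferroni threshold is 0.05 / 6, so only the two smallest p-values survive, whereas the less conservative Benjamini-Hochberg procedure also rejects the third.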
 
The talk follows the speaker’s journey from a classical training in (Applied) Statistics (including the planning of an experiment on a Berkshire farm) which engendered a fear of multiple testing, through designing ways to cope with the issue when testing the randomness of lottery machines and roulette wheels, to applying Data Science techniques to simultaneously assess many thousands of potential relationships in a fishing expedition more akin to working on a trawler than sitting by a lake.
 
The audience are encouraged to accompany the speaker on the journey, at each stage realising that what was previously thought improper and unjustifiable can, with care and vigilance, be allowed as part of the scientific method.
 
Our end point is a study (with co-authors Mayor, Steffert and Beggan) exploring the effect of space and terrestrial weather conditions upon EEG/ECG readings collected during transcutaneous electroacupuncture stimulation (TEAS) sessions. With the nature of any relationships being highly uncertain, we adopt, and seek to extend, HCTSA (Highly Comparative Time-Series Analysis) to simultaneously examine over 7,700 features of the data. We adapt train/test methods to maintain a level of control over the multiple testing problem and satisfy ourselves that we are scientists properly examining each fish caught in our trawler’s net and throwing back those which are not truly interesting.
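
The general idea of using a train/test split to keep multiple testing under control can be sketched as follows. This is a generic illustration, not the procedure used in the study: the simulated data, the Fisher z approximation for the correlation p-value, the function names and the choice of a Bonferroni correction at the confirmation stage are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(x, y):
    """Approximate two-sided p-value via the Fisher z transformation."""
    z = math.atanh(pearson_r(x, y)) * math.sqrt(len(x) - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def screen_then_confirm(features, outcome, alpha=0.05):
    """Screen features on the first half, confirm survivors on the second.

    `features` is a list of columns, each the same length as `outcome`.
    """
    n = len(outcome)
    train, test = slice(0, n // 2), slice(n // 2, n)
    # Stage 1: liberal, uncorrected screening on the training half.
    screened = [
        j for j, col in enumerate(features)
        if correlation_p_value(col[train], outcome[train]) <= alpha
    ]
    if not screened:
        return screened, []
    # Stage 2: re-test only the survivors on the held-out half, with a
    # Bonferroni correction over the (much smaller) number of survivors.
    threshold = alpha / len(screened)
    confirmed = [
        j for j in screened
        if correlation_p_value(features[j][test], outcome[test]) <= threshold
    ]
    return screened, confirmed


# Simulated data (assumption): 50 noise features, of which only feature 0
# is genuinely related to the outcome. Rows are independent here, so a
# split by position is harmless; real time series would need more care.
rng = random.Random(0)
n, n_features = 200, 50
features = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_features)]
outcome = [2 * features[0][i] + rng.gauss(0, 1) for i in range(n)]

screened, confirmed = screen_then_confirm(features, outcome)
print("screened:", screened)
print("confirmed:", confirmed)
```

The design choice is that the held-out half is never used to choose which features to test, so the correction at the second stage only has to account for the features that survived screening rather than for every feature examined.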
 
 
Neil Spencer (University of Hertfordshire)
(co-authors David Mayor, Tony Steffert, Ciarán D. Beggan)
 
 
Andrei Bejan for RSS Business & Industrial Section
 
Free to RSS Fellows
£10 for non-Fellows