The RSS Glasgow Local Group welcomes Hannah Rose Kirk (University of Oxford & Alan Turing Institute)
Large language models (LLMs) like ChatGPT and GPT-4 have captured public attention due to their impressive ability to perform a wide range of tasks, from composing poetry and writing code to assisting with meal planning and summarising text. But beneath these impressive capabilities lies the crucial role of statistics in modern language modelling. This talk delves into the role of statistics in language modelling and the potential ways that these models might transform the field of statistics in the future. It provides an overview of how LLMs work under the hood, via learning conditional probabilities from vast amounts of text data and fine-tuning these associations through reinforcement learning from human feedback. The talk also discusses the challenges of training and deploying language models, including concerns about bias, safety, and interpretability. Lastly, it examines the potential applications of language models in the field of statistics, such as using them to generate synthetic data for large-scale opinion-mining.
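The "conditional probabilities" mentioned above can be illustrated with a toy bigram model, a deliberate simplification for this abstract: real LLMs condition on long contexts using neural networks rather than raw counts, but the statistical idea of estimating the probability of the next word given its predecessors is the same.

```python
from collections import Counter, defaultdict

# A tiny corpus standing in for the vast amounts of text a real LLM is trained on.
corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each preceding word (a bigram model).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def conditional_prob(nxt, prev):
    """Estimate P(next word | previous word) from the bigram counts."""
    total = sum(follows[prev].values())
    return follows[prev][nxt] / total if total else 0.0

# In this corpus, "the" is followed by "cat" 2 out of 3 times.
print(conditional_prob("cat", "the"))  # -> 0.666...
```

Generating text then amounts to repeatedly sampling the next word from these conditional distributions; reinforcement learning from human feedback further adjusts which continuations the model prefers.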