Workshop on 'sustainable computing' - report

On 24 March 2021, the RSS Computational Statistics and Machine Learning Section held a meeting on Sustainable Computing. Three speakers spoke on the following topics:

  • Dr Chris Maynard (Met Office, University of Reading) - 'How to train your supercomputer'
  • Dr Partha Maji (Arm Machine Learning Research Lab) - 'Accelerating Your ML Algorithms: A Hardware-Centric View'
  • Jacob Tomlinson (NVIDIA) - 'GPU accelerated Python data science'

Significant advances in computer hardware and software, in combination with the ever-increasing availability of huge datasets, are enabling the deployment of machine learning algorithms and statistical models to solve a range of highly complex problems that underpin real-life applications. In this event, organised by the Computational Statistics and Machine Learning Section, the three speakers introduced some of the ways that the increase in computer power, including via cloud computing, can be efficiently harnessed when developing complex models.

Chris Maynard kicked off the event by presenting the paradox created by the new generation of supercomputers. The increase in computational power is no longer a result of better and faster computers, but instead access to more computers. This raises the challenge of rethinking our models and algorithms, rather than simply porting well-known approaches which were designed in a sequential world.

Something that is sometimes overlooked when developing computer-intensive machine learning algorithms is the actual computer architecture being used. As described by Partha Maji, running a machine learning model has a cost that can be reduced by carefully considering the type of computers being used to run the applications.

To conclude the event, Jacob Tomlinson gave a whistle-stop tour of Dask. This suite of tools uses Python-like syntax and makes advanced parallel computing accessible to statisticians and data scientists.

Some of the discussions throughout the session touched upon the difficulty in defining the actual cost of models in terms of power and computer resources. An ‘eCO2’ value measuring the carbon footprint of a computer model is an interesting but extremely difficult concept, as the cost of a model is not completely independent of the hardware or computer system on which the model is being run.
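
As a rough illustration of why such a value depends on the hardware, the sketch below uses a common back-of-the-envelope approach: energy drawn (in kWh) multiplied by the carbon intensity of the electricity supply. It was not presented at the workshop. The function name, the power draws, the runtimes, the carbon intensity and the data-centre overhead factor (`pue`) are all hypothetical values chosen for illustration only.

```python
def estimate_eco2_kg(avg_power_watts, runtime_hours,
                     carbon_intensity_kg_per_kwh, pue=1.0):
    """Rough eCO2 estimate in kg: energy drawn (kWh) times grid carbon intensity.

    pue is an assumed data-centre overhead multiplier (1.0 means no overhead).
    """
    energy_kwh = avg_power_watts / 1000.0 * runtime_hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# Hypothetical figures: the same job run on two different machines.
same_job = {
    "cpu_node": {"avg_power_watts": 200.0, "runtime_hours": 10.0},
    "gpu_node": {"avg_power_watts": 400.0, "runtime_hours": 2.0},
}

for name, run in same_job.items():
    # 0.25 kg CO2 per kWh is an assumed grid carbon intensity.
    kg = estimate_eco2_kg(carbon_intensity_kg_per_kwh=0.25, **run)
    print(f"{name}: {kg:.2f} kg eCO2")
```

Even in this simplified form, the same model yields different estimates on different machines, and the figure would change again with the location and efficiency of the data centre.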

The event provided lots of stimulating conversations on the interface between computer science and statistics, which the CSML Section will aim to follow up with future events.

Watch a video of the workshop on our YouTube channel.

Report written by Dr Camille Szmaragd Harrison, CStat (Methodologist, ONS)
