Bayesian approaches for clustering distributional features in neuroimaging experiments - report

On 21 March 2022, the RSS West Midlands local group had the privilege of hosting an online talk by Dr Michele Guindani from the Department of Statistics, University of California, Irvine.

The meeting was chaired by Tim Davis, and his former associate, Professor Meeker, gave a welcome speech. Kristian Romano then introduced Dr Guindani, who presented a Bayesian statistical approach to analysing brain imaging data.

Dr Guindani presented a general framework for the Bayesian analysis of neuroimaging experiments. In a research study, 'average maps' are typically used to describe brain functioning across individuals. Whenever dimension reduction methods or pre-processing steps are used, the basic assumption is that common patterns exist. Recently, there has been increasing recognition that brain functioning is heterogeneous and varies greatly within and between individuals. Researchers are able to identify differences in activation and in connectivity in response to different stimuli. These brain activity patterns may also be associated with a clinical outcome or with different behaviours.

Two frameworks were discussed for describing the heterogeneity of brain patterns:

a) Hierarchical mixture models for multi-subject inference
b) Capturing activity spikes in in-vivo experiments in animals

The goal of this research is to identify associations between clinical/behavioural outcomes and observable phenotypes. (In genetics, the phenotype is the set of observable characteristics or traits of an organism. The term covers the organism's morphology or physical form and structure, its developmental processes, its biochemical and physiological properties, its behaviour and the products of behaviour.)

Hierarchical mixtures are widely used in Bayesian nonparametrics to cluster together observations from different groups. One mixture is used to cluster subjects showing similar brain patterns, while a lower-level mixture captures individual-specific features.
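The two-level idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the model presented in the talk: subjects are assigned to clusters by a Chinese restaurant process (a standard construction for Dirichlet process clustering), and each subject-level cluster has its own truncated Gaussian mixture for the observations. The function name, the truncation to three components and all numerical settings are hypothetical.

```python
import numpy as np

def crp_assignments(n, alpha, rng):
    """Sample cluster labels for n items from a Chinese restaurant process."""
    labels = [0]
    counts = [1]
    for _ in range(1, n):
        # Join an existing cluster in proportion to its size,
        # or open a new one in proportion to alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return np.array(labels)

rng = np.random.default_rng(1)
n_subjects, n_obs, n_components = 12, 50, 3  # hypothetical sizes

# Upper level: cluster subjects that share the same distribution.
subject_cluster = crp_assignments(n_subjects, alpha=1.0, rng=rng)
n_clusters = subject_cluster.max() + 1

# Lower level: each subject cluster has its own (truncated) mixture.
cluster_means = [rng.normal(0.0, 5.0, size=n_components) for _ in range(n_clusters)]
cluster_weights = [rng.dirichlet(np.ones(n_components)) for _ in range(n_clusters)]

data = []
for c in subject_cluster:
    comp = rng.choice(n_components, size=n_obs, p=cluster_weights[c])
    data.append(rng.normal(cluster_means[c][comp], 1.0))
data = np.array(data)  # one row of observations per subject
```

Subjects with the same label in `subject_cluster` draw their observations from the same mixture, which is what allows information to be shared across them.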

Two cases were discussed to demonstrate the models:
a) Capturing activation patterns in multi-subject fMRI (functional magnetic resonance imaging). This approach uses time series to analyse the blood-oxygenation-level-dependent response at each voxel in the brain. A voxel is a 3-dimensional unit that embeds the signals in brain scans: as the MRI machine scans through each dimension of the brain millimetre by millimetre, voxels are formed to enclose the signals created by proton-magnet interactions. A regression model is used to detect brain activations, and the fitted model is then depicted visually to understand the activity patterns.

b) The Minnesota Twin Study, a population-based study, started in 1989, of same-sex, reared-together male and female twins and their parents. The goal of this study was to explore the nature of twin relationships and to determine the amount of shared information within and between twin pairs. For this, the researchers considered 8 seconds of resting-state EEG (electroencephalography) data from 365 MZ (monozygotic) and 192 DZ (dizygotic) twins. EEG is a non-invasive neuroimaging modality that captures electrical brain activity many times per second through recording electrodes placed at various locations on the head.

Spectral analysis was applied to the EEG data to transform each time series into its spectral representation. This decomposes a time series into a set of waves oscillating at different frequencies. The primary tool for spectral analysis is the spectral density function, also called the power spectrum. Under mild assumptions, the spectral density uniquely determines the properties of a stationary time series. The spectral density curve shows the proportion of the total variance of a time series that is explained by waveforms oscillating at each frequency.
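As a rough illustration of this decomposition, the sketch below computes a raw periodogram (a simple, unsmoothed estimate of the power spectrum) with NumPy. The signal is simulated, not EEG data from the study; the 128 Hz sampling rate, the 10 Hz oscillation and the function name are assumptions chosen for the example, with only the 8-second duration taken from the talk.

```python
import numpy as np

def periodogram(x, fs):
    """Raw periodogram of a real-valued series sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                        # remove the mean level
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)  # frequencies in Hz
    power = np.abs(np.fft.rfft(x)) ** 2 / (n * fs)
    return freqs, power

fs = 128.0                                  # hypothetical sampling rate (Hz)
t = np.arange(0.0, 8.0, 1.0 / fs)           # 8 seconds of signal
rng = np.random.default_rng(0)

# Simulated series: a 10 Hz oscillation plus white noise.
x = np.sin(2.0 * np.pi * 10.0 * t) + 0.5 * rng.standard_normal(t.size)

freqs, power = periodogram(x, fs)
peak_frequency = freqs[np.argmax(power)]    # frequency with the most power
variance_share = power / power.sum()        # approximate share of variance per frequency
```

Here `variance_share` plays the role of the normalised spectral density curve: most of its mass sits at the frequency of the simulated oscillation.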
               
It was observed that EEG spectra tend to be similar within twin pairs, and more so within MZ twins. The goal is to estimate individual spectral densities and to cluster similar spectra (twins) in order to improve inference on their features. The Whittle likelihood is used as an approximation of the true likelihood in spectral analysis.
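For readers unfamiliar with it, the Whittle approximation replaces the exact likelihood with a sum over Fourier frequencies of log f(w_k) + I(w_k) / f(w_k), where I is the periodogram and f is a candidate spectral density. The sketch below is a minimal version under an assumed normalisation (I = |DFT|^2 / n, so that white noise with variance s^2 has flat density s^2); the data are simulated white noise and the function names are hypothetical.

```python
import numpy as np

def periodogram_ordinates(x):
    """Periodogram I(w_k) = |DFT_k|^2 / n at the positive Fourier frequencies."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    k = np.arange(1, (n - 1) // 2 + 1)      # skip frequency 0 and Nyquist
    omega = 2.0 * np.pi * k / n
    ordinates = np.abs(np.fft.fft(x)[k]) ** 2 / n
    return omega, ordinates

def whittle_loglik(ordinates, density):
    """Whittle log-likelihood, up to an additive constant."""
    density = np.asarray(density, dtype=float)
    return -np.sum(np.log(density) + ordinates / density)

rng = np.random.default_rng(0)
x = rng.standard_normal(512)                # simulated white noise, not EEG data
omega, ordinates = periodogram_ordinates(x)

# Among flat candidate densities, the Whittle likelihood is maximised
# at the average periodogram value.
best_level = ordinates.mean()
ll_best = whittle_loglik(ordinates, np.full_like(omega, best_level))
ll_low = whittle_loglik(ordinates, np.full_like(omega, 0.5 * best_level))
ll_high = whittle_loglik(ordinates, np.full_like(omega, 2.0 * best_level))
```

In a Bayesian analysis this log-likelihood would be combined with a prior on the spectral density; that step is not shown here.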

One main motivation for the spectral analysis of EEG data from a twin design is to establish endophenotypes (neurobiological indicators that link psychiatric disorders to genetic risk factors). Heritability is the proportion of the variance of a specific trait in a population that is attributable to genetic factors rather than environmental factors; it is estimated using Falconer's formula.

More on Falconer's formula at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635617/#:~:text=Falconer's%20Formula%20is%20applied%20which,4%2C%208%2C%209%5D
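In its usual form, Falconer's formula estimates heritability as twice the difference between the trait correlation in MZ twin pairs and the trait correlation in DZ twin pairs. The short sketch below shows the arithmetic; the function name and the two correlation values are hypothetical and are not results from the study.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate: h^2 = 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)

# Hypothetical twin correlations, for illustration only.
r_mz = 0.8   # correlation of the trait within MZ pairs
r_dz = 0.5   # correlation of the trait within DZ pairs

h2 = falconer_heritability(r_mz, r_dz)   # roughly 0.6
```

With these made-up inputs, about 60% of the trait variance would be attributed to genetic factors.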

The analysis used data provided by 565 males and 551 females, all aged 17 years, and it was observed that some of the variation is driven by genetics.

The following remarks were made on the MTFS (Minnesota Twin Family Study) analysis:
a) Without any information on twin relationships, the nested Dirichlet process (DP) often detected the twin structure in the data, clustering MZ twins' Cz power spectra together in 26.4% of posterior samples, compared with 8.9% for DZ twins and only 4.6% for non-twins.
b) In EEG spectral density analysis, the frequency with the highest spectral power over the entire spectrum (the peak frequency) and the frequency with the highest power within the alpha band (the alpha peak frequency) are of interest.
c) Both peak frequency and alpha peak frequency showed high levels of heritability, though the alpha peak appeared to be more heritable.
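The two summary features mentioned in these remarks can be read off an estimated spectral density as follows. This is a sketch on a synthetic spectrum, not study data; the 8 to 12 Hz alpha band limits, the shape of the spectrum and the function name are assumptions made for illustration.

```python
import numpy as np

def peak_frequencies(freqs, power, alpha_band=(8.0, 12.0)):
    """Return the overall peak frequency and the peak within the alpha band."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    overall_peak = freqs[np.argmax(power)]
    in_band = (freqs >= alpha_band[0]) & (freqs <= alpha_band[1])
    alpha_peak = freqs[in_band][np.argmax(power[in_band])]
    return overall_peak, alpha_peak

# Synthetic spectrum: a decaying 1/f background plus a bump near 10 Hz.
freqs = np.linspace(0.5, 30.0, 60)           # grid in steps of 0.5 Hz
power = 1.0 / freqs + 0.5 * np.exp(-0.5 * (freqs - 10.0) ** 2)

overall_peak, alpha_peak = peak_frequencies(freqs, power)
```

In this example the overall peak falls at the lowest frequency because of the decaying background, while the alpha peak picks out the bump near 10 Hz, which shows why the two features can differ.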

The following are the takeaways:
a) The use of nested/hierarchical mixtures allows information to be shared across subjects, leading to borrowing of strength and improvements in estimation. In the application to resting-state EEG data, this results in smoothing and clustering of spectral densities.
b) The spectral densities are highly heritable: a large proportion of the variability in the spectral density can be attributed to genetic factors, i.e. the entire spectral density can serve as an endophenotype.
c) Once a trait has been established as heritable, it still remains to be shown that this genetic risk factor is related to the disease under study.

An experimental study on detecting spikes in noisy calcium imaging data using Bayesian nonparametric analysis was also presented to the audience.

The following are the summary and conclusions of the presentation:
a) The development of neuroimaging biomarkers for targeted interventions requires taking into account the complexity and heterogeneity of brain functioning
b) Statistical approaches play a crucial role
c) Close collaboration with neuroscientists is essential
d) Hierarchical Bayesian methods allow information to be borrowed elegantly across and within subjects
e) Many opportunities, still a lot of work to do.
 
Dr Guindani answered all the questions relating to the clustering, the spike model and calcium density. Tim Davis thanked Dr Guindani for a presentation that was regarded by all present as educational and informative.
