Details - MACS Project System

View Proposal

Proposer: Marta Vallejo
Title: Modelling Disease Progression in ALS Mouse Models Using Multimodal Data - Collaboration with the University of Zaragoza (Spain)
Goal: Work with real clinical data and help to better understand the mechanisms underlying ALS.
Description: Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that currently lacks effective predictive tools for stratifying patients or understanding early markers of disease severity. In this project, you will focus on applying machine learning and deep learning techniques to data obtained from transgenic SOD1G93A mice, a widely used preclinical model of ALS. The goal is to classify animals into fast vs slow disease progression categories based on multimodal data collected longitudinally. This classification will support the development of tools for early prognosis and guide translational work in human ALS patients.
Resources: The dataset comprises a comprehensive collection of preclinical information collected from SOD1G93A mice. Specifically:
• Ultrasound imaging and measurements: muscle dimensions from two anatomical locations, with upcoming experiments providing three longitudinal ultrasound measurements for each hindlimb, alongside body weight over time.
• Surgical imaging: images taken before and after muscle biopsies, as well as images of the extracted muscle tissue itself.
• Post-surgical recovery images: visual records of animal recovery 1–2 days after biopsy.
• Molecular data: gene expression profiles of selected biomarkers obtained from muscle biopsies.
• RNA metrics: quantity and quality of RNA extracted from each biopsy, measured at three key time points: early symptomatic, late symptomatic and endpoint stage.
These data form a comprehensive view of the disease's physical, molecular, and morphological progression in each subject.
Background: The project will involve the design of a classification pipeline to distinguish fast vs slow progressors using the data modalities above. You will begin with feature extraction techniques, including:
• From ultrasound images: use of radiomics (texture, intensity histograms, shape), traditional image processing (e.g., edge detection, HOG), and possibly convolutional neural networks (CNNs) pretrained on medical images (transfer learning).
• From molecular data: feature selection based on biomarker variance, correlation analysis, and unsupervised clustering to explore transcriptomic profiles.
• From clinical metadata (e.g., weight curves and RNA yield): time-series features such as rate of weight loss or variability in RNA quality.
These features will feed into classification models such as Random Forests, Support Vector Machines, and XGBoost, with the possibility of exploring early fusion strategies combining imaging and omics data. Model performance will be evaluated using accuracy, sensitivity, specificity, AUC, and MCC. Emphasis will be placed on model interpretability, using SHAP for feature attribution and Grad-CAM for CNN-based imaging analysis.
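As a rough illustration of two of the steps named above, the sketch below computes one time-series feature (rate of weight change, as a least-squares slope) and the listed evaluation metrics (accuracy, sensitivity, specificity, MCC, AUC) using only the Python standard library. The function names, the convention that label 1 means "fast progressor", and all numbers are assumptions made up for illustration; they are not taken from the project dataset or from any existing pipeline.

```python
from math import sqrt
from statistics import mean


def weight_slope(days, weights):
    """Least-squares slope of body weight against time (weight units per day)."""
    d_bar, w_bar = mean(days), mean(weights)
    num = sum((d - d_bar) * (w - w_bar) for d, w in zip(days, weights))
    den = sum((d - d_bar) ** 2 for d in days)
    return num / den


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and MCC (assumed: 1 = fast, 0 = slow)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example values, for illustration only.
days = [60, 80, 100, 120]            # age in days at each weighing
weights = [25.0, 24.0, 22.0, 19.0]   # body weight at each weighing
slope = weight_slope(days, weights)  # negative slope = weight loss

y_true = [1, 1, 1, 0, 0, 0]                # hypothetical ground-truth labels
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]    # hypothetical classifier scores
y_pred = [int(s >= 0.5) for s in scores]   # threshold at 0.5 (an assumption)
metrics = binary_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)

print(f"weight slope: {slope:.3f} per day")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In practice a library such as scikit-learn provides equivalent metric functions; the hand-written versions here are only meant to make the definitions explicit. With the small cohorts typical of preclinical studies, MCC and AUC are generally more informative than accuracy alone when the fast and slow groups are unbalanced.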
Difficulty Level: Variable
Ethical Approval: Full
Number Of Students: 3
Supervisor: Marta Vallejo
Keywords: machine learning, deep learning, healthcare
Degrees: Bachelor of Science in Computer Science
Master of Science in Artificial Intelligence
Master of Science in Computing (2 Years)
Master of Science in Data Science
Bachelor of Science in Computing Science
Bachelor of Science in Statistical Data Science
BSc Data Sciences
