View Proposal


Proposer
Marta Vallejo
Title
Machine Learning Analysis of Protein Aggregate Signatures in Motor Neuron Disease
Goal
To develop and evaluate machine learning models that use quantitative features from protein aggregate morphology and composition to classify Motor Neuron Disease (MND) cases, identify key biomarkers, and explore multi-modal data integration for improved diagnostic accuracy.
Description
This project will use tabular data derived from a recent study on nanoscopic protein aggregates in Motor Neuron Disease (MND) to investigate whether morphological and compositional features of TDP-43 assemblies can be used to classify disease states accurately and to identify key biomarkers. The dataset contains quantitative parameters extracted from microscopy (e.g., aggregate count, length, eccentricity, localisation density, fluorescence intensity) and from proteomics analyses (e.g., relative protein abundance ratios). These features are available in a structured tabular format, so the analysis does not require image processing. The student will apply a range of data science techniques: supervised classification (e.g., logistic regression, random forest, gradient boosting, neural networks); unsupervised analysis, combining dimensionality reduction (e.g., PCA, UMAP, t-SNE) with clustering of the resulting embeddings; and feature importance analysis (e.g., SHAP values) to identify the features that best distinguish MND subtypes from controls. An optional extension is to integrate the morphological and proteomics datasets to explore whether multi-modal modelling improves classification performance.
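As a rough illustration of the classify-then-rank-features workflow described above, the following minimal sketch uses only the Python standard library. Everything in it is an assumption made for illustration: the data are synthetic placeholders (not values from the study), the three feature names are borrowed from the examples in the description, the nearest-centroid classifier stands in for the models listed above, and permutation importance is used as a simpler substitute for SHAP values. In practice the student would more likely use an established machine learning library.

```python
import random
import statistics

# Feature names borrowed from the proposal's examples; the values are synthetic.
FEATURES = ["aggregate_count", "length", "eccentricity"]


def make_synthetic(n_per_class=100, seed=0):
    """Placeholder data: only 'length' differs between the two classes."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label, length_mean in (("control", 0.0), ("mnd", 3.0)):
        for _ in range(n_per_class):
            rows.append([rng.gauss(0, 1), rng.gauss(length_mean, 1), rng.gauss(0, 1)])
            labels.append(label)
    return rows, labels


def fit_centroids(rows, labels):
    """Nearest-centroid 'model': the per-class mean of every feature."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [statistics.fmean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(centroids, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, rows, labels)
    importances = {}
    for j, name in enumerate(FEATURES):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(centroids, shuffled, labels))
        importances[name] = statistics.fmean(drops)
    return importances


rows, labels = make_synthetic()

# Simple 70/30 hold-out split on shuffled indices.
indices = list(range(len(rows)))
random.Random(1).shuffle(indices)
cut = int(0.7 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

model = fit_centroids([rows[i] for i in train_idx], [labels[i] for i in train_idx])
test_rows = [rows[i] for i in test_idx]
test_labels = [labels[i] for i in test_idx]

test_accuracy = accuracy(model, test_rows, test_labels)
importances = permutation_importance(model, test_rows, test_labels)

print(f"held-out accuracy: {test_accuracy:.2f}")
for name, drop in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean accuracy drop {drop:.3f}")
```

Importance is measured on the held-out rows rather than the training rows, so a feature only scores highly if the model relies on it to generalise. With a real dataset of limited size, cross-validation would likely be preferable to a single hold-out split.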
Resources
Background
https://www.biorxiv.org/content/10.1101/2025.03.04.641150v1.full.pdf
Url
Difficulty Level
Variable
Ethical Approval
Full
Number Of Students
2
Supervisor
Marta Vallejo
Keywords
Degrees
Bachelor of Science in Computer Science
Master of Science in Artificial Intelligence
Master of Science in Computing (2 Years)
Master of Science in Data Science
Master of Science in Software Engineering
Bachelor of Science in Computing Science
Bachelor of Science in Statistical Data Science
BSc Data Sciences