All Proposals

588 proposals found

Reproducibility of Human Data Collection for Machine Learning Models
machine learning nlp data collection evaluation
Gavin Abercrombie
Edinburgh
Much of current Machine Learning (ML) is based on human data (e.g. supervised learning, Reinforcement Learning from Human Feedback (RLHF)), and the performance of even very Large Language Models (LLMs) is highly reliant on the quality of the collected data. Following reproducibility crises in other fields, such as Psychology, researchers have begun to examine the extent to which data collection for ML applications such as Natural Language Processing (NLP) is reproducible, finding that it is often difficult or impossible to reproduce the studies [1]. While Machine Learning datasets have typically assumed a single correct label for each data instance, recent work has sought to reflect a range of perspectives [2]. This is because people do not universally agree, so it is unrealistic to assume that there is a single agreed-upon label for each data sample. This project will explore the reproducibility and validity of such data collections, i.e. how well a second data collection can reproduce the results of the original one, and the extent to which humans disagree on the labelling tasks. The project can be tailored to suit the student's interests, e.g. it can involve data collection and analysis, and also implementation and reassessment of state-of-the-art ML models' performance.
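As an illustrative sketch of the kind of analysis involved, agreement between an original and a repeated data collection can be quantified with a chance-corrected measure such as Cohen's kappa. The labels below are made up for illustration; a real project would use collected annotations.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # Expected agreement if both sides labelled at random according
    # to their own label distributions.
    expected = sum(ca[l] / n * cb[l] / n for l in set(a) | set(b))
    return (observed - expected) / (1 - expected)

# Hypothetical labels from an original and a repeated collection.
run1 = ["offensive", "offensive", "neutral", "neutral"]
run2 = ["offensive", "offensive", "neutral", "offensive"]
print(cohens_kappa(run1, run2))  # 0.5
```

A kappa near 1 suggests the second collection reproduced the first; values near 0 indicate agreement no better than chance.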

Safety and Bias in Dialogue Systems and Large Language Models
safety bias nlp dialogue systems large language models
Gavin Abercrombie
Edinburgh
Automated dialogue systems are becoming ubiquitous in our homes, on our smart devices, and on the internet, and, with recent advances in Large Language Models, the quality of chatbots and voice assistants is rapidly improving — to the extent that they can sometimes be mistaken for humans [1]. But as the quality of end-to-end dialogue systems improves, so does their capacity to learn unsafe behaviours from data on which they are trained [2]. They also run the risk of responding inappropriately to unsafe or toxic user input [3]. Potentially undesirable behaviours include offensive outputs (abuse, hate speech etc.), as well as the failure to detect and mitigate such language in user inputs, and generation of inappropriate responses in safety-critical situations, such as offering medical or legal advice, or responding to/generating sensitive content. In this project, we will examine one or more of the following aspects of safety for conversational AI: - Evaluation and/or detection of unsafe user input and/or system outputs (abusive language, safety critical topics, sensitive content etc.). - Evaluation of societal biases in the outputs of conversational systems and Large Language Models (LLMs) - Mitigation of unsafe content e.g. evaluation of system response strategies, generation of appropriate responses.

Learning with Disagreement
nlp nlg classification
Gavin Abercrombie
Edinburgh
Traditionally, NLP datasets have consisted of data annotated with single 'gold standard' labels. But in reality, people often disagree on their interpretation of the meanings behind natural language expressions. This project will tackle the problem of how to model datasets that include multiple labels for each item and harness the disagreement information for better classification and generation performance across a range of NLP tasks. There will also be the opportunity to enter the 2024 shared task on Learning with Disagreements (LeWiDi).
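One common way to learn with disagreement is to keep a soft label distribution per item rather than collapsing votes to a single gold label. A minimal sketch, with made-up annotator votes:

```python
from collections import Counter

def soft_labels(annotations, labels):
    """Turn per-item annotator votes into probability distributions."""
    dists = []
    for votes in annotations:
        counts = Counter(votes)
        dists.append([counts[l] / len(votes) for l in labels])
    return dists

# Two items, three hypothetical annotators each; the disagreement on
# the second item is preserved rather than discarded.
annotations = [["sarcastic", "sarcastic", "sarcastic"],
               ["sarcastic", "literal", "literal"]]
dists = soft_labels(annotations, ["sarcastic", "literal"])
print(dists[0])  # [1.0, 0.0]
print(dists[1])  # roughly [0.33, 0.67]
```

A classifier can then be trained against these distributions (e.g. with a cross-entropy loss), which is the setting evaluated in the LeWiDi shared tasks.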

Decision tree-based classification of red-light violation among drivers
Eric Nimako Aidoo
Dubai
Traffic lights are road transportation systems designed to regulate conflicts among road users at intersections. In the absence of traffic lights at intersections, road users are at risk of road crashes. Although traffic lights serve as a medium for regulating conflicts among road users at intersections, various studies have shown that not all road users comply with red signals. Thus, classification of red-light violations among drivers is important to support training and policies in road transportation safety. In this study, a decision tree-based model will be developed to classify red-light violations and the associated risk factors among drivers.
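A minimal sketch of the intended modelling step, using scikit-learn's decision tree on entirely made-up driver features (age, approach speed, trips per week); a real study would use observed or survey data.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical driver records: [age, approach_speed_kmh, trips_per_week]
X = [[22, 65, 30], [45, 40, 5], [30, 70, 25], [55, 35, 4],
     [28, 60, 20], [60, 30, 3]]
y = [1, 0, 1, 0, 1, 0]  # 1 = ran the red light, 0 = complied

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The learned tree doubles as an interpretable list of risk factors.
print(export_text(clf, feature_names=["age", "speed", "trips"]))
print(clf.predict([[25, 68, 22]]))  # classed as a violator on this toy data
```

The `export_text` view is what makes tree-based models attractive here: the split thresholds read directly as candidate risk factors.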

Machine Learning Applications in Disease Modelling
machine learning models noncommunicable diseases incidence model comparison
Eric Nimako Aidoo
Dubai
Noncommunicable diseases such as cancer, diabetes, respiratory diseases, and cardiovascular diseases remain major challenges in global health management. Despite advancements in healthcare delivery and accessibility, it is estimated that 41 million people die annually due to noncommunicable diseases. Among several factors, climate change is increasingly recognized as a significant factor influencing the incidence and mortality of noncommunicable diseases. For instance, existing studies have shown that extreme temperatures impact cardiovascular diseases, while air pollution affects respiratory diseases. The need for effective and efficient machine learning ensembles to uncover complex patterns and relationships in such data has become important.

Machine Learning Applications in Road Traffic Crashes
machine learning models road traffic crashes model comparison
Eric Nimako Aidoo
Dubai
Road traffic crashes are one of the leading causes of death across the globe. They have a significant impact on the victims, their families, society and the economy as a whole. Despite advancements in road infrastructure and automobiles, it is estimated that approximately 1.19 million people die each year worldwide due to road traffic crashes. Several factors influence road traffic crashes and the associated mortality. The need for effective and efficient machine learning ensembles to uncover complex patterns and relationships in such data has become important.

Machine Learning-based Street Sign Detection for Road Safety Enhancement
Eric Nimako Aidoo
Dubai

Cryptanalysis of Encryption Schemes
Fadi Alhaddadin
Dubai

Cloud architecture and health informatics
Fadi Alhaddadin
Dubai

Data Privacy
Fadi Alhaddadin
Dubai

Green Computing
Fadi Alhaddadin
Dubai

Graphical character for conversational interaction
face animation python
Matthew Aylett
Edinburgh
In human dialog, participants are able to interrupt each other at any point. Once a dialog participant has been interrupted they may cede the floor (let another person speak) or alter what they are saying to show they are actively listening. In this project a graphical character will be implemented using Unity and Python that allows some limited behaviour (e.g. head nods, eyebrow raises), integrated with a speech synthesis system (CereProc) allowing basic lip syncing and output speech. An API will be designed and built over HTTP that will allow a client Python program to stream content (e.g. text for the character to speak and instructions for behaviour), poll the character (establish within a narrow time window what the character has said or done from the streamed content), and interrupt it (allowing the system to stop the character gracefully, or to stop the character and begin producing new content).

A Graphical User Interface for Constructing and Editing Vocal Puppetry
speech technology hci gui programming
Matthew Aylett
Edinburgh
Neural TTS (text-to-speech) systems can allow very fine specification of speech output, allowing the intonation and speech rate from a source speaker to be used to guide a synthetic voice and replicate the source speaker's delivery. Much spoken output generated by TTS does not need to be in real time, for example producing audio for a speech or media performance. CereProc Ltd has developed a prototype system for taking source speech and a transcription and creating XML markup to realise the same delivery in a synthetic engine. In this project a graphical user interface (GUI) will be produced to allow interactive tuning of this output. The GUI will be evaluated against a set of users and the output compared to baseline TTS.

Masterclass in cloud speech technology
Matthew Aylett
Edinburgh
There are many cloud resources now available for speech synthesis and speech recognition. In a simple form they can be used to add speech functionality to web pages, or speech functionality to standalone applications and digital games. Many of the systems can be either run on a request-by-request basis or streamed (to process before completion). Speech recognition can be configured to work more slowly for better recognition, or faster; it can be given pronunciations of expected and unusual words such as proper names. Speech synthesis offers many different voices and often fine control of speech style and intonation. The objective of this project is to produce teaching materials to support the use of these systems and a tutorial to give hands-on experience of using them.

Robot Voice Separation using LSTMs
social robotics speech technology dialog systems conversational user interfaces
Matthew Aylett
Edinburgh
Current robots (Furhat, Haru etc.) typically stop listening to their microphone when speaking to avoid interfering with speech recognition results. This means robots can't be interrupted and can't produce back channels (the yeah/okay that shows you are listening). In this project you will extend a small audio corpus built to test robot voice separation, set up the Kaldi ASR system, and design and train a neural net solution for altering ASR parameters to remove the effect of the robot voice and maintain recognition accuracy.

Human Robot Interaction for Health
Lynne Baillie
Edinburgh
The honours project concept should focus on how a social robot could assist someone, through Human Robot Interaction with a health or assistive living need. Assistance and training will be given regarding the robot selected for the project. The student should be a reasonable programmer.

Social Robots Helping Children
human robot interaction
Lynne Baillie
Edinburgh
This project will explore the ways in which a social robot could assist children who have severe sight issues. The project will have three phases. In the first phase, the student will explore the literature on how children learn about the interactions they can have with a social robot. In the second phase, the student will implement and enable a set of interactions on at least one social robot. Lastly, the student will deploy the social robot with the implemented interactions in order to evaluate the usefulness of the interactions with a set of students with severe sight issues, and will assess which interactions were the most successful.

Quantifying cross-sectoral discrepancies between ethnic groups to support analysis into their use of key services (health, housing and energy).
Lynne Baillie
Edinburgh

Location Sharing Mobile App
Phil Bartie
Edinburgh
Location sharing apps already exist; even Google Maps can do it. But these need to be set up in advance, so that the person sharing the information knows who can receive the updates, and updates are sporadic and limited by location, phone battery life, and minutes since the update. This project looks to extend this functionality to enable location sharing with anyone, and with groups. For example, a user may register a temporary online name and be allocated a group code with the service. They then share this group code with their friends via WhatsApp. Now everyone in that group can see the locations of all others in that group. The user can leave or rejoin the group as they wish, and can also be a member of several groups. Updates will include the current speed of movement, direction, location, timestamp, and the username. Further functionality would include allowing a parent/guardian to automatically request a location from a child via the app. The server side could be developed with a web-based mobile client for testing purposes, but ideally the student taking on this project would use Android Studio to develop a native app which uses the background daemon capabilities of Android OS. Requirements: Spatial database (e.g. PostgreSQL + PostGIS), mobile (or web) client [native Android dev. preferred], server-side dev (e.g. Python, NodeJS, PHP), web-based mapping (e.g. leaflet.js)
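The group-code model described above can be sketched as a tiny in-memory service; all class and field names here are hypothetical, and a real server would persist this in a spatial database.

```python
import secrets
import time

class LocationService:
    """Toy in-memory sketch of the group-code sharing model."""
    def __init__(self):
        self.groups = {}                      # group code -> {user: update}

    def create_group(self):
        code = secrets.token_hex(3)           # short code to share via WhatsApp
        self.groups[code] = {}
        return code

    def update(self, code, user, lat, lon, speed, heading):
        # Each update carries location, speed, direction, and a timestamp.
        self.groups[code][user] = {
            "lat": lat, "lon": lon, "speed": speed,
            "heading": heading, "ts": time.time(),
        }

    def members(self, code):
        """Everyone in the group sees everyone else's latest update."""
        return self.groups[code]

    def leave(self, code, user):
        self.groups[code].pop(user, None)

svc = LocationService()
code = svc.create_group()
svc.update(code, "alice", 55.95, -3.19, 4.2, 270.0)
svc.update(code, "bob", 55.94, -3.20, 0.0, 0.0)
print(sorted(svc.members(code)))  # ['alice', 'bob']
svc.leave(code, "bob")
print(sorted(svc.members(code)))  # ['alice']
```

The same interface maps naturally onto HTTP endpoints for the native Android client.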

Natural Language Navigation based on images
Phil Bartie
Edinburgh
For this project computer vision will be used to return a list of objects in view (e.g. from a Streetview-type image), from which a user can input questions that an LLM (large language model) will be able to answer, based on the knowledge gained from the image, for the task of navigation. The purpose is to create a service that, when provided an image (or set of images), can answer questions related to navigation (i.e. direction, position). For example: user) which way do I head now? system) towards the building with a red door user) the one next to the bins? system) yes, head towards the bins, and the red door, the cafe will be just after that on your left - it has green windows. Topics: Scene graph, visual grounded models, LLMs, navigation tasks, UI, UX, Geographic Information Systems, Location based services

Open Source Intelligence - Image Comparison using Computer Vision
Phil Bartie
Edinburgh
A challenge in Open Source Intelligence is in finding similar but not identical images which can be used to validate an event occurred. This project will use tools such as YOLO to identify objects in scenes and OpenCV to compare images. The goal is to produce a data processing pipeline that can check a list of images and find related objects to validate they are taken at the same event, but from different angles/times. A VirtualBox VM with the necessary tools installed can be made available if required. Coding in Python preferred.
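Before moving to YOLO/OpenCV, the comparison stage of such a pipeline can be illustrated with a perceptual average hash; the sketch below runs on hypothetical greyscale grids standing in for downscaled images, so no image libraries are needed.

```python
def average_hash(pixels):
    """Bit list: 1 where a pixel is above the image's mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(h1, h2):
    """Number of differing bits; a small distance suggests similar images."""
    return sum(a != b for a, b in zip(h1, h2))

# Hypothetical 4x4 downscaled greyscale images: a and b show the same
# scene under slightly different lighting, c is a different view.
img_a = [[200, 200, 10, 10]] * 4
img_b = [[190, 210, 20, 5]] * 4
img_c = [[10, 10, 200, 200]] * 4

print(hamming(average_hash(img_a), average_hash(img_b)))  # 0
print(hamming(average_hash(img_a), average_hash(img_c)))  # 16
```

Hash distance is robust to small brightness changes but not to viewpoint changes, which is exactly why the project then needs feature matching (e.g. OpenCV keypoints) to validate images taken from different angles.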

Simulated City: Spatial Agent Based Modelling
Phil Bartie
Edinburgh
Build a web-based spatial agent-based modelling system for urban transport (Simulated City). This should work with real-world coordinates and a map base, implementing multiple agents including the environment (map, traffic lights, vehicles of different kinds including bicycles). The agent-based simulation should be able to run on a web platform; this might be in a Jupyter Notebook or as a web app. It should allow for user input of factors such as traffic light sequence timings and different road speeds. Preferred technologies: Python, PostgreSQL/PostGIS, Leaflet.js / OpenLayers.js

Streetview to Text
Phil Bartie
Edinburgh
Using image-to-text models, this project will turn a set of streetview images (e.g. 4000 images for Edinburgh at known locations) into text descriptions using a machine learning model (e.g. Places365, OpenAI CLIP). These descriptions will then be summarised at various scales to discover similar regions.

Text to Data Tools
Phil Bartie
Edinburgh
Websites can contain valuable data in the form of text (e.g. web text, PDFs, docx files). NLP (including LLMs) allows extraction of the data within the text, parsing it to find specified items (e.g. dates, locations, names of people, tools, costs, and other values). This project will specifically focus on producing an application that performs this job for specified target tasks (e.g. locations where natural capital tools are used). The output would be a set of web URLs, documents (e.g. PDFs), and the corresponding values stored in a database (e.g. locations, costs, tool names, organisations using the tools). A UI to search the data could also be developed, including highlighting spatial locations where tools are used and linking to the relevant documents.

Indoor Positioning using Computer Vision
Phil Bartie
Edinburgh
GNSS (e.g. GPS) is a wonderful solution for positioning a device outside. However, indoor tracking is much more difficult given the obstruction of the roof and building materials between the receiver and satellites. There are a number of options for indoor positioning, including WiFi fingerprinting, setting up Bluetooth beacons, and IMU (e.g. foot tracking). There are also solutions based on VPS (visual positioning systems) which use the camera and computer vision to locate the user from a library of previously captured images. This project will develop a simple mobile client which sends image updates from the front camera of a phone held looking upwards to a server. The server will carry out comparisons looking for correspondences within a library of previously captured images. The result would be the ability to give a location back to the user which locates them on an indoor map. The project will involve computer vision (e.g. OpenCV, scikit-image) and development of a web app that takes a camera feed from a mobile device. For this project we particularly want to focus on how well tracking a ceiling will work for positioning around indoor spaces (e.g. university buildings).

Detection and Visualisation of Structural Pattern in Java Programs by Using Graph Isomorphism
structural design pattern java graph isomorphism
Thomas Basuki
Malaysia
Design patterns have been used for many years in object-oriented software development. Their use has since been extended to represent many other patterns, such as interaction patterns and security patterns. Design patterns are often described in diagrams such as UML diagrams. In general, design patterns can be divided into structural and behavioural design patterns. In our previous project, we developed software that detects the occurrence of a structural design pattern in a set of Java programs. The software accepts a design pattern represented in a UML class diagram, which is stored in an XML file. The algorithm we chose extracts graph structures from both the class diagram and the program and compares them based on cosine similarity. This technique is very efficient in detecting the pattern but not very accurate. In this project, we propose to use graph isomorphism to detect the occurrence of a structural design pattern in Java programs. We also plan to extend the software with a capability to visualise the patterns found in the programs.
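The subgraph isomorphism step could be handled by an off-the-shelf matcher; below is a minimal sketch using networkx's VF2 implementation, with a made-up class-relation graph (the node names and the "Facade-like" pattern are illustrative only).

```python
import networkx as nx
from networkx.algorithms.isomorphism import DiGraphMatcher

# Hypothetical program structure: edges are "depends on" relations
# extracted from Java classes.
program = nx.DiGraph([("Client", "Facade"), ("Facade", "SubsystemA"),
                      ("Facade", "SubsystemB"), ("Client", "Logger")])

# Pattern to detect: one class delegating to two others.
pattern = nx.DiGraph([("F", "S1"), ("F", "S2")])

matcher = DiGraphMatcher(program, pattern)
print(matcher.subgraph_is_isomorphic())  # True: the shape occurs
for mapping in matcher.subgraph_isomorphisms_iter():
    print(mapping)  # program-node -> pattern-node, e.g. Facade -> F
    break
```

Unlike cosine similarity over extracted features, the matcher returns exact node correspondences, which is also what the planned visualisation would highlight in the source programs.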

Detection of Behavioral Design Pattern in Java Programs
behavioral design pattern uml java
Thomas Basuki
Malaysia
Design patterns have been used for many years in object-oriented software development. Their use has since been extended to represent many other patterns, such as interaction patterns and security patterns. Design patterns are often described in diagrams such as UML diagrams. In general, design patterns can be divided into structural and behavioural design patterns. In our previous project, we developed software that detects the occurrence of a structural design pattern in a set of Java programs. The software accepts a design pattern represented in a UML class diagram, which is stored in an XML file. In this project, we propose to extend the detection of design patterns to include behavioural patterns. We may need to consider other UML diagrams for this purpose.

Studying Anagrams in Bahasa Melayu for Crossword Puzzle Generation
bahasa melayu anagram cross-word puzzle
Thomas Basuki
Malaysia
Crossword puzzles used to be a very popular word game during the era of printed newspapers. They are a very good and fun way to improve our knowledge of a language. Nowadays, many people are still interested in playing this game in apps or on websites. Software has replaced humans in generating these puzzles. There are some variants of crossword puzzles; one of them is the anagram-based crossword puzzle. An anagram is a word formed by rearranging the characters in another word. In some high-resource languages, such as English, anagrams have been studied extensively. However, that does not seem to be the case for Bahasa Melayu. Therefore, the first challenge in generating anagram-based crossword puzzles in Bahasa Melayu is building a database of anagrams in Bahasa Melayu. The goal of this project is to collect anagrams in Bahasa Melayu and build software that can generate anagram-based crossword puzzles.
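The anagram database could be built by indexing words under their sorted letters, so that all words sharing a letter multiset group together. A minimal sketch with a handful of Malay words (a real project would run this over a full lexicon):

```python
from collections import defaultdict

def build_anagram_index(words):
    """Group words that share the same multiset of letters."""
    index = defaultdict(list)
    for w in words:
        index["".join(sorted(w.lower()))].append(w)
    # Keep only genuine anagram groups (two or more words).
    return {k: v for k, v in index.items() if len(v) > 1}

# Tiny illustrative word list.
words = ["kita", "ikat", "makan", "batu", "tuba", "buat"]
print(build_anagram_index(words))
# {'aikt': ['kita', 'ikat'], 'abtu': ['batu', 'tuba', 'buat']}
```

The generator can then pick a grid slot and any word whose sorted-letter key yields at least one other word to use as the clue.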

EEG based authentication
eeg signals machine learning security
Hadj Batatia
Dubai
Use public data sets to train machine learning models to identify individuals from their EEG signals.
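As a minimal sketch of the identification step, a nearest-centroid classifier over per-subject EEG feature vectors is shown below; the band-power feature values and subject names are entirely made up, and a real project would extract features from a public dataset.

```python
import math

def centroid(vectors):
    """Mean feature vector of a subject's enrolment recordings."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def identify(sample, profiles):
    """Return the subject whose enrolled centroid is nearest to sample."""
    return min(profiles, key=lambda s: math.dist(sample, profiles[s]))

# Hypothetical band-power features [alpha, beta, theta] per recording.
enrolled = {
    "subject_1": centroid([[8.0, 3.1, 5.0], [7.8, 3.0, 5.2]]),
    "subject_2": centroid([[4.1, 6.0, 2.9], [4.0, 6.2, 3.1]]),
}
print(identify([7.9, 3.2, 5.1], enrolled))  # subject_1
```

In practice the project would swap this for a trained model (e.g. an SVM or neural network) and evaluate identification accuracy across sessions.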

Graph-based code generation using LLMs
llm code generation graphs
Hadj Batatia
Dubai
The objective of the project is to represent the intended program as a graph. A code generator would then analyse the graph and generate, debug, and run the code.

Generative AI for embedded systems
real time ai llm embedded systems
Hadj Batatia
Dubai

Physics-informed neural networks for image denoising
deep learning physics informed networks inverse problems
Hadj Batatia
Dubai

Deep learning for vision-based drone localisation
aerial images satellite images drone localisation gps-less navigation deep learning
Hadj Batatia
Dubai

Self-supervised deep learning for image reconstruction
self-supervised learning inverse problems high dynamic range imaging compressed sensing
Hadj Batatia
Dubai
The developed self-supervised deep learning models can be applied to the reconstruction of HDR images from compressed sensing.

Aerial hyperspectral images for nutrients estimation in crops
hyperspectral images machine learning
Hadj Batatia
Dubai

A recommendation system for learning materials
Diana Bental
Edinburgh
Vocabularies and ontologies (such as Dublin Core and schema.org) exist to identify and classify learning resources so that teachers and learners can search for suitable resources. The student will investigate existing vocabularies and ontologies such as schema.org, and use these to design and implement a prototype system. https://www.dublincore.org/resources/metadata-basics/ https://schema.org/LearningResource https://dl.acm.org/doi/abs/10.1145/3041021.3054160

To evaluate and extend a "Bechdel test" for computer games
Diana Bental
Edinburgh
The Bechdel test for film suggests that a film should a) have at least 2 women characters, b) who talk to each other, c) about something other than men. This simple test of female roles in film has produced a lot of discussion and some change in film practice. But what about computer games? Is there an equivalent test? How would it apply? Is there any progress towards meeting such a test? A previous student project has developed and evaluated a potential test. This project will replicate that test with new subjects, and develop and extend the test.

Tailored apps and tools
user model tailoring personalisation
Diana Bental
Edinburgh
Projects in user modeling, personalisation and tailored information. Build an app that provides information in some area of interest - this could be a sport activity, craft, tourist activity. The app can provide information and recommendations. Information and recommendations can be tailored according to relevant information about the user - location, time of day the app is being used, and user preferences and skills.

Digital Personalities and Virtual Influencers
Diana Bental
Edinburgh
We are increasingly engaging with "digital personalities" online and offline. This project will investigate the use of "digital personalities" and current awareness of and attitudes towards Virtual Influencers in society. For this project you will survey recent literature about trends in Virtual Influencers and how they are described in popular media (news, film etc.), build on studies from the literature on related technologies such as AI and bots, and conduct interviews with members of the public to understand their attitudes and concerns. As part of this project you may also implement prototypes or small example systems which demonstrate aspects of an influencer system for use in your study.

Usable privacy choices
Diana Bental
Edinburgh
You will need to: research different privacy mechanisms; research different metrics and mechanisms to evaluate how successful they are for users; identify suitable privacy interfaces and metrics; conduct a usability evaluation of the selected interfaces; design and prototype a privacy interface that is intended to improve on existing interfaces in some way; and evaluate the improvement. Metrics for Success: Why and How to Evaluate Privacy Choice Usability. Lorrie Faith Cranor, Hana Habib. Communications of the ACM, Volume 66, Issue 3, March 2023, pp 35-37. https://doi.org/10.1145/3581764

A "what and when to study" app
Diana Bental
Edinburgh
Students often find it difficult to schedule their work and decide when to study and what needs to be done. Apps exist which provide this kind of advice and reminders for other personal goals, such as exercise and wellbeing. An app could give personalised suggestions for study material, e.g. "watch the following recordings", "try the following exercises", "study these handouts before the lecture", and warnings of upcoming deadlines. Devise an app in which students can set learning goals and get personalised reminders relevant to their courses. The app should take into account external factors such as exam dates and coursework deadlines, weightings and content. The app design will need to consider and respect privacy. Digital twins and artificial intelligence: as pillars of personalized learning models. Furini, Gaggi, Mirri, Montangero, Pelle, Poggi, Prandi. Communications of the ACM, Volume 65, Issue 4, April 2022, pp 98-104. https://doi.org/10.1145/3478281

Heterogeneous Ensemble Topic Models
topic modelling large language models ensemble methods
Pierre Le Bras
Edinburgh
Ensemble approaches to modelling topics from large text corpora have been shown to improve topic coherence and model stability compared to traditional approaches. However, most attempts so far have concentrated on the evaluation of homogeneous ensembles. The emergence of new topic modelling systems (e.g., BERTopic, Top2Vec) offers the possibility to explore heterogeneous ensembles, mixing these new approaches with classical ones (e.g., NMF, LDA). This project would involve the integration of multiple topic modelling techniques in one system, followed by the computation of ensemble topic models (topical alignment and/or weighted term co-association), and finally the evaluation of several metrics of interest.

Comparison of Topic Model Visualisations
topic model visualisation user study
Pierre Le Bras
Edinburgh
The data generated by topic models is a rich multi-dimensional set of probabilities, which naturally poses challenges when presented to non-expert users. Over the years, several systems have been built to allow this data exploration by visualising the output of topic models, for example: LDAVis, BERTopic, Topic Mapping (see URL). This project proposes to establish the affordances and hindrances of these systems empirically, by designing a user-based study to quantitatively and qualitatively measure key metrics. The project would preferably involve the creation of interactive interfaces (one per method) and the iterative development of a rigorous user study, followed by the evaluation of results.

Open project in interactive data visualisations
data visualisations interactive systems
Pierre Le Bras
Edinburgh
This project is open to students with an interest or idea involving data visualisations, please contact me to discuss your ideas BEFORE selecting this project. Example of projects include: - building bespoke interactive visualisations for complex datasets - user evaluation of interactive data visualisation systems - educational/explainable software involving intuitive data visualisation - spatial data visualisation systems

Empirical Study of Grid Mapping Visualisation
data visualisation grid mapping similarity data clusters
Pierre Le Bras
Edinburgh
Visualising the similarity of items is a multi-dimensional problem that requires the implementation of mapping strategies, inevitably introducing errors and making compromises. This project proposes to establish a list of viable strategies, implement them, and evaluate their performance against selected metrics.
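As one example of such a strategy, a greedy grid mapping assigns each 2-D point (e.g. from a dimensionality reduction) to its nearest free cell, trading positional accuracy for a clutter-free layout. This is a simplified sketch under the assumption that coordinates are already scaled to [0, 1].

```python
def grid_assign(points, size):
    """Greedily place each 2-D point in the nearest free cell of a
    size x size grid; later points may be displaced by earlier ones."""
    free = {(r, c) for r in range(size) for c in range(size)}
    placement = {}
    for i, (x, y) in enumerate(points):
        target = (min(int(y * size), size - 1), min(int(x * size), size - 1))
        cell = min(free, key=lambda rc: (rc[0] - target[0]) ** 2
                                        + (rc[1] - target[1]) ** 2)
        placement[i] = cell
        free.remove(cell)
    return placement

# Two similar points compete for the same corner cell; the greedy
# strategy spills the second into a neighbouring cell.
pts = [(0.1, 0.1), (0.12, 0.11), (0.9, 0.9)]
print(grid_assign(pts, 2))
```

The displacement each point suffers (distance from its target cell) is exactly the kind of error metric the proposed evaluation would compare across strategies.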

Building a Corpus Analysis Pipeline in Rust
text analysis data mining rust data pipeline
Pierre Le Bras
Edinburgh
While Python has established itself as the de facto programming language for data processing and analysis, other languages and their features have seemingly been left out. The project aims to investigate how the programming language Rust performs when building a configurable text corpus analysis pipeline.

Knowledge Graphs for Medical Images
Albert Burger
Edinburgh
Medical images contain a lot of information that needs to be captured for further analysis and linked to other associated medical data. In this project you will develop a Knowledge Graph, using the graph-based database system Neo4j and RDF, to model aspects of the human gut. Based on this you will develop a set of semantic query solutions to answer typical questions on the data set provided.

Performance Evaluation of Graph Database Indexing Techniques
Albert Burger
Edinburgh
Indexing is a common technique in databases to improve performance. As part of this project you will study the indexing features provided by the graph-based database system Neo4j and design and run a set of experiments to evaluate the performance improvements that can be achieved.

Visualisation-based Graph-DB Comparison
Albert Burger
Edinburgh
As part of an ongoing research project we model human gut anatomy in the form of a graph database, Neo4j. Other research groups have created similar databases and it is important to be able to systematically identify and understand the variations between two different data sets over the same domain. Similarly, an existing database will evolve over time due to addition/deletion/modification of data elements. Here it is important to be able to systematically identify the differences between different versions of the same database. To achieve this, you will use visualisation techniques that can be applied in Neo4j, for example their Bloom tools, to make it easy for an end user to identify and understand variations in the underlying databases.

Integration of Heterogeneous Biomedical Databases using a Virtual Knowledge Graph System
Albert Burger
Edinburgh
As part of an ongoing research project there is a need to integrate data, currently stored in a number of different data repositories, into a single coherent system for analysis purposes. Specifically, in this project you will explore the integration facilities of the virtual graph system Ontop to answer queries across multiple data sets.

Using Generative AI to Create a Knowledge Graph for a Biomedical Domain
generative ai knowledge graphs neo4j
Albert Burger
Edinburgh
Generative AI is now widely used to tackle a variety of computational problems. In this project you will explore Neo4j’s LLM Knowledge Graph Builder to create a knowledge graph in the Neo4j Graph Database. The new knowledge graph will then be interrogated using LLM chats as well as the Cypher query language. You will have to familiarise yourself with the Neo4j graph database and generative AI tools. The use case for the project will be based on a current biomedical research project, though no previous biomedical knowledge is required.

Large Language Models (LLM) for Topic Modelling
machine learning large language models topic modelling
Mike Chantler
Edinburgh
To evaluate the use of LLMs to create topic models of huge document collections and to compare these against conventional tools such as LDA techniques. Particular concerns are topic stability and topic quality. See URL for an example of a graphical topic model of a repository of more than 200,000 documents.

Interactive Visualisation of Optimisation Systems
optimisation visualisation
Mike Chantler
Edinburgh
There are many types of optimisation, which is the process of finding the inputs to a system that maximise (or minimise) its output. Example approaches include genetic algorithms, simulated annealing and hill-climbing (see URL for examples). This project would seek to develop a web app that would illustrate the performance of such systems of varying complexity.
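The simplest of the approaches listed, hill-climbing, fits in a dozen lines, and recording each accepted step gives exactly the trace such a web app would animate. The objective function and step-size schedule below are arbitrary illustrative choices.

```python
def hill_climb(f, x, step=1.0, iters=100):
    """Greedy 1-D maximisation; records the path for visualisation."""
    path = [(x, f(x))]
    for _ in range(iters):
        best = max([x - step, x + step], key=f)
        if f(best) > f(x):
            x = best
            path.append((x, f(x)))
        else:
            step /= 2            # stuck: shrink the step and retry
            if step < 1e-6:
                break
    return path

# Toy objective with a single maximum at x = 3.
path = hill_climb(lambda x: -(x - 3) ** 2, x=0.0)
print(round(path[-1][0], 3))  # 3.0
```

Plotting `path` over the objective's curve shows the climb converging; swapping in a multi-modal objective immediately demonstrates the local-optimum weakness that simulated annealing and genetic algorithms address.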

Machine Learning for Mechatronic Optimisation of Laser Systems
machine learning image processing optimisation laser systems
Mike Chantler
Edinburgh
Most laser systems are currently tuned by hand. This project would produce a program that could use the intensity distribution in images of laser beams to tune their output. The program would be written in Python and involve both neural network methods and image processing.

Visualising research on Global Warming
topic modelling global warming large language models
Mike Chantler
Edinburgh
This web app would provide an at-a-glance visualisation of global warming research. It would use topic modelling and Heriot-Watt's topic modelling toolkit. It would provide a similar thematic analysis to that illustrated by our "Visualising Covid-19 Research" paper (see url). The aim is to show visually which themes global warming research worldwide is (and is not) focusing on, and how these themes are developing over time. The app would be programmed in a mixture of Java and JavaScript. Topic modelling algorithms may be LLM-based or conventional (LDA).

Visual comparison of international research
llms machine learning javascript graphics
Mike Chantler
Edinburgh
The web now makes available vast collections of research project descriptions (e.g. Gateway to research at https://gtr.ukri.org/). Different countries use different classification systems to code their national research portfolio and so it is difficult to directly compare research across nations. However, LLM-based Topic modelling provides a means to produce thematic analysis independent of any classification system. This project would therefore develop an interactive, graphics-based system to allow visual exploration and comparison of international research portfolio themes.

Visualising Net-Zero research
llms machine learning javascript graphics
Mike Chantler
Edinburgh
Use of LLM topic modelling and D3/JavaScript to produce a web app to visualise trends and themes in Net-Zero research.

Multidimensional data visualisation in d3.js
visualisation javascript d3 data analysis
Mike Chantler
Edinburgh
There is a confusing plethora of dimensionality reduction methods that project multidimensional data into 2D with the aim of making trends or clusters more obvious to the viewer. This project would develop a web app that would evaluate such dimensionality reduction methods, choose the best ones, and show the results in multiple windows to allow a user to quickly navigate potential reduction methods.

Computer Vision and Imaging topics - AI for Multi-modality Image Processing
computer vision deep learning multi-modality image fusion
Dongdong Chen
Edinburgh
Multi-modality image fusion is a technique that combines information from different sensors or modalities to produce a fused image that retains complementary features from each modality. However, effectively training such fusion models is challenging due to the lack of ground truth fusion data. This project focuses on implementing/developing advanced AI models for multi-modality image fusion, e.g. learn to Multi-modality Image Fusion without Groundtruth. If you’re interested in doing this project, please have a look at the papers listed below. https://arxiv.org/pdf/2305.11443.pdf

Computer Vision and Machine Learning Topics - Advanced AI models for Data Clustering
deep learning machine learning clustering analysis computer vision visualization
Dongdong Chen
Edinburgh
Clustering is the task of grouping unlabeled data points together based on their similarities. It is an unsupervised machine learning task, meaning that the data points do not have any labels associated with them. In contrast, classification is the task of assigning labels to data points. If the data points are labelled, then clustering can be used as a preprocessing step to improve the performance of classification algorithms. Deep learning (Neural Networks) can learn useful representations from data. This project will focus on implementing and extending the state-of-the-art deep learning models for clustering analysis.
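To ground the classical starting point that deep clustering models extend, here is a from-scratch k-means sketch on 2-D points (pure Python, toy data; a deep model would replace the raw coordinates with learned representations):

```python
import random

# Plain k-means: alternate between assigning points to their nearest
# centre and moving each centre to the mean of its assigned points.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assignment step: nearest centre by squared Euclidean distance
            i = min(range(k),
                    key=lambda c: (p[0] - centres[c][0]) ** 2
                                + (p[1] - centres[c][1]) ** 2)
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # update step: move centre to cluster mean
                centres[i] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centres

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(sorted(kmeans(pts, 2)))  # two centres, one per visible cluster
```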

Computer Vision and Imaging topics - Self-Supervised Learning for Image Reconstruction
computer vision image reconstruction equivariant imaging medical imaging signal processing self-supervised learning
Dongdong Chen
Edinburgh
In recent years, deep learning has achieved state-of-the-art performance in various imaging inverse problems, including medical imaging and computational imaging. These methods typically require pairs of signals and their corresponding measurements for training. However, in many imaging problems, we only have access to degraded or undersampled measurements of the underlying signals, which limits the applicability of learning-based approaches. The recent equivariant imaging (EI) framework overcomes this limitation by exploiting the invariance to transformations (translations, rotations, etc.) present in natural signals. EI is fully self-supervised and can recover the signals from their measurements alone. This project focuses on applying EI to different inverse problems including but not limited to medical image (e.g. CT/MRI) reconstruction and image restoration (super-resolution, denoising, deblurring, etc.). If you’re interested in doing this project, please have a look at the papers listed below.

Computer Vision and Imaging Topics: Generative Modeling and its Application in Image Processing
computer vision diffusion models image generation image processing
Dongdong Chen
Edinburgh
Recent advances in AI-based Image Generation spearheaded by Diffusion models such as Glide, Dalle-2, Imagen, and Stable Diffusion have taken the world of “AI Art generation” by storm. Generating high-quality images from text descriptions is a challenging task. It requires a deep understanding of the underlying meaning of the text and the ability to generate an image consistent with that meaning. In recent years, Diffusion models have emerged as a powerful tool for addressing this problem. This project will focus on applying Diffusion Models for image generation, such as painting generation and image fusion. https://arxiv.org/abs/2006.11239 https://arxiv.org/abs/2011.13456 https://arxiv.org/abs/2303.06840 https://arxiv.org/abs/2305.08995
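As background for the diffusion-model entry above, the forward (noising) process of DDPM-style models has a closed form: with a variance schedule beta_t and alpha_bar_t the cumulative product of (1 - beta_t), a noisy sample is x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The sketch below (pure Python, toy 3-element "image", standard linear schedule) illustrates this; it is a teaching aid, not a trainable model:

```python
import math, random

def alpha_bar(betas, t):
    """Cumulative product of (1 - beta) up to and including step t."""
    prod = 1.0
    for b in betas[: t + 1]:
        prod *= 1.0 - b
    return prod

def q_sample(x0, t, betas, rng):
    """Draw x_t directly from x_0 using the closed-form forward process."""
    ab = alpha_bar(betas, t)
    return [math.sqrt(ab) * x + math.sqrt(1 - ab) * rng.gauss(0, 1)
            for x in x0]

# Linear schedule from 1e-4 to 0.02 over 1000 steps (as in the DDPM paper).
betas = [0.0001 + i * (0.02 - 0.0001) / 999 for i in range(1000)]
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
xt = q_sample(x0, 999, betas, rng)  # by the final step, nearly pure noise
print(alpha_bar(betas, 999) < 1e-3)  # True
```

The reverse (denoising) network that the papers above train is what turns this process into a generator.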

Deep Learning for Imaging and Low-Level Computer Vision
deep learning machine learning signal processing image reconstruction computer vision
Dongdong Chen
Edinburgh
Inverse problems are ubiquitous in computer vision, image processing, and signal processing. Many research or industrial scenarios essentially involve solving inverse problems, such as image super-resolution, image denoising, computational photography, astronomical imaging, medical image reconstruction, etc. Deep learning is one of the main pillars of modern AI due to its powerful learning capabilities. Exploring deep learning and AI solutions for solving inverse problems is a frontier topic. This project will investigate the fascinating and cutting-edge topic of deep learning for inverse problems. The students will also explore and develop the "Deep Inverse" library (see https://deepinv.github.io/deepinv/) - a Pytorch-based open-source library developed by an international and growing team for solving inverse imaging problems with deep learning. Students will be encouraged to publish academic papers if their progress is excellent. DeepInverse: https://deepinv.github.io/deepinv/ Background about deep learning for inverse problems: https://ieeexplore.ieee.org/abstract/document/10004796?casa_token=OWX4u5zTQxUAAAAA:4xqldZHDeZTsQaJLArqxMQqdkHzTnAKi51x2LtAT5BVfo_zWsVCYotmynl08nnqSkvFheAgOBQ

Improving process mining methodology (research project) - no longer available
process modelling process mining
Jessica Chen-Burger
Edinburgh
Improving existing process mining methodology, no programming is needed, but the use of one or more process mining tool(s) is required. If you are interested in this project, please text me using Teams.

Process Modelling and Automation for housing management (software development project) - no longer available
process modelling python programming conceptual modelling
Jessica Chen-Burger
Edinburgh
To create a data and process model and a corresponding automated process system to read and execute this process model. Programming required. If you are interested in this project, please text me using Teams.

Sentiment Analysis of Twitter data for stock market trends (research project) - no longer available
sentiment analysis stock market investment natural language processing
Jessica Chen-Burger
Edinburgh
To analyse Twitter data for stock market trends; normally no programming is necessary, as sentiment analysis tools will be used. Two different approaches are available: 1. SA tool evaluation, 2. SA tool improvements. If you are interested in this project, please text me using Teams.

Supply chain optimisation problems (software development) - no longer available
supply chain management optimisation problems business models
Jessica Chen-Burger
Edinburgh
Optimise a Supply Chain, programming is required for this project. If you are interested in this project, please text me using Teams.

A Talent Finder system using Approximate Mapping (software development) - no longer available
ontology knowledge representation prolog python database
Jessica Chen-Burger
Edinburgh
Create a software system to enable approximate mapping based on semantics of topics and research areas, and other features to create a recommendation system for Talent Finder. If you are interested in this project, please text me using Teams.

Improving the Quality of Skin Lesion Data
Christos Chrysoulas
Edinburgh
Literature: discussing the lack of open-source, diverse, and well-labelled skin lesion data available for training VLMs. Implementation: clean and label an open-source skin lesion dataset (e.g. HAM10000 or one of the ISIC Challenge datasets). This work will be closely supervised by Tess Watt (PhD Student)

Compressing VLMs for Use on Constrained Devices
Christos Chrysoulas
Edinburgh
Literature: sourcing and comparing model compression techniques. Implementation: use a technique from the literature to compress a pretrained VLM for use on a constrained device (e.g. Raspberry Pi). This work will be closely supervised by Tess Watt (PhD Student)

Integrating sentiment analysis into e-commerce systems to reduce customer frustration with chatbots
artificial intelligence nlp sentiment analysis e-commerce customer experience
Santiago Chumbe
Edinburgh
Chatbots are bringing innovation to e-commerce communication with customers. E-commerce companies have been adopting chatbots to provide personalised consumer assistance, particularly chatbots based on Artificial Intelligence. However, no matter how well a chatbot has been trained and developed, there will always be cases where human intervention is necessary to resolve customer queries. This raises the question of how to identify the moment when the customer is becoming tired or frustrated by the answers given by the chatbot, and when it is therefore the right time to resort to human intervention.

Enhancing Chatbots with AI: Its relevance and impact on customer experience
artificial intelligence chatbots e-commerce customer experience
Santiago Chumbe
Edinburgh
The project examines the relevance of chatbots enhanced with AI to customer experience in the context of e-commerce. Based on an in-depth analysis of recent publications in this field, as well as our own field study, we first identify the main causes of poor user experience and frustration that customers experience as a result of their interaction with chatbots; second, we examine the AI-based techniques that have been proposed to solve these poor customer experiences with chatbots; and finally we propose a design for a chatbot enhanced with a bespoke AI-based technique for e-commerce.

Information Systems Research Based Project
methodologies systems design ssm rich pictures user centred design
Jenny Coady
Edinburgh
If you have taken my F21IF class in Sem 1 and are interested in Methodologies, Systems Design, SSM, Rich Pictures, User Centred Design, or something else in this area and would like to propose a project then come speak to me.

Applications of Machine Learning and/or optimization in sustainability
David Corne
Edinburgh
Modern optimization and machine learning (ML) methods are increasingly used, but still under-exploited, in real-life scenarios related to sustainability. Two such areas I work on are (i) optimization of vehicle fleet plans to reduce carbon emissions, (ii) ML to predict near-future energy demand (to help optimize the use of renewables in energy systems). If you are interested in either of these, we can probably discuss it and come up with a mutually interesting and challenging project.

Create a map building language
gis maps mapping compilation
David Corne
Edinburgh
Google maps, yahoo maps, and others provide APIs that make it possible to build custom maps. For example, if you know the locations of all the bottle recycling bins in your postcode, you could use one of these APIs to produce a nice map highlighting those bins with a custom gif. Or if you were interested in cycling, and had data about road elevations in areas of interest, you could draw a colour-coded visualization of the difficulty of cycling in those areas. Or, etc ... the world (literally) is your oyster. The finished 'map' is typically an html document full of javascript. However, all of this can be quite laborious to create, even (in fact especially) using the tools provided by the API. This project is to build a tool -- probably command line/linux -- which converts an input text file of simple instructions into the aforesaid html document. For example "10km square centred on Trafalgar Square, marker and label on each statue".
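To make the compiler idea concrete, here is a minimal sketch using a made-up one-command-per-line input language ("marker <lat> <lon> <label>") that emits an HTML document. The Leaflet calls in the template are illustrative assumptions about the target mapping API, not a tested integration:

```python
# Tiny "map language" compiler: each input line becomes a marker in the
# generated HTML/JavaScript document; the map is centred on the markers.

TEMPLATE = """<!DOCTYPE html>
<html><body><div id="map"></div>
<script>
var map = L.map('map').setView([{lat}, {lon}], 13);
{markers}
</script></body></html>"""

def compile_map(source):
    markers, lats, lons = [], [], []
    for line in source.strip().splitlines():
        cmd, lat, lon, label = line.split(maxsplit=3)
        assert cmd == "marker", f"unknown command: {cmd}"
        lats.append(float(lat)); lons.append(float(lon))
        markers.append(
            f"L.marker([{lat}, {lon}]).bindPopup({label!r}).addTo(map);")
    # centre the map on the mean of all marker positions
    return TEMPLATE.format(lat=sum(lats) / len(lats),
                           lon=sum(lons) / len(lons),
                           markers="\n".join(markers))

html = compile_map("marker 55.95 -3.19 Bottle bank\nmarker 55.96 -3.20 Statue")
print("L.marker" in html)  # True
```

The real tool would add commands for regions, labels, icons, and the "10km square centred on ..." style of geographic query.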

Transport system simulation to support emission reduction
simulation data mining sustainability machine learning
David Corne
Edinburgh
An ongoing project called TransiT (https://transit.ac.uk/) is exploring how 'Digital Twins' can help the UK find ways to redesign aspects of its transport systems to reduce carbon emissions. In the transport-systems context, a 'Digital Twin' is basically a discrete-event simulation which might be used to, for example, investigate the effects of a new traffic light system at a major junction, or the impact of more frequent and cheaper bus services on traffic congestion. This student project will build and/or investigate a transport system or component in connection with the wider TransiT project. Details to be explored and developed in alignment with the student's particular skills or interests. The project will likely make use of (or further engineer) existing open source simulators such as MATSim.

Statistical and machine learning methods for temporal data for understanding weather variables affecting energy consumption
machine learning time series multivariate analysis
Sarat Dass
Malaysia
The aim of this project is to discover relationships between energy consumption and weather variables using statistical models and machine learning methods for temporal data. The goal is to correlate consumption habits with weather conditions such as temperature, humidity and light. Apart from the temporal data analysis, energy measurements are also available for a group of buildings which are close to each other, which provides a measure of variability of consumption across buildings. Machine and deep learning methods are to be developed to understand all sources of variability and for making predictions. The dataset to be investigated also contains missing values, so different imputation techniques will be investigated along with their effects on predictions.
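As a reference point for the imputation work mentioned above, the simplest baseline is forward-fill, sketched here on a toy temperature series (illustrative data; model-based imputation would be compared against this):

```python
# Forward-fill imputation: carry the last observed value over any gaps
# (None entries). Leading gaps remain None, as there is nothing to carry.

def forward_fill(series):
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out

temps = [18.2, None, None, 19.0, None, 21.5]
print(forward_fill(temps))  # [18.2, 18.2, 18.2, 19.0, 19.0, 21.5]
```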

Safety in Large Language Models
large language models chatgpt safety natural language processing machine learning
Tanvi Dinkar
Edinburgh
As Large Language Models (LLM) like ChatGPT become increasingly used in the real world, how believable the output sounds versus how safe the output is could make a difference in whether the user follows bad, or even potentially dangerous advice. For example, an LLM may confidently tell the user that if they saw a poisonous mushroom in the woods, then they should eat it [1]. It is very important to investigate the difference between HOW something was said versus WHAT was actually said in LLMs, particularly when thinking about safety-critical queries. For example, one's propensity to believe something may reduce when "bleach is the most effective solution" is instead generated as "I don’t know, but I've heard that bleach could be an effective solution" [2]. The "I don't know, but" is known as a hedge in language, i.e. a word or a phrase used in a sentence to express ambiguity or indecisiveness. On the other hand, there are other methods that can be used to make a system sound more confident, for example offering a rationale/benefit for bad advice - e.g. telling the user to eat the poisonous mushroom because it "Improves Knowledge: Tasting a mushroom can help to improve your knowledge of mushrooms and their flavours" [3]. This project will explore how believable the output of an LLM sounds, focusing on the effect of hedges, rationales and so on. The project can be tailored to suit the students’ interests, i.e. either focusing on training and assessing LLMs, or human data collection and evaluation.
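A first step in such a study could be simply flagging hedging cues in model output. The sketch below uses a small hand-written phrase list and naive substring matching; a real project would use annotated data and a trained classifier, and the lexicon here is illustrative only:

```python
# Count hedge phrases in a piece of text (naive substring matching).

HEDGES = ["i don't know", "i've heard", "might", "possibly",
          "i'm not sure", "could be", "perhaps"]

def hedge_score(text):
    """Return (number of hedge cues found, the cues themselves)."""
    t = text.lower()
    hits = [h for h in HEDGES if h in t]
    return len(hits), hits

confident = "Bleach is the most effective solution."
hedged = ("I don't know, but I've heard that bleach could be "
          "an effective solution.")
print(hedge_score(confident)[0], hedge_score(hedged)[0])  # 0 3
```

Comparing such scores against human believability ratings would be one way to connect the HOW to the WHAT.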

Formalizing Computational Models using the Coq Proof Assistant
Marko Doko
Edinburgh
Very expressive type systems commonly used by functional programming languages can be used to specify very complex data types. Those data types can be so complex that they allow encoding of entire mathematical theorems inside a single data type! This expressiveness of type systems enables creation of tools such as the Coq proof assistant (https://coq.inria.fr/). These tools allow the user to specify mathematical structures, state results about those structures (lemmas, theorems), and prove those results correct. Using proof assistants allows us to create machine-checked proofs, which we can trust with an extremely high degree of confidence, much higher than if the proofs have only been looked over by humans. The aim of this project is to serve as an introduction to using proof assistants through first specifying a mathematical object, and then proving some basic properties of it. The project consists of picking one computational model (e.g., finite automata, λ-calculus, Turing machines), learning enough about the Coq proof assistant to specify the selected computational model within Coq, and proving some fundamental properties about it. You can choose to focus on any computational model you like - pick the one you're the most familiar with, or the one which interests you the most. The project is highly flexible in terms of scope. It can be molded according to the student's ambition and interest throughout the project's duration. What will you get out of this project? In terms of practical skills, you will gain experience in writing formal specification, using dependent types, and programming in functional languages. In terms of building up your wider understanding, you will get a first-hand experience of the deep interconnectedness between mathematics and programming. If you enjoy programming, but have always found mathematical proofs difficult and too abstract, this project will give you a completely different outlook on proofs. 
If you are someone who always had a knack for constructing proofs, you will gain an even deeper appreciation for programming.

Software for Interactive Exploration of Concurrent Programs' Executions
Marko Doko
Edinburgh
When it comes to defining possible behaviors of multithreaded programs accessing shared memory locations, modern programming languages often come with execution models which are (for a multitude of reasons) rather intricate. Those models represent possible executions of a program using graphs whose nodes represent actions taken by the program, and various types of edges represent the relations among the actions. Getting a good intuitive understanding of the models of concurrent executions is not easy because people may find it difficult to visualize how the graphs which represent the executions are being built. This project will focus on a version of the execution model used by the C/C++ language family. The task is to develop software which will enable users to input an example program and interactively explore its executions, i.e., the software will help the user build various graphs which represent legal executions of the given program. Think of this software as a teaching aid. The ideal use case is to help people learn how to think about the modern execution models for concurrent programs. Note that you do not need to know much (or anything at all for that matter) about C or C++ to work on this project. The implementation language of the software being developed can be any language you feel comfortable with.

Parsing custom syntax for data types
Marko Doko
Edinburgh
This project is part of a bigger research project to improve on how proofs about programming languages are done. You are not asked to do any proving yourself; the goal is to use the existing parser library to allow the user to specify a data type with an easy custom syntax. The parser converts this syntax into a simple data structure that can then be used by the rest of the code. The implementation would be in Standard ML, which is basically a simple version of OCaml. For more information, please contact Jan van Brügge (jsv2000@hw.ac.uk).

Converting primitive recursion to folding in Isabelle/HOL
Marko Doko
Edinburgh
This project is part of a bigger research project to improve on how proofs about programming languages are done. Every primitive recursive function can be expressed as a fold. The goal of the project is to investigate how this can be applied when bound variables are involved. There already exists a fold that automatically renames variables "out of the way", so the main question is what proofs the user needs to supply and how they could supply them. You do not need to have previous experience with theorem proving, but you have to be interested in it. For more information, please contact Jan van Brügge (jsv2000@hw.ac.uk).
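The fact the project builds on, that every primitive recursive function can be phrased as a fold, can be illustrated outside Isabelle/HOL. Here factorial, usually defined by recursion on n, is re-expressed as a fold in Python (the Isabelle development works at the level of datatype definitions, not concrete functions like this):

```python
from functools import reduce

# Factorial by explicit primitive recursion...
def fact_rec(n):
    return 1 if n == 0 else n * fact_rec(n - 1)

# ...and the same function as a fold over [1..n], where the accumulator
# carries the partial product instead of the call stack doing so.
def fact_fold(n):
    return reduce(lambda acc, k: acc * k, range(1, n + 1), 1)

print([fact_rec(k) == fact_fold(k) for k in range(6)])  # all True
```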

Constructing the unfold of an infinite datatype in Isabelle/HOL
Marko Doko
Edinburgh
This project is part of a bigger research project to improve on how proofs about programming languages are done. The dual to normal, finite data types are (potentially) infinite codata types. Just like normal data types can be deconstructed with a fold, codata types are constructed with an unfold. There already exists a proof on how to construct the unfold, but it needs to be updated and generalized. There also exists an updated and generalized proof for the fold, so parts of it can be adapted for the unfold as well. You do not need to have previous experience with theorem proving, but you have to be interested in it. For more information, please contact Jan van Brügge (jsv2000@hw.ac.uk).

Formal treatment of Manufactoria's computation model
Marko Doko
Edinburgh
In 2010, the Flash browser game Manufactoria [1] appeared and quickly gained a cult status. After the end of support for Flash at the end of 2020, a remake of the game was developed and released in 2022 [2, 3]. Manufactoria tasks the player with creating machines which do computations by manipulating a queue (reading from the head and writing to the tail). Initially, you should get familiar with Manufactoria's programming model enough to be able to implement it in a theorem prover, such as Coq [4]. Once the programming model has been implemented, we will look into proving some basic properties of the model, aiming towards proving that Manufactoria's model is as strong as the computational model provided by Turing machines. You will not be expected to spend money on the game. A copy will be provided for you. [1] http://pleasingfungus.com/Manufactoria/ [2] https://pleasingfungus.itch.io/manufactoria-2022 [3] https://store.steampowered.com/app/1276070/Manufactoria_2022/ [4] https://coq.inria.fr/
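Before formalising the model in Coq, it helps to prototype it. The sketch below is an assumed, simplified rendering of Manufactoria-style queue machines (the transition format and toy program are made up for illustration; the real game's model has more machinery):

```python
from collections import deque

# A queue machine: read a symbol from the head of the queue and, based on
# the current state, append symbols to the tail and move to a new state.

def run(program, tape, start, accept, max_steps=10_000):
    """program: {(state, symbol): (next_state, symbols_to_append)};
    a missing entry means the machine halts without accepting."""
    q, state = deque(tape), start
    for _ in range(max_steps):
        if state == accept:
            return True, "".join(q)
        if not q:
            return False, ""
        key = (state, q.popleft())
        if key not in program:
            return False, "".join(q)
        state, out = program[key]
        q.extend(out)
    return False, "".join(q)

# Toy machine: accept iff the input starts with "b", passing the rest through.
prog = {("s", "b"): ("ok", ""), ("s", "r"): ("dead", "")}
print(run(prog, "brr", "s", "ok"))  # (True, 'rr')
```

Proving Turing-completeness would amount to showing such machines can simulate a Turing machine's tape with this single queue.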

Formalization of the algebraic structure of physical units in Coq
coq theorem proving formalization of mathematics
Marko Doko
Edinburgh
The main goal of the project is to create a formalization of the algebraic structure of physical units in an assisted theorem prover. After specifying the structure, some fundamental theorems should be established, and an example (preferably the SI system) should be encoded. Goals with which the project can be extended include: - developing a theory of unit prefixes - developing a theory of conversions between unit systems
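The algebraic structure in question can be previewed informally before any Coq work: a unit is a map from base dimensions to integer exponents, and unit multiplication is exponent-wise addition (a free abelian group). A small Python illustration of this structure (names and encoding are the author's to choose in the formalisation):

```python
from collections import Counter

# Units as dimension -> integer exponent maps; multiplication adds
# exponents, inversion negates them, and zero exponents are dropped.

def mul(u, v):
    out = Counter(u)
    out.update(v)  # Counter.update adds counts
    return {d: e for d, e in out.items() if e != 0}

def inv(u):
    return {d: -e for d, e in u.items()}

metre, second = {"m": 1}, {"s": 1}
velocity = mul(metre, inv(second))   # m·s⁻¹
accel = mul(velocity, inv(second))   # m·s⁻²
newton = mul({"kg": 1}, accel)       # kg·m·s⁻²
print(newton)  # {'kg': 1, 'm': 1, 's': -2}
```

The theorems to formalise would then include the group laws and, for the extensions, the behaviour of prefixes and conversions under this structure.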

Developing a gesture library for a humanoid robot
Christian Dondrup
Edinburgh
Humanoid robots are often used for human-robot interaction. As part of the National Robotarium which is currently being built at the Heriot-Watt campus, we purchased several ARI robots [1]. These robots shall be used for human-robot interaction in a wide range of demos and experiments, but in order to do so more effectively, they require a certain repertoire of gestures. These gestures are arm and head motions that combined would create beat, deictic, iconic or metaphoric gestures that can be played back while the robot is talking to a human. Your task would be to develop these gestures using the simulator and the real robot. This would be followed by an evaluation of those gestures with humans to make sure that they are understood correctly and look natural. [1] https://pal-robotics.com/robots/ari/

Teaching support chatbot
Christian Dondrup
Edinburgh
As I am teaching a large UG class (SD2), one of the struggles I face is being able to answer all the student queries I receive. Moreover, most of these queries could be answered by looking at the slides. In this project, I would like you to develop a chatbot that is able to answer these queries for me. This chatbot can be based on a Large Language Model such as ChatGPT or something smaller and custom made using something like RASA (https://rasa.com/). The main functionality I would like to have is to be able to feed it PDFs of my slides so that it can automatically learn how to answer questions about them. This project involves the latest in natural language processing and machine learning. This should be tested with real users and evaluated based on efficiency and likeability.

Bio-Inspired Path Planning for Robotic Arm
robotics robot arm industrial robotics industrial robot
Christian Dondrup
Edinburgh
This project should produce a lecture on the state-of-the-art in path planning for a robot arm. The arm itself does not need to have sensors and the path planning approach can be offline instead of online. The general aim is to identify the most promising approach for offline robot arm path planning from the literature and to create a lecture that introduces this approach. Ultimately, I would like to use the outcome of this to update the content of the Intelligent Robotics course, which currently focuses on Ant Colony Optimisation (ACO) for one of its lectures. Since the second part of the course focuses on bio-inspired robotics, it would be nice if this lecture could also focus on a bio-inspired approach to robotic arm path planning like ACO.

Detecting floods from satellite imagery
deep learning neural networks computer vision
Heba Elshimy
Dubai
Using computer vision and deep learning to detect flood extent in satellite imagery.

Parkinson's detection from speech recordings
machine learning signal processing
Heba Elshimy
Dubai
This study aims to detect the onset of Parkinson's disease from speech recordings.

Heart disease detection from smartwatches
signal processing machine learning deep learning
Heba Elshimy
Dubai
Analysing smartwatch data to detect abnormalities in heart rate and predict various heart diseases (arrhythmia as an example).

Traffic flow prediction
machine learning regression spatiotemporal data
Heba Elshimy
Dubai
The study aims to train a neural network to predict the traffic flow a few minutes into the future based on historical data.

Conversational AI
dialogue systems conversational ai natural language processing natural language understanding dialogue management
Arash Eshghi
Edinburgh
This project will develop and evaluate a conversational AI system in a task-based domain. The student will learn about concepts and techniques in conversational AI design and implementation, including Natural Language Understanding, Natural Language Generation, and Dialogue Management, and evaluation of dialogue systems. The focus of the project is left open initially and the student can focus on any aspect or sub-task. The dialogue system will be evaluated with human subjects recruited from the university, and it will use both subjective and objective evaluation metrics.

Applying confusion and diffusion to create strong ciphers
cyber security cryptography
Marwan Fuad
Edinburgh
Diffusion and confusion are two fundamental design principles of block ciphers. The purpose of the project is to investigate and compare how popular encryption algorithms approach these two principles. The second purpose, for a higher level of difficulty, is to suggest new ways to enhance these principles in one or more of these popular ciphers. You are also expected to create a code/demo/interactive visual to demonstrate the above tasks.
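The kind of demo envisaged above could start from a toy substitution-permutation round: the S-box supplies confusion (non-linear symbol mixing) and the bit permutation supplies diffusion (spreading each bit's influence). This is a teaching sketch with an arbitrary S-box, emphatically not a secure cipher:

```python
# One 16-bit SP round: key mixing, then 4-bit S-box substitution
# (confusion), then a bit transposition (diffusion).

SBOX = [0x6, 0x4, 0xC, 0x5, 0x0, 0x7, 0x2, 0xE,
        0x1, 0xF, 0x3, 0xD, 0x8, 0xA, 0x9, 0xB]
PERM = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]

def round16(block, key):
    x = block ^ key                       # key mixing
    nibbles = [(x >> s) & 0xF for s in (12, 8, 4, 0)]
    x = 0
    for n in nibbles:                     # confusion: 4-bit S-box
        x = (x << 4) | SBOX[n]
    out = 0
    for i, p in enumerate(PERM):          # diffusion: bit i -> position p
        out |= ((x >> (15 - i)) & 1) << (15 - p)
    return out

print(hex(round16(0x0000, 0x0000)))  # 0xff0
```

An interactive visual could animate how a single flipped input bit spreads across the block as rounds are iterated.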

Detecting Fake Accounts on Social Media Platforms
fake accounts detection machine learning social media security data analysis cyber security
Marwan Fuad
Edinburgh
This project will focus on creating an application for detecting fake accounts on social media platforms. The system will analyze various features such as profile information, posting frequency, network connections, account metadata, interaction patterns, and other relevant features, to classify accounts as either genuine or suspicious. The project will involve data collection, feature engineering, model selection, and testing on real-world datasets. The project will also explore the ethical implications of such technology and propose ways to ensure user privacy and data security. The outcome of this project should be of a level suitable for a peer-reviewed publication

Human or bot?
bot detection nlp machine learning social media text classification cybersecurity ai ethics
Marwan Fuad
Edinburgh
With the increasing prevalence of AI-driven bots in online spaces, discerning between human- and bot-generated text has become both difficult and crucial. This project will explore natural language processing (NLP) techniques and machine learning to develop a system capable of detecting bot-generated content. The proposed solution will involve collecting a diverse dataset of human and bot conversations, training a model to identify key linguistic features and patterns, and developing an interface to demonstrate the detection capability. The final deliverable could be an application, a web-based tool, or a code repository that provides real-time analysis and classification of text inputs. A similar project resulted in a peer-reviewed publication, and this is the expected outcome of this project as well

Plagiarism Detection in Student Coursework Using Digital Forensics
digital forensics plagiarism detection file metadata analysis academic integrity
Marwan Fuad
Edinburgh
Plagiarism detection in academic settings has traditionally relied on textual comparison tools (e.g. TurnItIn), which often fail to catch more sophisticated forms of academic plagiarism. This project proposes to incorporate digital forensics to detect plagiarism. The proposed tool will analyze the file metadata for signs of content manipulation, such as creation and modification timestamps, reuse, file paths, and embedded data (e.g., hidden images and embedded objects), to uncover potential evidence of plagiarism. By examining these aspects, the tool will identify discrepancies that may indicate that a file has been copied, altered, or otherwise manipulated. This approach will offer an additional layer of scrutiny, making it more difficult for students to bypass plagiarism checks. The final deliverable will be a functional prototype that can be used by educational institutions to enhance the integrity of student assessments
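As one concrete example of the metadata checks the tool could run: .docx files are zip containers, and docProps/core.xml stores the author and creation/modification dates independently of the filesystem timestamps reported by os.stat(). The sketch below builds a minimal stand-in container rather than reading a real document, so the XML content is illustrative:

```python
import io
import zipfile

def core_properties(docx_bytes):
    """Return the raw docProps/core.xml of an OOXML document, given its bytes."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        return z.read("docProps/core.xml").decode("utf-8")

# Build a minimal stand-in .docx-like container for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("docProps/core.xml",
               "<cp:coreProperties><dc:creator>alice</dc:creator>"
               "<dcterms:created>2019-01-01T00:00:00Z</dcterms:created>"
               "</cp:coreProperties>")
xml = core_properties(buf.getvalue())
print("alice" in xml, "2019" in xml)  # True True
```

A suspicion heuristic might then flag, for instance, a "freshly written" essay whose embedded creation date long predates the assignment window, or a creator name that does not match the submitting student.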

Detection of AI-Generated Content in Student Coursework Using Machine Learning
ai-generated content detection machine learning academic integrity nlp text classification chatgpt
Marwan Fuad
Edinburgh
This project aims to address the increasing challenge of AI-generated content in academic settings by developing a detection system that can differentiate between human-written and AI-generated coursework. The project will involve collecting a dataset of human and AI-generated text samples, training a machine learning model on these datasets, and fine-tuning the model to recognize patterns specific to AI-generated content. The outcome will be an application that instructors and institutions can use to uphold academic integrity. The project will also explore the ethical implications and limitations of such a system, ensuring it is both fair and effective.
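One commonly cited signal in this area is "burstiness": human writing tends to mix short and long sentences, while some generated text is more uniform. The sketch below, with an assumed threshold, is a crude illustrative heuristic only, not a reliable detector, and a real system would combine many such signals in a trained model.

```python
import re
from statistics import pstdev

def burstiness(text):
    """Population standard deviation of sentence lengths (in words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths)

def looks_generated(text, threshold=1.0):
    """Flag text whose sentence lengths are suspiciously uniform.
    The threshold is an arbitrary illustrative assumption."""
    return burstiness(text) < threshold
```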

Semantic Search of Web Pages Content
natural language processing semantic search add-on speech recognition
Marwan Fuad
Edinburgh
In many cases the user is looking for information in a long webpage, but this information could appear in different forms and formats. For example, they could be looking for a deadline. This could be stated as an explicit date, which could come in a wide variety of formats (e.g. 10 Sep 2024, 10 September 2024, 10/09/2024, 10/9/2024, etc., in addition to the different British and American date conventions), or the date could be stated as “in three weeks” or “before classes start”. The purpose of this project is to create an add-on that performs semantic search within a webpage: the user will enter the item they are looking for, and the app will analyse it, generate all possible “forms” of this item, and look for them in the webpage. It is very important to understand that the date example above is only an example; the app is expected to handle much more than that. It is suggested that two students work on this project: one on the natural language processing aspect, and the other on the software engineering side. In the case of three students, the third student will add another layer to the software, namely speech recognition, so that the user can prompt the software to perform the semantic search interactively using voice commands.
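The date example can be sketched directly: given a target date, generate its common textual renderings in both British (day-first) and American (month-first) conventions and search the page text for them. This is only the easy half of the project; the set of formats below is an illustrative assumption, and relative phrases like "in three weeks" would need genuine NLP.

```python
from datetime import date

def date_variants(d):
    """Generate common textual renderings of a date.
    Month names follow the C locale; the variant set is illustrative, not exhaustive."""
    return {
        d.strftime("%d %b %Y"),                  # 10 Sep 2024
        d.strftime("%d %B %Y"),                  # 10 September 2024
        d.strftime("%B %d, %Y"),                 # September 10, 2024
        f"{d.day}/{d.month}/{d.year}",           # 10/9/2024 (British, unpadded)
        f"{d.day:02d}/{d.month:02d}/{d.year}",   # 10/09/2024 (British, padded)
        f"{d.month}/{d.day}/{d.year}",           # 9/10/2024 (American)
        d.isoformat(),                           # 2024-09-10
    }

def find_date_mentions(page_text, d):
    """Return the variants of a target date that occur in a page's text."""
    return sorted(v for v in date_variants(d) if v in page_text)
```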

Simulation of the Predator Prey Model
predator prey model robotics games
Marwan Fuad
Edinburgh
The predator-prey model addresses an important problem in theoretical ecology and has applications in robotics, economics, and other domains. The purpose of the project is to create a robotic simulation, a game, or software that simulates this model under different settings (which will be given to the student). This will help researchers working on the predator-prey problem.
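The core dynamics can be sketched in a few lines: a forward-Euler integration of the classic Lotka-Volterra equations. The parameter values used in testing are illustrative; choosing integrators, parameters, and settings is part of the project itself.

```python
def lotka_volterra(prey0, pred0, alpha, beta, delta, gamma, dt=0.001, steps=20000):
    """Forward-Euler simulation of the classic Lotka-Volterra equations:
        dx/dt = alpha*x - beta*x*y    (prey)
        dy/dt = delta*x*y - gamma*y   (predator)
    Returns the trajectory as a list of (prey, predator) pairs."""
    x, y = prey0, pred0
    traj = [(x, y)]
    for _ in range(steps):
        dx = (alpha * x - beta * x * y) * dt
        dy = (delta * x * y - gamma * y) * dt
        x, y = x + dx, y + dy
        traj.append((x, y))
    return traj
```

A small step size is used because forward Euler drifts on this oscillatory system; a more careful simulation would use a symplectic or Runge-Kutta integrator.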

Machine Learning Applications in Immunology and Personalized Medicine
immunoinformatics deep learning
Marwan Fuad
Edinburgh
Prediction of B- and T-cell epitopes has long been the focus of immunoinformatics. Based on information from whole-genome sequencing, exome sequencing, and RNA sequencing, it is possible to characterize an individual's human leukocyte antigen (HLA) allotype. New opportunities for translational applications of epitope prediction have arisen, such as epitope-based design of prophylactic and therapeutic vaccines, and personalized cancer immunotherapy. Several approaches based on Artificial Neural Networks (ANN) and Support Vector Machines (SVM) have been successfully applied to HLA class I binding prediction. Similar approaches have been applied to HLA class II binding prediction and to B-cell epitope prediction, though with less success.

Easy To Remember
interestingness mining
Marwan Fuad
Edinburgh
When starting a business and choosing a phone number from the available ones, a number like “200200200” is much preferable to “207947633”, as it is interesting and easy to remember; if the former is no longer available, then “200201202”, “123454321”, or “121251414” would all be preferable to “207947633”. The objective of this project is to create rules defining what makes a phone number “interesting” from the available ones in a dataset, then apply these rules to offer a client one or several interesting phone numbers, ranked from most interesting to least interesting. Notice that you will need to define what “interesting” means and, more challengingly, how to compare two numbers generated using different rules. For a higher level of difficulty, perform the above project on car registration numbers; notice that this will include elements from natural language processing.
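A sketch of what such a rule set might look like. The rules and weights below are illustrative assumptions; defining and justifying them, and making scores from different rules comparable, is precisely the research question of the project.

```python
def _longest_run(number):
    """Length of the longest run of one repeated digit."""
    best = cur = 1
    for a, b in zip(number, number[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

def interestingness(number):
    """Score a phone number by a few hand-written "interestingness" rules."""
    score = 0
    n = len(number)
    # Rule 1: repeated fixed-size blocks, e.g. 200200200.
    for size in (2, 3, 4):
        if n % size == 0 and n // size > 1:
            blocks = [number[i:i + size] for i in range(0, n, size)]
            if len(set(blocks)) == 1:
                score += 10 * size
    # Rule 2: palindromes, e.g. 123454321.
    if number == number[::-1]:
        score += 20
    # Rule 3: long runs of a single digit.
    score += 5 * (_longest_run(number) - 1)
    # Rule 4: constant digit-to-digit difference, e.g. 123456789.
    digits = [int(c) for c in number]
    if len({b - a for a, b in zip(digits, digits[1:])}) == 1:
        score += 25
    return score

def rank_numbers(numbers):
    """Rank available numbers from most to least interesting."""
    return sorted(numbers, key=interestingness, reverse=True)
```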

Using Videos to Enhance Emotion Recognition in Speech
speech recognition emotion recognition disambiguation videos
Marwan Fuad
Edinburgh
Although the tone of a voice is an important factor in emotion recognition in speech, it can sometimes be insufficient; the same tone can be used to express surprise, disapproval, anger, or even humor. The purpose of the project is to enhance emotion recognition in speech using videos.

Develop an application for the monitoring and control of social robots.
human-robot interaction social robots woz interaction design
Daniel Hernandez Garcia
Edinburgh
As part of the National Robotarium, a joint initiative between Heriot-Watt University and the University of Edinburgh to further research in Human-Robot Interaction, we have recently acquired a suite of new robots. One of these robots is ARI (https://pal-robotics.com/robots/ari/), a humanoid robot for social interaction. To demonstrate the abilities of the robots to visitors and potential collaborators, or to run simple Human-Robot Interaction experiments, we need an interface for robot control that is easy to use and reliable. To this end, in this project you will develop an application (GUI) to allow the control of the PAL Robotics ARI robot for running demonstrations in the lab or conducting human-robot interaction experiments. You will have the option to develop one of two types of application: a web-based interface, running on the robot's touchscreen in its chest, for use by people interacting with the robot; or a GUI for the "experimenter" to monitor the robot's performance and configure the robot's plan and goals for the interaction. Either application should provide some of the following features: displaying content on the robot's screen, monitoring the robot's sensors, executing predefined movements (gestures) or actions, and configuring the robot's behavior.

Social Navigation Strategies for Social Robots
Daniel Hernandez Garcia
Edinburgh
Robots are becoming more prevalent in our society, and being able to move through a crowded room or corridor is one of the most fundamental and basic tasks these robots should be able to accomplish. Social navigation in robotics primarily involves guiding mobile robots through human-populated areas, balancing pedestrian comfort with efficient path-finding [1]. Looking at current applications of robots, it is easy to see that a lot of them still work in fenced-off areas or even inside cages. One of the most commonly used examples is logistics, where robots either pick and place items on shelves or, because picking up things is hard, drive the whole shelf to a person to do the picking [2]. All of this happens in very structured and unpopulated environments. Existing navigation systems still face real-world challenges when deployed in the wild [3]. Although progress has been made in this field, a solution for the seamless integration of robots into pedestrian settings remains elusive [4]. If we ever want robots to move outside of these fenced-off areas, we need to make sure that they move in a safe manner. This is normally achieved by off-the-shelf navigation and localisation methods such as the ROS [5] navigation stack [6]. Of similar importance, however, is that humans feel safe around the moving robot: movement being safe and being perceived as safe are not always the same. In this project, we are looking for someone who wants to enable a robot (for example, the ARI robot [7]) to navigate a room with humans reliably, safely, and with perceived safety. This will require setting up the ROS navigation stack on the system and implementing a human-aware planner. There are off-the-shelf methods that can be used [8][9], but more advanced solutions can be implemented as well [3][4]. The resulting system would then be evaluated with participants, either face to face or using videos [10].
[1] Core challenges of social robot navigation: A survey. https://arxiv.org/abs/2103.05668 [2] https://www.youtube.com/watch?v=HSA5Bq-1fU4 [3] Augmented Social Force Model for Legged Robot Social Navigation https://rpl-cs-ucl.github.io/ASFM/ [4] Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation https://arxiv.org/abs/2209.10780 [5] https://www.ros.org/ [6] http://wiki.ros.org/navigation [7] https://pal-robotics.com/robots/ari/ [8] http://wiki.ros.org/social_navigation_layers [9] https://docs.nav2.org/ [10] A Protocol for Validating Social Navigation Policies https://arxiv.org/abs/2204.05443
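To give a flavour of the human-aware planning involved, here is a heavily simplified social-force-style step, loosely in the spirit of [3]: the robot is attracted towards its goal and repelled exponentially from nearby humans, so paths bend around people rather than grazing them. The gains, interaction radius, and first-order dynamics are all illustrative assumptions; a real deployment would use the ROS navigation stack with a human-aware cost layer [8].

```python
import math

def social_force_step(robot, goal, humans, dt=0.1, k_goal=1.0, k_social=2.0, radius=1.0):
    """One integration step of a simplified social-force planner.
    robot/goal are (x, y) tuples; humans is a list of (x, y) positions."""
    rx, ry = robot
    gx, gy = goal
    # Attractive force: unit vector towards the goal.
    d = math.hypot(gx - rx, gy - ry) or 1e-9
    fx, fy = k_goal * (gx - rx) / d, k_goal * (gy - ry) / d
    # Repulsive force from each human, decaying exponentially with distance.
    for hx, hy in humans:
        dh = math.hypot(rx - hx, ry - hy) or 1e-9
        mag = k_social * math.exp(-dh / radius)
        fx += mag * (rx - hx) / dh
        fy += mag * (ry - hy) / dh
    return rx + fx * dt, ry + fy * dt
```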

Multi-party Social Robot Interactions with LLMs
Daniel Hernandez Garcia
Edinburgh
The ability of Socially Assistive Robots (SARs) to handle dialogue with multiple people at the same time is critical to their adoption in public spaces. Tasks that are typically trivial in one-to-one interactions become considerably more complex when multiple users are involved [1, 2]. Taking part in social interactions with multiple participants, i.e. more than two, constitutes a challenging task for an autonomous system to manage. In these situations the system must interpret and understand different social cues from multiple people at the same time, while also employing proper social strategies for addressing different users and regulating the interaction. Building multi-party conversation systems presents challenges that do not exist in dyadic conversations, since the structure of the dialogue context is more complicated and the generated responses rely heavily on both interlocutors (i.e., speaker and addressee) and the history of the conversation [3]. For multi-party human-robot interactions, turn-taking and the recognition of speakers and addressees remain an open challenge [4]. The work of Skantze [5] provides an overview of research in modelling turn-taking, including end-of-turn detection, handling of user interruptions, and generation of turn-taking cues with voice assistants and social robots. The use of LLMs also holds significant promise for improving HRI [6]. The main goal of our work is the development of a multi-party conversational system that would allow situated social interactions involving a robot and multiple users. To do so, we will want to connect the language understanding capabilities of an LLM with a robot's multi-modal perception (audio and visual) and action generation capabilities. The project will seek to evaluate the performance of a multi-party system in a user evaluation study following the experimental methodology designed by [7].
[1] D. Traum, “Issues in multiparty dialogues,” in Advances in Agent Communication: International Workshop on Agent Communication Languages, ACL 2003, Melbourne, Australia, July 14, 2003. [2] “Who Says What to Whom: A Survey of Multi-Party Conversations,” https://www.ijcai.org/proceedings/2022/768 [3] “HeterMPC: A Heterogeneous Graph Neural Network for Response Generation in Multi-Party Conversations,” https://arxiv.org/abs/2203.08500 [4] K. Mahajan and S. Shaikh, “On the need for thoughtful data collection for multi-party dialogue: A survey of available corpora and collection methods,” https://aclanthology.org/2021.sigdial-1.36 [5] “Turn-taking in Conversational Systems and Human-Robot Interaction: A Review,” https://www.sciencedirect.com/science/article/pii/S088523082030111X#bib0118 [6] “Understanding large-language model (LLM)-powered human-robot interaction,” https://doi.org/10.1145/3610977.3634966 [7] N. Gunson, A. Addlesee, D. Hernandez Garcia, M. Romeo, C. Dondrup, and O. Lemon, “A holistic evaluation methodology for multi-party spoken conversational agents,” in ACM International Conference on Intelligent Virtual Agents (IVA ’24). https://researchportal.hw.ac.uk/en/publications/a-holistic-evaluation-methodology-for-multi-party-spoken-conversa

Projects on Foundational Models for Human-Robot Interaction.
Daniel Hernandez Garcia
Edinburgh
Explore state-of-the-art Foundation Models [1] for real-world robotics tasks. Foundation models have unlocked major advancements in AI, and we want to explore their use with and for Human-Robot Interaction [2]. Contact me (dh143@hw.ac.uk) if you want to discuss a possible project on Foundation Models for HRI. [1] https://aws.amazon.com/what-is/foundation-models/ [2] Carolina Parada. 2024. What Do Foundation Models have to Do With and For HRI? https://dl.acm.org/doi/10.1145/3610977.3638460

Machine Learning Applications in transportation
machine learning time series prediction demand forecasting
Neamat El Gayar
Dubai
Various topics are possible, related to: - Time-series analysis and forecasting of demand (Metro/Tram) - Traffic incident analysis and recommendations. Possible collaboration and internship with RTA (details to be communicated later). Work with real data. Projects allocated based on competitive merit.

AI democratization
Neamat El Gayar
Dubai
No-code/Low-code Machine Learning: automating machine learning training using low-code and Auto-ML. This will involve tools to automatically create ML pipelines and to train and test models. https://geekflare.com/no-code-machine-learning-platforms/ https://www.databricks.com/discover/pages/the-democratization-of-artificial-intelligence-and-deep-learning https://www.g2.com/articles/low-code-and-no-code-machine-learning-platforms

Multimodal Transformers for vision applications
Neamat El Gayar
Dubai
In the field of natural language processing, transformer models such as BERT and T5 are producing a lot of fruitful results. These models are built on the idea of self-supervised learning, where they are first pre-trained on a large amount of unlabelled data and then fine-tuned in a supervised fashion with a small amount of labelled data. Self-supervised learning methods have solved many of the problems around unlabelled data, and their use in fields like computer vision and natural language processing has shown many great results. The recent success of Transformers in the language domain has motivated adapting them to multimodal settings (images, audio, video). - Multimodal transformer survey: https://arxiv.org/pdf/2206.06488.pdf - Vision transformer survey: https://arxiv.org/pdf/2012.12556.pdf Possible applications: - Collaboration with a research institute in Abu Dhabi (satellite images from drones using colour maps and thermal maps) - Emotion prediction (text, audio, video) in an educational setting or for monitoring medical patients. Other applications are also possible. See this article for applications combining language: https://theaisummer.com/vision-language-models/

Large Language models /Generative AI and Text analytics applications
Neamat El Gayar
Dubai
NLP/text analytics applications include the analysis of text sources like emails, chats, customer feedback, medical records, and scientific literature. Many application domains, such as telecommunications, healthcare, retail, travel and hospitality, and financial institutions, can benefit from this. This project falls in that domain. Industry collaboration could be sought.

Machine Learning Applications
Neamat El Gayar
Dubai
Machine Learning applications have found great success in domains like Emotion AI, human-computer interaction, telecommunications, healthcare, retail, travel and hospitality, and financial institutions. Of special interest to me are applications in the fields of education, healthcare, sustainability, transportation, and wellbeing. All of these depend on having or collecting personalized user data and processing those data to derive insights and provide predictions useful as suggestions or actions for the future. Find your field of interest, find your data, and get started! Many industry-based applications can also be derived from research questions and use cases that are of interest to stakeholders, using approaches from machine learning and data analytics.

Customer Analytics
Neamat El Gayar
Dubai
Examine problems in customer behaviour, preferences, and profiling to help in retail and sales. Techniques like association rule mining, visualization, clustering, and classification can be applied. Sources of text such as customer reviews and customer support chats can also be used for added benefit.
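Association rule mining, one of the techniques mentioned, can be illustrated in miniature: a rule A -> B holds when the pair is frequent (support) and B usually accompanies A (confidence). The sketch below is a toy version of Apriori restricted to item pairs; the thresholds and basket data are illustrative assumptions.

```python
from itertools import combinations

def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Mine simple one-to-one association rules (A -> B) from transactions.
    support(A,B) = fraction of baskets containing both items;
    confidence(A->B) = support(A,B) / support(A)."""
    n = len(baskets)
    item_count, pair_count = {}, {}
    for basket in baskets:
        items = set(basket)
        for it in items:
            item_count[it] = item_count.get(it, 0) + 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), c in pair_count.items():
        if c / n < min_support:
            continue  # pair is not frequent enough
        for lhs, rhs in ((a, b), (b, a)):
            conf = c / item_count[lhs]
            if conf >= min_confidence:
                rules.append((lhs, rhs, round(c / n, 2), round(conf, 2)))
    return rules
```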

ML and data analytics for sustainability
Neamat El Gayar
Dubai
Applications for sustainability that support saving energy, improving lifestyle, and encouraging sustainable practices. This theme goes well with COP28, with the possibility to join exhibitions/competitions. https://www.cop28.com/thematic-program#:~:text=Each%20day%27s%20programming%20incorporates%20four,through%20both%20content%20and%20speakers.

Industry based project
Neamat El Gayar
Dubai
Students are welcome to bring their own ideas or industry-based use cases. Those can be based on solving problems from their current employer or from personal or career interest. Please research the area of interest, develop a problem statement and discuss with your supervisor to guide you further.

Explainability in AI and Deep Learning
Neamat El Gayar
Dubai
The objective is to compare and implement different explainability models on a sample image dataset. Interpretable AI, or Explainable Machine Learning (XML), usually refers to models that can be explained by a human or whose decisions can be interpreted. The main focus is usually on making the reasoning behind the decisions or predictions of the model more transparent. This is of particular importance for interpreting medical or security-related decisions made by machine learning models. XAI attempts to unravel the "black box" tendency of machine learning and to explain why a model arrived at a specific decision. Some resources: Dataset: https://www.kaggle.com/c/aptos2019-blindness-detection/data Readings: https://aclanthology.org/2021.eacl-demos.17/ https://arxiv.org/pdf/1910.10045.pdf
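One of the simplest explainability techniques for images is occlusion sensitivity: slide a blanked-out patch over the input and record how much the model's score drops; large drops mark regions the model relies on. The sketch below uses a plain 2D list and a dummy scoring function as stand-ins for a real image and trained network, which are assumptions for illustration.

```python
def occlusion_map(image, model, patch=2):
    """Occlusion sensitivity: for each patch position, zero out the patch and
    record the drop in the model's score as a per-pixel heat value."""
    h, w = len(image), len(image[0])
    base = model(image)
    heat = [[0.0] * w for _ in range(h)]
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = [row[:] for row in image]
            for di in range(i, min(i + patch, h)):
                for dj in range(j, min(j + patch, w)):
                    occluded[di][dj] = 0  # blank out the patch
            drop = base - model(occluded)
            for di in range(i, min(i + patch, h)):
                for dj in range(j, min(j + patch, w)):
                    heat[di][dj] = drop
    return heat
```

With a real network the same loop applies, with `model` returning the predicted class probability; library implementations (e.g. in Captum or tf-explain) batch the occlusions for speed.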

Deep learning approach to designing esthetically appealing products
Neamat El Gayar
Dubai
Category and Product Enrichment: one of our most important elements of discovery is the aesthetics of the product a customer is considering buying. Product description, colour, product dimensions, and image are some of the many attributes the customer experiences before proceeding to purchase. We would like to leverage ML applications like image recognition and text processing. Possible summer internship, deadline 15 July (competitive selection procedure).

Customer attribute prediction for gift recommendations
Neamat El Gayar
Dubai
Predicting Relationships and Ethnicities: a major part of the gifting philosophy is to understand the relationships between senders and recipients of a gift, as well as their ethnicities. Leveraging ML to derive these customer attributes will help us become more relevant to our customers. This project includes an internship opportunity in a start-up (deadline July 15th), based on a competitive process.

Smart Travel Planner
Neamat El Gayar
Dubai
Generating customized and real-time travel suggestions can still be very hard to achieve given the different preferences, goals, and constraints of travellers. This project leverages the latest AI technologies, including predictive analytics, large language models, and recommendation systems, to propose a smart travel planner for the city of Dubai. This system is expected to significantly improve the travel planning process by providing users with tailored, accurate, and timely information, thereby enhancing their overall travel experience. The “Smart Dubai Travel Planner” will also analyse real-time data on traffic, weather, and occupancy of tourism attractions to generate efficient, eco-friendly, and high-quality travel itineraries. Output: a human-like interaction system capable of understanding and responding to complex queries related to travel alternatives within Dubai. The system will provide tourists with the most up-to-date, reliable, and relevant information, ultimately enhancing their travel experience and optimizing tourist flow in the city of Dubai.

LLM for customer support
Neamat El Gayar
Dubai
A multi-lingual virtual assistant, using natural language processing to understand and respond to user queries.

Product visual matching for online shopping
Neamat El Gayar
Dubai
Visual Match: Leveraging image recognition technology, visual matching compares product images to find identical or similar items. This is particularly useful in the fashion and home decor sectors, where visual elements play a significant role in the purchasing decision. Scrape, match and report the differences in availability and offerings in terms of affordability, variety and quality. Possible collaboration with online startup business (e-commerce)

Experiments with theorem provers
Lilia Georgieva
Edinburgh
Principal goal of the project: SPASS, Vampire, and Otter are contemporary resolution-based theorem provers implementing sophisticated reasoning techniques and decision procedures for classes of first-order formulae and formulae in modal and description logics. For certain classes of formulae more than one refinement of resolution leads to a decision procedure. The aim of this project is to use theorem provers to test the potential and limitations of the systems when applied to such classes. The project would involve studying the properties of classes of formulae, their translation into the input language of the theorem provers, running the theorem provers, and evaluating the results. Prerequisites: Knowledge of first-order logic, interest in theorem proving; References: See http://spass.mpi-sb.mpg.de/spass http://www.cs.man.ac.uk/~riazanoa/Vampire

Haptic Interactions
haptics hci
Theodoros Georgiou
Edinburgh
This project is for students interested in investigating haptics as a mode of interaction with technology. This is an intentionally open project and the student can propose their project to Dr Georgiou for an initial discussion. For this project you will need to: research a topic, design a study to investigate your hypothesis or proposal, implement your software and possibly hardware prototype(s), finishing with a user evaluation. *Note: as this is a quite open project, you will need to discuss your idea with me before applying for this project. email me at t.georgiou@hw.ac.uk

IoT for assisted living
iot hci assisted living sensors prototype
Theodoros Georgiou
Edinburgh
IoT devices are becoming more and more popular. Small ubiquitous devices can have many uses in the modern smart home. In this project the student will have to research, design, and implement an IoT system to be used within assisted living environments for detecting and logging people's activities of daily living in lightweight and unobtrusive ways, while maintaining people's privacy. The student taking this project will need to have knowledge of, or a strong interest in, hardware prototyping with Arduinos, Raspberry Pis, small sensors, etc. *Note: as this is a quite open project, you will need to discuss your idea with me before applying for this project.

Investigate the acceptance of robots in the home environment.
robot robot privacy hri hci monitor
Theodoros Georgiou
Edinburgh
This project is for students interested in investigating the use of robots with the context of human robot interactions in the home environment. Specifically, you will need to investigate how people perceive elements of trust and privacy when being monitored by a robot versus other traditional monitoring devices - such as cctv. For this project you will need to: research a topic, design a study to investigate your hypothesis or proposal, implement your software prototype(s), evaluate with users. PLEASE NOTE: you must email me first to discuss this further before applying for this project. email me at t.georgiou@hw.ac.uk

Investigate wearable sensors for specialist application.
wearables sensors hci prototype
Theodoros Georgiou
Edinburgh
This project is for students interested in investigating the use wearables within the context of data capture for specialist applications ranging from activities of daily living, to participating in sports. The exact sensors and/or nature of these wearable sensors will be part of the initial discussion with the interested student. For this project you will need to: research a topic, design a study to investigate your hypothesis or proposal, implement your prototype(s), evaluate with users. *Note: The proposal has a basic idea of what will be required, however as this is going to be your project, you are advised to discuss with me before applying for this project. email me at t.georgiou@hw.ac.uk

Is the IoT secure?
iot wearables network security privacy security
Theodoros Georgiou
Edinburgh
For this project the student will need to be able to use a number of wireless sensors (Arduino/Raspberry Pi based) and investigate: how they can be made more secure; what security means in that context; and how security affects performance. The challenge and complexity of this project will depend on the student working on it and how deep they want to go in their investigation. As this is a very broad and open project, interested students will need to discuss it with Dr Georgiou as soon as possible: T.Georgiou@hw.ac.uk.

Serious games for rehabilitation.
games serious games hci hri health
Theodoros Georgiou
Edinburgh
For this project, you will design, implement, and test/evaluate a game with a purpose on rehabilitation equipment at the National Robotarium. Initial training will be provided, but the details of this project have been left purposefully open so you can come and discuss ideas and concepts that interest you specifically. We are looking for motivated students interested in making games within the various constraints of commercially available hardware and devices. PLEASE NOTE: you must email me first to discuss this further before applying for this project. The general idea here is to experiment with, and implement, a novel gameplay idea on rehabilitation equipment (upper limb), based on research in the area. You would then evaluate this idea with users and reflect upon your design using this data. This project will be co-supervised with Dr Thomas Methven.

Soft Interactive Displays
hci etextiles sensors design
Theodoros Georgiou
Edinburgh
For this project, you will design, implement, and test/evaluate a soft wearable, holdable, or ambient display (to be discussed as part of the project) for communicating environmental data to a user. You will work closely with academics and/or students from the School of Textiles and Design at the Galashiels campus to design the soft display using soft materials, before implementing an appropriate interaction using microcontrollers (Arduino, Circuit Playground, etc.) and e-textile technologies. We are looking for motivated students interested in making physical objects and working collaboratively in multidisciplinary teams. PLEASE NOTE: you must email me first to discuss this further before applying for this project. The general idea here is to design and implement an interactive display of data based on research in the area. You would then evaluate this idea with users and reflect upon your design using this data.

Investigate technology needs from certain communities
Theodoros Georgiou
Edinburgh
Use HCI techniques to investigate technology needs from certain communities

Dynamic Data Sharding Strategies in Distributed SQL Databases
Jeevani Goonetillake
Edinburgh
This research delves into exploring dynamic sharding techniques within distributed SQL database systems. It focuses on investigating strategies that intelligently distribute data across multiple shards or partitions based on dynamic factors such as data access patterns, workload changes, and resource availability. By analysing the sharding configuration, this research mainly aims to explore the performance, scalability, and resource utilization of distributed SQL databases. The outcome of the research will be important for applications with varying data access requirements and evolving workloads, enabling efficient data management and retrieval while maintaining data integrity in complex distributed environments.
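A standard starting point for the re-sharding problem this proposal studies is consistent hashing: when a shard is added or removed, only the keys in the affected arc of the hash ring move, unlike naive `hash(key) % num_shards` placement where almost everything moves. The sketch below is illustrative; the shard names, virtual-node count, and hash choice are assumptions, and dynamic strategies would additionally weight the ring by observed load.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """A minimal consistent-hash ring for routing rows to shards.
    Each shard is placed at many "virtual node" points to even out load."""

    def __init__(self, shards, vnodes=64):
        self._ring = []  # sorted list of (hash point, shard)
        for shard in shards:
            self.add_shard(shard, vnodes)

    @staticmethod
    def _point(label):
        return int(hashlib.md5(label.encode()).hexdigest(), 16)

    def add_shard(self, shard, vnodes=64):
        for v in range(vnodes):
            bisect.insort(self._ring, (self._point(f"{shard}#{v}"), shard))

    def shard_for(self, key):
        # Route the key to the first ring point at or after its hash (wrapping).
        points = [p for p, _ in self._ring]
        idx = bisect.bisect(points, self._point(key)) % len(self._ring)
        return self._ring[idx][1]
```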

Security in Distributed SQL Databases: A Comprehensive Analysis
Jeevani Goonetillake
Edinburgh
This research is an in-depth exploration of the security challenges and solutions within distributed SQL database systems. It involves a thorough examination of various aspects, such as data encryption, access control, authentication mechanisms, and data leakage prevention, with the goal of comprehensively understanding and mitigating potential vulnerabilities. This research aims to provide valuable insights and recommendations for enhancing the protection of sensitive data stored in distributed SQL databases, catering to the growing demand for secure data management solutions in today's interconnected digital landscape. By conducting a comprehensive analysis, this research contributes to ensuring the confidentiality and integrity of data in distributed SQL environments.

Database meets Data Mining - An Analysis on the association between the two Domains
Jeevani Goonetillake
Edinburgh
This research is a collaborative investigation that seeks to bridge the gap between database management systems and data mining. The link between database management and data mining holds significant promise for applications in business intelligence, data analytics, and scientific research, ultimately empowering users to harness the full potential of their data resources.

Blockchain-Based Data Provenance and Auditing
Jeevani Goonetillake
Edinburgh
This research investigates the utilization of blockchain technology to establish transparent and immutable records of data's origin, history, and modifications, ensuring data integrity and trustworthiness. By employing blockchain as a decentralized and tamper-resistant ledger, this research aims to create an auditable trail for data, allowing stakeholders to trace its lineage and verify its authenticity throughout its lifecycle. This innovative approach holds the potential to enhance data transparency, reduce the risk of data manipulation or fraud, and provide a robust framework for compliance and auditing purposes in various domains, including supply chain management, healthcare, and finance.
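The tamper-evidence mechanism at the heart of such a ledger can be shown in miniature: each record's hash includes the previous record's hash, so any retroactive edit breaks every later link. This sketch deliberately omits consensus and distribution, which are the genuinely "blockchain" parts the research would address.

```python
import hashlib
import json

def _hash_entry(record, prev_hash):
    """Hash a record together with its predecessor's hash, chaining them."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

class ProvenanceLedger:
    """An append-only hash chain recording a data item's history."""

    def __init__(self):
        self.entries = []  # list of (record, hash)

    def append(self, record):
        prev = self.entries[-1][1] if self.entries else "genesis"
        self.entries.append((record, _hash_entry(record, prev)))

    def verify(self):
        """Recompute the chain; any edited record invalidates its stored hash."""
        prev = "genesis"
        for record, h in self.entries:
            if _hash_entry(record, prev) != h:
                return False
            prev = h
        return True
```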

Blockchain Integration with Distributed SQL for Immutable Data Storage
Jeevani Goonetillake
Edinburgh
This research explores the fusion of blockchain technology with distributed SQL databases to create a robust system for secure and immutable data storage. By combining the inherent security features of blockchain, such as decentralization and cryptographic hashing, with the scalability and query capabilities of distributed SQL databases, this research aims to provide a solution for organizations seeking to ensure the permanence and integrity of their data records. This approach can find applications in various sectors, including supply chain management, finance, healthcare, and more, where maintaining an unchangeable history of data is essential for transparency, trust, and compliance with regulatory requirements.

Collecting benchmarks for type incorrect programs for programming language X (X could be Java, Scala, Haskell, OCaML, etc.)
programming
Jurriaan Hage
Edinburgh
Typically, a student will focus on a particular programming language, which can be decided mutually. Some examples are Java, Scala, ML, OCaML, Haskell, and Idris; any statically typed language will do. The idea is then to construct and collect programs that in some measurable way cover (part of) the language, and which can then be used to experimentally check the quality of the type error diagnosis produced by compilers for the language. Programs can be collected manually, or students can invest in techniques to make collecting a benchmark of this kind a more automatic process, for example by mutating type-correct programs.
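The mutation idea can be sketched very simply: take a type-correct program and systematically swap each integer literal for a string literal, yielding candidate type-incorrect variants. Working on raw source text, as below, is a deliberate simplification for illustration; a serious tool would mutate a typed AST so that only well-formed, well-scoped mutations are produced, and would verify that each variant actually fails to type-check.

```python
import re

def mutate_int_to_string(program):
    """Generate candidate type-incorrect variants of a program by replacing
    each integer literal with a string literal, one at a time."""
    variants = []
    for m in re.finditer(r"\b\d+\b", program):
        mutated = program[:m.start()] + f'"{m.group()}"' + program[m.end():]
        variants.append(mutated)
    return variants
```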

Program plagiarism benchmarks for language X
Jurriaan Hage
Edinburgh
To enable, for example, machine learning based approaches to program plagiarism detection, we need sizable collections of plagiarism cases (alongside cases that are not plagiarism). The research involves devising ways of arriving at such sets with as little effort as possible. This project can be done for various technologies, depending on the student (typically Java, Python or Haskell, but others are of interest too).

Type Safe Python
Jurriaan Hage
Edinburgh
Look for libraries and tools that can be used to increase the quality, reliability and maintainability of Python code. Explain how they work. Develop a tutorial that takes a piece of plain Python code and makes it more reliable through the use of these tools and libraries.

Apply data mining techniques to analyze educational datasets, uncover patterns in student behavior
dubai students
Maheen Hasib
Dubai

Third Party Provider (TPP) Application for Open Banking
Idris Ibrahim
Edinburgh
Open Banking was introduced in 2019 as part of the Payment Services Directive (PSD2) regulations to give customers greater control over their financial data. A TPP is an authorised online service provider which interacts with a bank and, with the customer's consent, allows them to see all their banking products in one place; Money Dashboard is one example of a TPP. Currently, testing is incomplete because there is no TPP app in the lower environments (SIT, UAT and PPE*) with which to test the journey between a TPP and the Sainsbury's Bank Credit Card app. The solution is to create a TPP app (for Android) that redirects to the Sainsbury's Bank app in the lower environments if the app is installed, or to the Sainsbury's Bank website if it is not, where the user logs in and gives consent before being redirected back to the TPP app, which then displays Sainsbury's Bank Credit Card data. This app will include the use of Open Banking APIs, certificate signing and a user-friendly interface, and will allow better testing of this app-to-app journey. *SIT = System Integration Testing, UAT = User Acceptance Testing, PPE = Pre-Production Environment

The IP subnet simulator
subnetting practice simulator network configurations ip addressing user-friendly interface feedback correctness.
Idris Ibrahim
Edinburgh
The subnet simulator should allow users to practice subnetting, view network configurations, and test their understanding of IP addressing. It should provide a user-friendly interface and feedback on the correctness of subnetting choices.
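Python's standard ipaddress module already computes everything such a simulator needs in order to check a learner's answers; a minimal sketch (the function names and the report fields chosen are illustrative):

```python
import ipaddress

def subnet_report(cidr):
    """Compute the facts a subnetting exercise would ask a learner for."""
    net = ipaddress.ip_network(cidr, strict=False)  # allow host bits set
    return {
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        "netmask": str(net.netmask),
        "usable_hosts": max(net.num_addresses - 2, 0),
    }

def check_answer(cidr, field, answer):
    """Feedback on one field of the learner's answer: True if correct."""
    return subnet_report(cidr)[field] == answer

print(subnet_report("192.168.1.37/26"))
```

A user interface (web or desktop) would then generate random addresses and prefixes, pose the questions, and call check_answer to grade each response.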

Analysing network traffic
packet capture data storage traffic analysis visualisation
Idris Ibrahim
Edinburgh
Analysing network traffic from a LAN (Local Area Network), or WAN (e.g., internet) is a valuable project that can help you gain insights into network performance, security issues, and user behavior. Legal and Ethical Considerations: Ensure compliance with legal and privacy regulations when capturing and analysing network traffic data. Obtain necessary permissions and inform users about the monitoring. Handle sensitive data with care and implement security measures to protect it.

Student-Advisor Project Management Web Application
project progress tracking weekly meeting agenda meeting minutes task prioritisation upcoming task reminders file attachments comments and collaboration advisor access feedback and evaluation notification of task updates calendar integration.
Idris Ibrahim
Edinburgh
Creating a web project tracker for students is not a new idea, but it is a valuable one, and it can be a great learning project for those interested in web development. The Student-Advisor Project Management Web Application is a versatile and collaborative platform designed to streamline and enhance the project management process for students and their academic advisors. This web-based tool empowers students to efficiently organise, track, and manage their academic and personal projects while providing advisors with valuable insights into students' progress and needs.

Graduate Apprentice (GA) Record Tracker
graduate apprenticeship apprentice record web application work-based learning student tracking system academic supervision workplace mentorship project management personal information management mitigating circumstances user authentication project tracking data privacy user interface design collaborative tool database management notifications document upload communication platform reporting and analytics educational technology student support user experience (ux) academic record management higher education.
Idris Ibrahim
Edinburgh
The Graduate Apprentice (GA) Record Tracker is a web-based application designed to streamline and enhance the management of graduate apprenticeship programmes. It will serve as a central platform for storing and managing personal information, project details, mitigating circumstances, and more for graduate apprentices.

Multi-threading Processing
multi-threading parallel computing performance optimization concurrency multi-core processors computational efficiency thread synchronisation load balancing cpu resource utilisation scalability image processing data analysis graphics rendering.
Idris Ibrahim
Edinburgh
The "Multi-threading Processing" project within the realm of computer science delves into the effective utilisation of multi-threading technology to enhance the performance and speed of diverse computational tasks and applications. Multithreading, a programming technique enabling the simultaneous execution of multiple threads (smaller program units), harnesses the potential of multiple processor cores or CPU resources. At its core, this endeavor seeks to implement and refine multi-threaded solutions applicable across a spectrum of tasks and applications. This project specifically employs the Marvin Project, a Java Image Processing Framework, characterised by its pure Java architecture, cross-platform compatibility, and a rich set of functionalities encompassing image and video frame processing, multi-threaded image processing capabilities, seamless GUI integration, plug-in extensibility, and unit test automation, among other capabilities. Ultimately, the project aspires to optimise computational efficiency by capitalising on parallelism for a diverse range of computing tasks.
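As a hedged illustration in Python, rather than the Java/Marvin stack the proposal names, the row-parallel pattern described above might look like the sketch below. Note that in CPython the global interpreter lock limits speed-ups for pure-Python CPU-bound work, so this shows the structure of multi-threaded image processing rather than a guaranteed speed-up; the Java framework achieves true parallelism across cores.

```python
from concurrent.futures import ThreadPoolExecutor

def invert_row(row):
    """A stand-in per-row image operation: invert 8-bit pixel values."""
    return [255 - p for p in row]

def invert_image(image, workers=4):
    """Process rows concurrently; map preserves the row order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(invert_row, image))

image = [[0, 128, 255], [10, 20, 30]]
print(invert_image(image))  # -> [[255, 127, 0], [245, 235, 225]]
```

Dividing the image into independent rows or tiles is exactly the load-balancing decision the project would explore: finer partitions spread work more evenly but increase synchronisation overhead.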

Vehicular Ad Hoc Networks
wireless technology advancements vanets communication technologies iot integration real-time data sensors vehicular networks practical experiments simulations transportation challenges traffic flow optimization intelligent signaling fuel consumption reduction environmental benefits.
Idris Ibrahim
Edinburgh
Wireless Technology Advancements: Recent advancements in wireless communication technologies have opened up new possibilities for communication between vehicles and infrastructure. This project will delve into the latest wireless technologies and their potential applications in VANETs. IoT Integration: The Internet of Things offers a vast ecosystem of sensors and devices that can be integrated into VANETs to collect and share data in real-time. We will explore how these IoT elements can enhance the intelligence and efficiency of vehicular networks. Practical Applications: Through practical experiments and simulations, this project will demonstrate how VANETs can be used to address real-world transportation challenges. For example, by optimising traffic flow through intelligent signaling, waiting times can be reduced, leading to decreased fuel consumption and environmental benefits.

Real-time Video Processing
Idris Ibrahim
Edinburgh
The "Real-time Video Processing" project aims to provide an in-depth exploration of real-time video processing techniques using the Marvin Image Processing Framework. With the rapid growth of multimedia applications and the increasing demand for real-time video enhancements, this project offers an exciting opportunity to delve into the world of video processing. By connecting to a camera device and employing Marvin plug-ins, participants will gain hands-on experience in video filtering, object tracking, augmented reality, motion detection, and data analysis, ultimately showcasing the power of image processing in the context of dynamic video content. Marvin Image Processing Framework: Leveraging the capabilities of the Marvin Image Processing Framework, this project will guide participants through the process of connecting to a camera device and performing various real-time video processing tasks. Practical Applications: Through a series of hands-on exercises and demonstrations, participants will gain proficiency in real-time video filtering for enhancing visual quality, object tracking for surveillance or gaming applications, augmented reality for interactive experiences, motion detection for security systems, and video data analysis for extracting meaningful insights.

Image Filtering and Enhancement
Idris Ibrahim
Edinburgh

Object Detection and Recognition
Idris Ibrahim
Edinburgh

Autonomous Vehicles
Idris Ibrahim
Edinburgh

Traffic Sign Recognition
Idris Ibrahim
Edinburgh

Handwriting Recognition
handwriting recognition image processing machine learning natural language processing digitisation ocr (optical character recognition).
Idris Ibrahim
Edinburgh
The Handwriting Recognition project aims to develop a sophisticated system capable of recognising and converting handwritten text into digital text. This system will involve advanced techniques in image processing, machine learning, and natural language processing to accurately interpret and transcribe handwritten documents.

Sketch to Real using GAN (Generative Adversarial Network)
image-to-image translation sketch to real generative adversarial networks (gans) realistic image generation lifelike images adversarial training generator discriminator synthetic images realism evaluation dynamic interplay visual content generation.
Idris Ibrahim
Edinburgh
Sketch to Real using GAN is a form of image-to-image translation task where the objective is to generate realistic images from sketches or line drawings. Generative Adversarial Networks (GANs) have demonstrated significant success in this domain and are widely utilised for such purposes. The transformation of sketches into realistic images using GANs falls within the realm of image-to-image translation, with the aim of creating lifelike images from initial sketches or line drawings. GANs, celebrated for their effectiveness in this area, employ a distinctive adversarial training framework involving a generator and discriminator. The generator produces synthetic images, and the discriminator evaluates them for realism. This dynamic interplay facilitates the creation of high-quality, realistic images from simple sketches, exemplifying the power and versatility of GANs in tasks related to visual content generation.

Analysing Deep fake videos and images
digital forensics computer vision machine learning cybersecurity data science social media analysis image and video processing.
Idris Ibrahim
Edinburgh
Examining fake videos and images, commonly known as deep fakes, on social media has become a notable concern. Due to the widespread availability of the technology to a broad audience, there has been a noticeable increase in the creation and distribution of deep fake videos across various social media platforms. The term "deep fake" refers to manipulated digital media in which an individual's image or video is replaced with the likeness of another person. This trend presents a significant challenge to contemporary society, not only because of the rising prevalence of deep fakes but also because they are used to spread misleading information. Recent research highlights the broad circulation of deep fake content on social platforms, underlining the urgent need to develop effective detection methods to tackle this growing problem.

Network Simulator 3 by Example: Bridging the Gap in Networking Education
network simulators ns3 and ns2 teaching website industry trends transition dedicated platform enhanced capabilities f29nc - computer networks and communications f20mx - mobile communications and programming
Idris Ibrahim
Edinburgh
The primary objective of the "Network Simulator 3 by Example" initiative is to modernise and enhance the existing teaching website, currently based on Ns2, in order to align with current industry trends by transitioning to Ns3. This shift includes the creation of a dedicated Ns3 platform with improved capabilities tailored for F29NC - Computer Networks and Communications, as well as F20MX - Mobile Communications and Programming courses. The project emphasises meticulous planning, content development, and a carefully managed migration process to ensure a smooth transition for both students and instructors. The ultimate aim is to establish an invaluable educational platform that not only imparts Ns3 knowledge but also fosters collaborative learning through interactive features and community engagement.

Network Monitoring Tool
network management real-time monitoring traffic analysis bottleneck detection performance metrics alerting system packet capture wireshark network statistics security threat detection bandwidth utilisation packet loss data visualisation
Idris Ibrahim
Edinburgh
A network monitoring tool is a software application or system designed to oversee, analyse, and manage a computer network. It provides insights into the network's performance, identifies potential issues, and helps administrators maintain optimal functionality. This type of tool is essential for organisations to ensure the reliability, security, and efficiency of their networks.

Blind Vision
mobile application computer vision machine learning object detection audio feedback user interface navigation assistance obstacle warning system user testing accessibility standards privacy and security support resources usability testing.
Idris Ibrahim
Edinburgh
Blind Vision is a mobile application aimed at assisting visually impaired individuals in navigating their daily lives. The application utilises computer vision and machine learning to analyse the surroundings captured by the rear camera of a smartphone. By providing real-time audio feedback, turn-by-turn navigation, and warnings about obstacles, Blind Vision empowers users with enhanced independence and safety.

GUI Network Scripts’ Scenarios Generator
ns-2 gui scenario generator networking simulation graphical interface scripting network simulator user-friendly error handling community engagement research papers.
Idris Ibrahim
Edinburgh
The GUI Network Scripts’ Scenarios Generator aims to provide a user-friendly graphical interface for generating scenarios and scripts in NS-2 (Network Simulator 2). NS-2 is a widely used discrete event network simulator for research and educational purposes. This project seeks to simplify the process of creating and customizing simulation scenarios by offering a visual tool that abstracts the complexities of NS-2 script creation.

SDLC Navigator
sdlc software development decision support system agile waterfall scrum spiral project management best practices developer profiling software engineering project success sdlc models decision-making case studies
Idris Ibrahim
Edinburgh
SDLC Navigator: A Comprehensive Study and Decision Support System for Software Development Life Cycle Models In the rapidly evolving landscape of software development, choosing the most suitable SDLC model is crucial for project success. The proposed project aims to provide developers with a comprehensive understanding of various SDLC models, enabling them to make informed decisions tailored to their project requirements.

RoboVision - Object Identification with Robotics
ai programming skills robotics knowledge computer vision machine learning ros (robot operating system) hardware integration problem-solving skills testing and quality assurance.
Idris Ibrahim
Edinburgh
In an era increasingly shaped by automation and robotics, the RoboVision project aims to leverage the capabilities of robotics for real-world applications. The emphasis is on creating a system for object identification using a popular robotic platform. The integration of artificial intelligence and robotics will empower the robot to visually perceive and identify various objects in its environment.

Survey on SDN Controllers
sdn controllers network management functionality assessment performance analysis ecosystem integration security mechanisms
Idris Ibrahim
Edinburgh
Software-Defined Networking (SDN) has revolutionised the networking landscape by decoupling the control plane from the data plane, allowing for centralised network management. SDN controllers serve as the brain of SDN architectures, playing a crucial role in orchestrating network resources, optimising performance, and enabling programmability.

Optimising Autonomous Vehicle Coordination through VANET Technology
Idris Ibrahim
Edinburgh
The primary goal is to develop a robust VANET-based framework that enables seamless and efficient interaction between autonomous vehicles. This will address challenges related to real-time data exchange, situational awareness, and cooperative decision-making.

Developing a VANET-Based Safety System for Autonomous Vehicles
Idris Ibrahim
Edinburgh
This project focuses on designing a comprehensive safety system that leverages Vehicular Ad Hoc Networks (VANETs) to deliver early warnings to autonomous vehicles regarding potential hazards and changing road conditions.

Integrating VANETs for Enhanced Delivery Robot Operations
Idris Ibrahim
Edinburgh
This project seeks to investigate the application of Vehicular Ad Hoc Networks (VANETs) to optimise the operation of delivery robots by facilitating their interaction with autonomous vehicles. The goal is to improve delivery efficiency and navigation within smart city ecosystems.

Autonomous Object Recognition and Tracking using Robot Vision
Idris Ibrahim
Edinburgh
The project will focus on developing a vision system capable of identifying objects of interest (e.g., a ball, a person, or an obstacle) and ensuring that the robot can adjust its movement to follow or avoid the object as needed.

A Use Case Assistant
Andrew Ireland
Edinburgh
Build a UML Use Case modelling tool. The tool could be driven from graphical input, i.e. a user draws a use case diagram from which skeleton use case specifications are generated for the user to complete. Alternatively, the modelling could be purely textual with the use case diagram being generated automatically. Either way, consistency checking would be an important capability of the tool, i.e. ensuring consistency between specifications as well as the diagram.

Tool Support for Anticipating Accidents
Andrew Ireland
Edinburgh
STPA -- System Theoretic Process Analysis -- is a leading Hazard Analysis technique. Developed at MIT, STPA is used for analysing hazards within the context of safety-critical systems, e.g. transportation, medical and national infrastructure. By assisting in the identification of system-level hazards, STPA supports the development of control actions that can be used to prevent potential accidents.

Use Cases and Problem Frames
Andrew Ireland
Edinburgh
Problem frames and use case modelling are complementary approaches to requirements engineering. The aim of this project is to investigate and develop an approach that integrates both techniques.

Neuro-symbolic Requirements Engineering
Andrew Ireland
Edinburgh
Problem frames is a technique that uses common patterns occurring within problems to capture requirements within the context of software engineering. A feature of the problem frames approach is that it draws a distinction between system requirements (world view) and software requirements (software view). The consistency between the world and software views can then be verified mechanically. Note that inconsistencies at this level have historically led to catastrophic system failures. Consistency verification will typically rely upon assumptions about the 'world' in which the intended system will operate. Problem frames does not help with validating such assumptions. This is where LLMs might help, i.e., we propose to use LLMs to explain the validity (or otherwise) of a given set of assumptions. Ultimately, it will be the responsibility of a human engineer to decide whether they believe the LLM's explanations. But a tool that combines symbolic 'consistency verification' with neuro (i.e., LLM) explanations of 'assumption validation' could increase the productivity of an engineer. It would be a cool tool too!

Projects in cyber security
authentication cryptography cyber security network security
Mike Just
Edinburgh
I'm happy to supervise students who want to do a project in cyber security, particularly in areas of application design, authentication, cryptography, digital identity, network security or privacy.

Cryptography Guessing Game
cryptography game
Mike Just
Edinburgh
Your task is to build a game that tests users' abilities with cryptography. For different ciphers, a user will be presented with a ciphertext and asked to answer with the correct plaintext. You might provide a multiple-choice list of potential plaintexts (where at most one is correct) and/or other hints to the user. The user will gain points for finding each plaintext (possibly some points for partial finds), weighted by a difficulty level that might be based on factors such as the type of cipher or the difficulty of the multiple-choice selections. As a start, your game should allow one person to play on their own (against the computer); you can also look at variations where one or more users play against one another in competition. You might also investigate collaborative options for the game whereby two or more users work together to guess the correct plaintext.
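A possible scoring core for such a game, sketched in Python. The difficulty weights, the point formula, and the choice of Caesar as the first cipher are assumptions for illustration, not part of the brief:

```python
# Hypothetical difficulty weights per cipher type (an assumption).
DIFFICULTY = {"caesar": 1, "vigenere": 3, "substitution": 5}

def caesar_encrypt(plaintext, shift):
    """Encrypt with a Caesar shift, preserving case and non-letters."""
    out = []
    for ch in plaintext:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def score_guess(cipher, plaintext, guess):
    """Full points for an exact match; partial credit per correct character."""
    weight = DIFFICULTY[cipher]
    if guess == plaintext:
        return 10 * weight
    hits = sum(a == b for a, b in zip(guess, plaintext))
    return weight * hits // max(len(plaintext), 1)
```

A round of the game would pick a cipher and plaintext, show caesar_encrypt's output, and award score_guess points for the user's attempt; multiplayer variants just compare accumulated scores.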

Cracking some historical ciphertexts
cybersecurity cryptography
Mike Just
Edinburgh
There are plenty of tools for using historical ciphers to encrypt plaintext to ciphertext, but very few to do the reverse and cryptanalyse ciphertext. Your aim is to build an application that takes as input a ciphertext (and possibly some partial plaintext or other information, such as the language of the plaintext), optionally the name of the cipher used to encrypt it, and produces one or more possible plaintexts (without knowledge of the key). This project will be a useful exercise for understanding how different ciphers work and the approaches used to crack them. There are some interesting challenges related to automatically detecting whether you have recovered a valid plaintext.
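For the simplest historical cipher the whole pipeline fits in a few lines: enumerate keys, score each candidate for "Englishness", and return the best. The scoring heuristic below (common-word hits plus letter-frequency rank) is one crude choice among many, and is exactly the part that gets interesting for harder ciphers:

```python
from collections import Counter

COMMON_WORDS = {"the", "and", "of", "to", "a", "in", "is", "it"}
LETTER_RANK = "etaoinshrdlcumwfgypbvkjxqz"  # most to least common

def shift_text(text, k):
    """Decrypt a Caesar ciphertext by shifting letters back by k."""
    return "".join(
        chr((ord(c) - 97 - k) % 26 + 97) if c.isalpha() else c
        for c in text.lower()
    )

def english_score(candidate):
    """Crude fitness: common-word hits dominate, letter frequency breaks ties."""
    words = sum(w in COMMON_WORDS for w in candidate.split())
    counts = Counter(c for c in candidate if c.isalpha())
    freq = sum(counts[c] * (26 - i) for i, c in enumerate(LETTER_RANK))
    return words * 1000 + freq

def crack_caesar(ciphertext):
    """Try all 26 shifts and return the best-scoring candidate plaintext."""
    return max((shift_text(ciphertext, k) for k in range(26)),
               key=english_score)

print(crack_caesar("wkh txlfn eurzq ira"))  # -> the quick brown fox
</antml>```

For polyalphabetic or substitution ciphers the key space is too large to enumerate, so the same scoring function would instead guide a search (e.g. hill climbing over candidate keys), which is where the real cryptanalysis work in this project lies.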

Machine learning for cyber security (MEng masterclass)
machine learning cyber security anomaly detection
Mike Just
Edinburgh
There are many situations today in which cyber security decisions are made using machine learning. Most commonly it has been used for detecting anomalies, such as intrusion detection (of network traffic) and malware detection (of software), though there are other areas as well (see Section 4 of https://www.cybok.org/media/downloads/AI_for_Security_TG_v1.0.0.pdf). This is a proposal for an MEng masterclass, the output of which would involve the delivery of a lecture, and possibly a tutorial or lab, on how machine learning is used for cyber security, with a focus on one security application area.

Digging Deep with Digital Forensics
cyber security systems security digital forensics
Mike Just
Edinburgh
Digital forensics (DF) tools are used to recover data from physical devices such as hard drives and smartphones. By doing the recovery at a lower systems level, the tools can recover data that might not be visible through higher level applications. For example, data that might have been "deleted" by an application may still be recoverable (sometimes only partially) by a DF tool. Such tools are important for various applied areas, including criminal investigations. Thus, the recovery of such data may be relied upon as evidence to show that a crime occurred. With such potentially important consequences (e.g., whether or not someone goes to jail), the accuracy and consistency of the data recovery are important. However, research indicates that under certain situations different DF tools can produce different results. Your goal in this project is to test the reliability of a set of DF tools under a variety of different conditions (e.g., different data types, storage drives, file allocation methods, etc.) in order to demonstrate if and when two tools might produce inconsistent results. For this project, you'll get to learn how different systems manage data, and how different tools recover data from these different systems and present the results to the DF examiner. You'll also learn to construct a rigorous experiment and data recovery environment and to demonstrate an appropriate "chain of evidence" that would be acceptable in legal proceedings. A starting publication to learn about some of the challenges in this area: Horsman, G., 2019. Tool testing and reliability issues in the field of digital forensics. Digital Investigation, 28, pp.163-175.

Avoiding the dreaded password stuffing
authentication cryptography cyber security passwords
Mike Just
Edinburgh
Password (or more generally, credential) stuffing is a cyber attack in which an attacker takes a user's password that was compromised from one site to break into another account belonging to the same user. Such attacks are possible since users often re-use their passwords at multiple sites. One way to prevent such attacks is to check whether a password has been compromised. Such a service would take a password (and user ID) as input, and search for the password on various compromised password lists. If the password is present on a list, then it is known to be compromised and should no longer be used. The risk with such "password checking services" is that they need to be trusted, else they might themselves be a nefarious "password harvesting service" that is stealing user passwords. To avoid having to trust the checking service, rather than providing the full password "in the clear", other information can be provided, such as a partial hash of the password. Your task is to design and implement a password checking service, and test the service using a variety of methods and databases according to different metrics such as security, efficiency, etc. Some related papers here (the 2nd relates to Google's password checkup service): Li, L., Pal, B., Ali, J., Sullivan, N., Chatterjee, R. and Ristenpart, T., 2019, November. Protocols for checking compromised credentials. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (pp. 1387-1403). Thomas, K., Pullman, J., Yeo, K., Raghunathan, A., Kelley, P.G., Invernizzi, L., Benko, B., Pietraszek, T., Patel, S., Boneh, D. and Bursztein, E., 2019. Protecting accounts from credential stuffing with password breach alerting. In 28th USENIX Security Symposium (USENIX Security 19) (pp. 1556-1571).
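The partial-hash idea can be sketched in a few lines of Python. This mirrors the k-anonymity range-query style of protocol discussed in the literature above: the client reveals only a short hash prefix, the server returns every breached-hash suffix under that prefix, and the comparison happens client-side. The prefix length, the toy in-memory database, and the function names are all illustrative assumptions:

```python
import hashlib

# A toy stand-in for the service's database of breached-password hashes.
BREACHED = {hashlib.sha1(p.encode()).hexdigest().upper()
            for p in ["password", "123456", "letmein"]}

def client_prefix(password, n=5):
    """The client reveals only the first n hex digits of the hash."""
    digest = hashlib.sha1(password.encode()).hexdigest().upper()
    return digest[:n], digest[n:]

def server_suffixes(prefix):
    """The server returns every breached-hash suffix under that prefix,
    learning only the prefix, never the full hash or password."""
    return {h[len(prefix):] for h in BREACHED if h.startswith(prefix)}

def is_compromised(password):
    prefix, suffix = client_prefix(password)
    return suffix in server_suffixes(prefix)

print(is_compromised("password"), is_compromised("correct horse"))
```

The project's evaluation would then vary the prefix length (shorter prefixes leak less but return larger response sets), the hash function, and the database size, measuring the security/efficiency trade-offs the proposal mentions.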

A visualiser for reductions in the lambda calculus
Fairouz Kamareddine
Edinburgh
The lambda calculus is an idealised programming language. Reductions in the lambda calculus allow us to study evaluation strategies in programming languages. This project is to visualise reductions (ideally using Python), and to assess and compare different strategies and the trade-offs between termination and efficiency. The visualised reductions will be animated graphs that almost speak to the user; some of these graphs will be impressive. You can demonstrate the usefulness of what you do either for educational purposes or for measuring the efficiency of different programs.
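Before any visualisation, the core of such a tool is a stepper that produces the sequence of terms to animate. A minimal normal-order sketch in Python, using tuples for terms and assuming all bound names are distinct so that capture-avoiding substitution can be skipped (a real tool would add alpha-renaming):

```python
def subst(term, name, value):
    """Substitute value for free occurrences of name (assumes no capture:
    bound names in the input are all distinct)."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "lam":
        if term[1] == name:
            return term
        return ("lam", term[1], subst(term[2], name, value))
    return ("app", subst(term[1], name, value), subst(term[2], name, value))

def step(term):
    """One normal-order (leftmost-outermost) beta step, or None at normal form."""
    kind = term[0]
    if kind == "app":
        f, a = term[1], term[2]
        if f[0] == "lam":                     # the leftmost-outermost redex
            return subst(f[2], f[1], a)
        fs = step(f)
        if fs is not None:
            return ("app", fs, a)
        as_ = step(a)
        return None if as_ is None else ("app", f, as_)
    if kind == "lam":
        bs = step(term[2])
        return None if bs is None else ("lam", term[1], bs)
    return None

def reduce_trace(term):
    """The sequence of terms a visualiser would animate."""
    trace = [term]
    while (nxt := step(trace[-1])) is not None:
        trace.append(nxt)
    return trace

identity = ("lam", "x", ("var", "x"))
term = ("app", identity, ("app", identity, ("var", "y")))
print(reduce_trace(term))
```

Swapping step for a call-by-value or call-by-name variant, and rendering each element of the trace as a graph, gives exactly the strategy comparison the project asks for.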

Analysing Mathematical texts in MathLang
Fairouz Kamareddine
Edinburgh
Readapt MathLang in Python and use the adaptation to analyse mathematical texts.

Associating programming languages with tasks
Fairouz Kamareddine
Edinburgh
Different programming languages have different features. This project is to assess when you need an object-oriented language and when you need a parallel language, and to build a tool that helps with this choice. A number of tools have already been built, and this project is to extend them with new features.

Checking the correctness of a mathematical book in Lean/Coq/Isabelle
Fairouz Kamareddine
Edinburgh
Computer checking of the correctness of entire books of mathematics has moved on since the ideas were first introduced independently by de Bruijn in his Automath system and Trybulec in his Mizar system. Since then, computer systems have been created to check the correctness not only of mathematical books but of software and information in general. In this project you will investigate the checking of a mathematical book in a choice of three computer checkers: Isabelle, Coq or Lean. You will first need to familiarise yourself with one or all of these systems after you have done the background research on them; then you will need to investigate initial proofs of a couple of results you have studied in one of your favourite courses (e.g., your course on discrete mathematics, logic and proof, lambda calculus or Turing machines), and then decide whether you want to do all the work in one prover or to compare the three provers in question.
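To give a flavour of what a first check looks like, here is a small Lean 4 proof of a fact from a discrete mathematics course, done by induction. This is only a sketch of the style; a book-length formalisation would build on a library such as Mathlib, and the Isabelle and Coq versions would look broadly similar.

```lean
-- 0 is a left identity for addition on the naturals,
-- proved by induction on n (the right-identity case holds by rfl).
theorem zero_add' (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```

Scaling this style of proof from one lemma to a whole chapter, and keeping the formal statements faithful to the book's prose, is the real work of the project.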

Investigating the Languages for Mathematics
Fairouz Kamareddine
Edinburgh
Languages of mathematics have played an important role in the creation of computability theory and in new areas such as the design and verification of programming languages. There is still no ideal language in which to write mathematics. This project investigates a number of attempts at finding a good language in which to write mathematics, and at computerising and checking the correctness of mathematics.

Investigations (theory and implementation) of a new computation language
Fairouz Kamareddine
Edinburgh
The famous BNF notation, as introduced and used in the Algol 60 report, was followed by numerous notational variants (EBNF, ABNF, RBNF, etc.). Subsequently, a new Computer Science Metanotation "CSM" was developed. But CSM does not yet have a full definition or a full implementation, and producing one may be difficult. This project will investigate variants of CSM, extracting a full definition and giving a full implementation.

Proving correctness of Turing machine specifications
Fairouz Kamareddine
Edinburgh
You have learned how to write and implement Turing machines to solve certain tasks. But how do you ensure that a machine is correct and meets its specification? How do you show the properties of your Turing machine (e.g., whether it terminates)? This project is to implement the language of Turing machines in Lean (or your choice of prover) and to check the properties of different Turing machines as well as the universal machine.
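As a concrete starting point, the object to be formalised is just a transition function together with a run relation. Here is a Python sketch of the semantics one would then mirror in the prover; the encoding (a partial transition table, moves as +/-1, missing entries meaning "halt") and the example machine are illustrative choices:

```python
def run_tm(delta, tape, state="q0", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine.
    delta maps (state, symbol) -> (new_state, written_symbol, move),
    with move in {-1, +1}; a missing entry means 'halt'."""
    tape = dict(enumerate(tape))          # sparse tape, blank elsewhere
    head = 0
    for _ in range(max_steps):
        sym = tape.get(head, blank)
        if (state, sym) not in delta:
            break                          # no rule: the machine halts
        state, tape[head], move = delta[(state, sym)]
        head += move
    cells = [tape[i] for i in sorted(tape)]
    return state, "".join(cells).strip(blank)

# A machine that flips every bit, then halts at the first blank.
flip = {
    ("q0", "0"): ("q0", "1", +1),
    ("q0", "1"): ("q0", "0", +1),
}
print(run_tm(flip, "1011"))  # -> ('q0', '0100')
```

In the prover, run_tm becomes an inductively defined step relation, and "the machine meets its specification" becomes a theorem such as: for every input word w, the machine halts with the bitwise complement of w on the tape.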

Semantic annotations of Foundations course material
latex semantically annotated text parser python
Fairouz Kamareddine
Edinburgh
Semantically annotated mathematical documents provide several advantages over unannotated documents. In LaTeX, a package for adding semantic annotations called sTeX can be used to semantically annotate documents. This project would involve using sTeX to create semantic annotations for the Foundations course material (slides, tutorial sheets, etc.). Semantic annotating of documents provides plenty of opportunity for automation, so part of the project would include interaction with existing automation approaches to provide annotations for the Foundations course material. This project would require some experience with LaTeX, Python, and parsing.

Network Intrusion Detection System using Netflow Data and Machine Learning
nids network intrusion detection machine learning
Kayvan Karim
Dubai
The project applies machine learning algorithms to analyse attack patterns within Netflow data, distinguishing benign network behaviour from potential cyber threats. The Netflow data can be aggregated using the NTFA tool, and machine learning models then built on the aggregated data to detect attacks.

Retrieval Augmented Generation (RAG) using LLM
llm vector db
Kayvan Karim
Dubai
For this project, the student must use an LLM together with a vector database such as Pinecone or ChromaDB to create an agent that responds to user inquiries based on the provided document(s) using Retrieval Augmented Generation.
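A minimal sketch of the retrieval-then-prompt step, assuming a toy in-memory "vector store" with bag-of-words vectors in place of real embeddings (Pinecone/ChromaDB and an actual LLM call would replace these pieces):

```python
import math
import re
from collections import Counter

docs = [
    "The library opens at 9am and closes at 6pm on weekdays.",
    "Overdue books incur a fine of 50 pence per day.",
    "The cafeteria serves lunch between noon and 2pm.",
]

def embed(text):
    # stand-in for a real embedding model
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

store = [(embed(d), d) for d in docs]          # the toy "vector DB"

def retrieve(query, k=1):
    q = embed(query)
    return [d for _, d in sorted(store, key=lambda e: -cosine(q, e[0]))[:k]]

def build_prompt(query):
    # the string an LLM would receive in the generation step
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the fine for overdue books?"))
```

The same two-step shape (retrieve relevant chunks, then condition generation on them) carries over directly once real embeddings and an LLM are plugged in.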

Explainable AI for Verification
Ekaterina Komendantskaya
Edinburgh
As machine learning algorithms find their way into safety-critical systems, such as autonomous cars, robot nurses, and conversational agents, the question of ensuring their safety and security becomes important. It is often difficult to explain and interpret machine learning models, yet this seems to be the key to their safe and secure implementation and application. Explainability of AI is a growing subject area with many applications, and several tools are already available, such as LIME. These tools are gradually finding their place in AI verification. You will study applications of explainable AI methods in the area of verification, and will create your own prototype tools that support these methods. You will have a chance to collaborate with researchers in the lab for AI and Verification: LAIV.uk.
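The core idea behind perturbation-based explainers such as LIME can be sketched in a few lines; the "model" below is an illustrative stand-in, and this shows only the spirit of the technique, not the LIME API:

```python
def model(x):
    # stand-in black box: depends strongly on x[0], not at all on x[1]
    return 3.0 * x[0] + 0.0 * x[1] + 0.5 * x[2]

def importance(x):
    """Estimate each feature's importance by how much the output
    moves when that feature is knocked out."""
    base = model(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = 0.0
        scores.append(abs(base - model(perturbed)))
    return scores

print(importance([1.0, 1.0, 1.0]))  # [3.0, 0.0, 0.5]
```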

Probabilistic Verification of AI
Ekaterina Komendantskaya
Edinburgh
As machine learning algorithms find their way into safety-critical systems, such as autonomous cars, robot nurses, and conversational agents, the question of ensuring their safety and security becomes important. At the same time, neural networks are known to be vulnerable to adversarial attacks --- a special kind of crafted inputs that cause unintended behaviour in trained neural networks. Due to these two factors, neural network verification has become a hot topic in both the machine learning and verification communities. It is often described as one of the main challenges faced by computer science and engineering these days. Often, we cannot verify a property with certainty, but can verify it with some degree of probability. There are languages and tools for probabilistic verification, for example, PRISM or Probabilistic Prolog. You will study this area and implement your own toy examples using these methods. You will have a chance to collaborate with researchers in the lab for AI and Verification: LAIV.uk.
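As a toy example of the kind of query such tools answer, the probability of reaching a goal state in a small Markov chain can be computed by value iteration (PRISM answers this symbolically and at scale); the chain below is illustrative:

```python
# States 0 and 3 are absorbing; from states 1 and 2 the walker moves
# up or down with probability 0.5 each.
P = {1: [(0.5, 0), (0.5, 2)],
     2: [(0.5, 1), (0.5, 3)]}

def reach_prob(goal=3, iters=200):
    """Iterate x[s] = sum_t p(s,t) * x[t] until convergence."""
    x = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}
    x[goal] = 1.0
    for _ in range(iters):
        for s, succs in P.items():
            x[s] = sum(p * x[t] for p, t in succs)
    return x

probs = reach_prob()
print(round(probs[1], 4), round(probs[2], 4))  # 0.3333 0.6667
```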

Verification of Neural Networks
Ekaterina Komendantskaya
Edinburgh
As machine learning algorithms find their way into safety-critical systems, such as autonomous cars, robot nurses, and conversational agents, the question of ensuring their safety and security becomes important. At the same time, neural networks are known to be vulnerable to adversarial attacks --- a special kind of crafted inputs that cause unintended behaviour in trained neural networks. Due to these two factors, neural network verification has become a hot topic in both the machine learning and verification communities. It is often described as one of the main challenges faced by computer science and engineering these days. In this project, you will study the existing methods of neural network verification, and will implement your own toy application/algorithm. You will have a chance to collaborate with researchers in the lab for AI and Verification: LAIV.uk.
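One family of verification methods can be sketched directly: interval bound propagation (IBP) pushes an input box through a layer to obtain guaranteed output bounds for every input in the box. The weights and bounds below are illustrative:

```python
W = [[1.0, -1.0],
     [2.0,  1.0]]
b = [0.0, 0.0]
lo, hi = [0.0, 0.0], [1.0, 1.0]   # input box: x0, x1 in [0, 1]

def ibp_layer(W, b, lo, hi):
    """Sound output bounds for a linear layer followed by ReLU."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = bias + sum(min(w * a, w * c) for w, a, c in zip(row, lo, hi))
        h = bias + sum(max(w * a, w * c) for w, a, c in zip(row, lo, hi))
        # ReLU maps the interval [l, h] to [max(l, 0), max(h, 0)]
        out_lo.append(max(l, 0.0))
        out_hi.append(max(h, 0.0))
    return out_lo, out_hi

print(ibp_layer(W, b, lo, hi))  # ([0.0, 0.0], [1.0, 3.0])
```

If the certified bounds never cross a decision boundary, no adversarial input inside the box can flip the prediction; that is exactly the style of robustness guarantee real verifiers scale up.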

NLP - Text Summarisation using Large Language Models
machine learning neural networks large language models natural language processing (nlp)
Ioannis Konstas
Edinburgh
You have most likely already used ChatGPT a few times (or a lot!) Have you ever wondered what it actually takes to build a system based on a Large Language Model (LLM) and evaluate it on a real-world task? In this series of projects (check the rest as well!) we will explore the task of text summarisation. This involves taking a long textual input (a news article, notes, or minutes from an interaction such as a meeting or a dialogue) and converting it into a shorter, concise document or set of bullet points. The important thing is to make sure that the output summary faithfully corresponds to the input while mentioning the most salient/noteworthy points. The idea is to explore several popular techniques for fine-tuning an open-source LLM (e.g., Llama 2), starting from the simpler ones (prompt engineering) all the way up to Parameter Efficient Fine-Tuning (PEFT). We will use standard benchmark datasets and SOTA frameworks to evaluate (and potentially train) our models. We can co-develop the project to place more emphasis on the features (conciseness, faithfulness, salience), training, data annotation, or human evaluation.
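As a point of comparison, a frequency-based extractive baseline (the kind an LLM summariser would be evaluated against) fits in a few lines; the stop-word list and example text are illustrative:

```python
import re
from collections import Counter

STOP = {"the", "a", "to", "will", "some"}

def summarise(text, n=1):
    """Keep the n sentences whose (non-stop-word) tokens are most
    frequent across the document: a crude notion of salience."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP]
    freq = Counter(words)

    def score(s):
        toks = [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOP]
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    return sorted(sentences, key=score, reverse=True)[:n]

text = ("The council approved the new cycle lanes. "
        "The new cycle lanes will link the station to the university. "
        "Some residents opposed the plan.")
print(summarise(text))
```

An abstractive LLM summary would then be judged against such baselines on conciseness, faithfulness, and salience.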

NLP - Self-correcting Language Modelling using Large Language Models
machine learning neural networks large language models natural language processing (nlp)
Ioannis Konstas
Edinburgh
You have most likely already used ChatGPT a few times (or a lot!) Have you ever wondered what it actually takes to build a system based on a Large Language Model (LLM) and evaluate it on a real-world task? In this series of projects (check the rest as well!) we will explore the task of self-critique: providing feedback in natural language to improve performance on a downstream task. Take the following example (from [1]): User: "I am interested in playing table tennis." Response: "I'm sure it's a great way to socialize, stay active." Feedback: "Engaging: provides no information about table tennis or how to play it. User understanding: lacks understanding of the user's needs and state of mind." Response (refined): "That's great to hear (...)! It's a fun sport requiring quick reflexes and good hand-eye coordination. Have you played before, or are you looking to learn?" [1] Madaan et al. SELF-REFINE: Iterative Refinement with Self-Feedback. 2023. arXiv. This notion of using language to update a model has recently received a lot of interest, which opens up many interesting avenues to pursue: - Come up with a generic style of self-feedback that works for many different tasks (e.g., dialogue response generation, code generation, question answering, error correction, etc.). - Focus on one task and create/collect a dataset of feedback with desired measurable properties that can be evaluated. In other words, we will attempt to evaluate the feedback itself rather than just the refined responses. - (More challenging) Use self-feedback to train a reward model using reinforcement learning (https://github.com/huggingface/trl). Once we decide on the particular flavour of the task we are interested in, we can explore several popular techniques for fine-tuning an open-source LLM (e.g., Llama 2), starting from the simpler ones (prompt engineering) all the way up to Parameter Efficient Fine-Tuning (PEFT).
We will use standard benchmark datasets and SOTA frameworks to evaluate (and potentially train) our models. We can co-develop the project to place more emphasis on the style of feedback, training, data annotation, or human evaluation.
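The refinement loop itself is simple to sketch; here `generate`, `critique`, and `refine` are hypothetical stubs standing in for LLM prompts:

```python
def generate(task):
    return "It's fun."

def critique(task, response):
    # An LLM would produce natural-language feedback; the stub flags
    # responses that are too short to be engaging.
    if len(response.split()) < 5:
        return "Too short: add a concrete detail and a follow-up question."
    return ""  # empty feedback means good enough: stop refining

def refine(task, response, feedback):
    return response + " It rewards quick reflexes. Have you played before?"

def self_refine(task, max_iters=3):
    """Generate, then alternate critique and refinement until the
    critique is empty or the iteration budget runs out."""
    response = generate(task)
    for _ in range(max_iters):
        feedback = critique(task, response)
        if not feedback:
            break
        response = refine(task, response, feedback)
    return response

print(self_refine("Tell me about table tennis."))
```

Evaluating the feedback itself, as the project proposes, amounts to measuring properties of what `critique` returns rather than only of the final response.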

NLP - Question Answering using Large Language Models
machine learning neural networks large language models natural language processing (nlp)
Ioannis Konstas
Edinburgh
You have most likely already used ChatGPT a few times (or a lot!) Have you ever wondered what it actually takes to build a system based on a Large Language Model (LLM) and evaluate it on a real-world task? In this series of projects (check the rest as well!) we will explore the task of question answering (QA). QA involves trying to automatically answer a user query, usually "grounded" on one or more passages (what we refer to as open-book QA) or relying only on the acquired knowledge of the model (closed-book QA). The tricky part (especially with the recent advances in LLMs) is to ensure that the output answer is faithful to the user query (and the input passage, if one exists) and does not hallucinate extra pieces of irrelevant or wrong information. There are many interesting aspects of the field of Question Answering that we could potentially explore: - Choose between open-book and closed-book QA. The former involves the component of retrieving the relevant information; this could be either via a search engine (e.g., via a Google search query) or a retrieval engine. This usually depends on the breadth of the domain we choose to focus on (generic knowledge QA vs. closed-domain questions based, for example, only on a scrape of a single website). - Experiment with different styles of output: provide answers that are less or more verbose, or that exhibit particular linguistic or other rhetorical phenomena (such as repeating part of the question, providing a definition first, giving an explanation, etc.). - Explore more challenging queries that might require some sort of decomposition reasoning, e.g., complex questions: "How old was Linus Torvalds when Linux 1.0 was released?", "What is the difference between jam and marmalade?" - Experiment with multi-turn questions, resembling a natural conversation.
Once we decide on the particular flavour of QA task we are interested in, we can explore several popular techniques for fine-tuning an open-source LLM (e.g., Llama 2), starting from the simpler ones (prompt engineering) all the way up to Parameter Efficient Fine-Tuning (PEFT). We will use standard benchmark datasets and SOTA frameworks to evaluate (and potentially train) our models. We can co-develop the project to place more emphasis on the features (faithfulness, style of output, complex questions, conversational QA), training, data annotation, or human evaluation.
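The decomposition idea behind complex questions can be sketched with a toy fact table standing in for retrieval; the helper and fact names are illustrative, and the age is computed by year difference only (ignoring month and day):

```python
FACTS = {
    "Linus Torvalds birth year": 1969,
    "Linux 1.0 release year": 1994,
}

def answer_age_question(person, event):
    """Decompose 'How old was X when Y?' into two lookups
    and a subtraction."""
    birth = FACTS[f"{person} birth year"]
    year = FACTS[f"{event} year"]
    return year - birth

print(answer_age_question("Linus Torvalds", "Linux 1.0 release"))  # 25
```

A real system would replace the fact table with retrieval (open-book) or the model's own knowledge (closed-book), and the hard part is getting the model to perform this decomposition reliably.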

Low-cost robotic arm manipulation
robotic arm machine learning robotics
Ioannis Konstas
Edinburgh
Would you fancy playing with a 3D-printed, low-cost, 6 DoF robotic follower arm? We have built a small follower robotic arm with a simple actuator and its leader counterpart, which can be used to teleoperate and/or train the follower to perform manipulation tasks [1]. Its control radius and payload are quite small, but its fairly low cost means that you can experiment quite a bit :) The topic is left quite open-ended as we are open to suggestions for topics ranging from more straightforward teleoperation-based training using models such as ACT [2] based on modern Python frameworks such as lerobot by Huggingface [3], to more Human-Robot Interaction projects such as teleoperating from a distance and researching the effect of latency on human performance of certain tasks, or even teaching the robot to play 'Where is Waldo' [5]! Finally, if you are up for it, you can also experiment with training with its simulated digital twin, first existing in MuJoCo [4] and then deploying it to the real robot. [1] https://github.com/AlexanderKoch-Koch/low_cost_robot [2] https://tonyzhaozh.github.io/aloha/ [3] https://github.com/huggingface/lerobot [4] https://mujoco.org/ [5] https://www.youtube.com/watch?v=-i7HMPpxB-Y

AI & Software Engineering
Smitha Kumar
Dubai
Spectrum-Based Fault Localization (SBFL) aims to identify and rank the program code elements that have caused test cases to fail. There are multiple approaches to SBFL; this project investigates the effectiveness of each and also develops a tool. A dataset is available for this project.
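The core of SBFL is a suspiciousness formula over a coverage matrix; here is a toy sketch using the Ochiai metric (the matrix, test outcomes, and fault location below are illustrative):

```python
import math

# coverage[s][t] = 1 if statement s was executed by test t
coverage = {
    "s1": [1, 1, 1, 1],
    "s2": [1, 0, 1, 0],
    "s3": [0, 0, 1, 1],   # the faulty statement in this toy example
}
passed = [True, True, False, False]   # tests 3 and 4 fail

def ochiai(cov):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    total_failed = sum(1 for p in passed if not p)
    ef = sum(1 for c, p in zip(cov, passed) if c and not p)
    ep = sum(1 for c, p in zip(cov, passed) if c and p)
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0

ranking = sorted(coverage, key=lambda s: ochiai(coverage[s]), reverse=True)
print(ranking)  # ['s3', 's1', 's2']
```

Other SBFL formulas (Tarantula, DStar, etc.) differ only in how ef, ep, and the totals are combined, which is what makes a comparative study tractable.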

Data Mining Projects
Smitha Kumar
Dubai
Projects in this area would include data mining techniques applied to the field of education, bioinformatics, social media data, or any application area of your choice.

Facial Expression Recognition using CNN
Smitha Kumar
Dubai
A mobile application that captures an image, analyses the facial expression, and identifies the emotion, using a CNN (Convolutional Neural Network). https://www.sciencedirect.com/science/article/pii/S0031320316301753

Machine Learning Applications
Smitha Kumar
Dubai
Applications of Machine Learning in healthcare/education/speaker recognition/food image recognition

Machine learning based Bug Triaging
Smitha Kumar
Dubai
This project is to use efficient machine learning techniques to perform bug triaging. Bug triaging assigns a reported bug to the right developer, severity, or category.

Sentiment Analysis in software engineering
Smitha Kumar
Dubai
This project is to identify developer emotions in a collaborative environment and/or to analyse existing sentiment analysis tools.

IoT based Aquaculture monitoring system
hardware (sensors) arduino programming mobile application development
Rosalind Deena Kumari
Malaysia
Fish farming is a tedious process which requires continuous monitoring. The system developed here will monitor the conditions of the water (water level, pH, etc.) using sensors. This information is stored in the cloud using a microcontroller. The information is monitored so that any change from ideal conditions will alert the farm administrator and action can be taken. Knowledge of hardware, sensors, and programming of the microcontroller is needed. A web app is built to monitor and control the system.
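The reading-to-alert logic might be sketched as follows; the calibration (a linear 0-5 V to 0-14 pH mapping) and the safe pH bounds are assumptions for illustration, not values from a real probe datasheet:

```python
ADC_MAX, V_REF = 1023, 5.0   # 10-bit ADC, 5 V reference (typical Arduino)

def adc_to_ph(adc):
    """Convert a raw ADC reading to pH via an assumed linear calibration."""
    voltage = adc / ADC_MAX * V_REF
    return voltage * 14.0 / V_REF

def needs_alert(ph, low=6.5, high=8.5):
    """Illustrative safe bounds; real bounds depend on the species farmed."""
    return ph < low or ph > high

ph = adc_to_ph(700)
print(round(ph, 2), needs_alert(ph))  # 9.58 True
```

On the device the same check would run on the microcontroller, with out-of-range readings pushed to the cloud store that the web app reads.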

Smart Irrigation system - A Web based application
Rosalind Deena Kumari
Malaysia
This project involves the development of a smart irrigation system using sensors. The amount of water supplied to the agricultural land must be monitored, or it will affect the crops grown. The soil condition is monitored using various sensors, and control is done using a microcontroller. The data from the sensors is fed to a database. A web interface and a mobile app are used to read the information and control the amount of water being supplied. The data collected during different months can be used for prediction.

IoT based road safety measure using mobile app
assembly level coding basic electronics hardware software interfacing
Rosalind Deena Kumari
Malaysia
This project involves the design and development of a road safety detection system. A lux meter is designed to measure the illumination of light on sites and roads. The intensity of the light is fed back to the detection system, and the conditions for safety are compared with the measured intensity. This helps ensure that the road is safe to use. The measurements will be stored and compared against a control value (of light intensity) to determine whether conditions are safe. Hardware and sensors will be used to construct the lux meter, which will be programmed to compare results and control. Arduino will be used to develop the lux meter.

Generative AI, Spoken Language Technology, NLP (various topics)
ai generative ai llm speech conversation robotics human-robot interaction nlp evaluation safety
Oliver Lemon
Edinburgh
Conversational AI and spoken language technology is a lively research area encompassing a variety of problems such as speech recognition, language understanding, dialogue management, user modelling, language generation, and Natural Language Processing (NLP) in general. Since the arrival of ChatGPT and similar models, many conversational systems are now built using generative AI methods such as LLMs. For this project you can choose from a variety of topics listed below or propose your own. All must involve an evaluation of what you have achieved (e.g. compared to previous state-of-the-art approaches / systems/ components). - Integrating conversational AI and SDS with graphical talking head / Virtual Characters. e.g. use the FurHat robot head, ARI robot. You may wish to use generative AI models such as GPT, LLAMA etc - using generative AI models in video games (e.g. NPCs, player assistant, visually-aware companion....) - Integrating conversational AI and SDS with vision systems. You may wish to use generative AI vision-and-language models such as LLAVA - Safety and ethical issues of generative AI models - Task-based conversational systems - Your own topic that involves state-of-the-art research in talking to computers. E.g. Personalised or Emotional Natural Language Generation, Machine Learning methods for optimization of NLP, etc. Make sure that there is a method for evaluating your proposed advances! You will need to take course F20CA / F21CA to do this project.

Conversational AI for Blind and Partially Sighted users
generative ai computer vision nlp assistive technology
Oliver Lemon
Edinburgh
You will use a generative AI vision-and-language model such as LLAVA to create a spoken dialogue system which blind/partially sighted people can use to find out about the objects and places near them. For example "Is there somewhere I can sit down?" - "Yes, there is a chair on your left, about 2 metres away" {user moves} "Where is it now?" "The chair is right ahead of you, about 50 centimetres away" You will evaluate the system that you create by using images on a screen that sighted users can ask questions about. You will need to take course F20CA / F21CA to do this project.

Social robotics: HRI for multi-person conversation
generative ai llm human-robot interaction evaluation
Oliver Lemon
Edinburgh
This project will develop a new conversational interface for a robot receptionist for the Robotarium building -- using the robot animated head FurHat: http://www.speech.kth.se/furhat/ which we have in the lab, or using the PAL ARI robot. A key aspect will be to handle multiple people in the conversation (e.g. 2 groups of 2 visitors, one group wants to find the cafe, the other has a meeting in room 3). The project combines audio and visual processing to create a socially intelligent robot. Using e.g. the Furhat SDK and an LLM or VLM, you will develop and test a new interface that performs socially intelligent human-robot interaction for greeting visitors to a building, allowing them to find their meeting location, alerting their contacts, handling deliveries, etc. You will evaluate the system you create with real users. If you use the PAL ARI robot, you could attempt to navigate the robot to lead the human to the right location. This project should handle dialogue with multiple humans in the scene. You should take F21CA Conversational Agents if you do this project.

Speech Interfaces for Role-playing games
ai games generative ai llms speech evaluation
Oliver Lemon
Edinburgh
There are interesting opportunities in adding speech and dialogue capabilities into video games -- for example making game characters that you can really talk to. New LLMs (large language models) such as GPT 4 are now being used to create games and conversational game characters, see https://en.wikipedia.org/wiki/AI_Dungeon This project could explore such systems and LLMs and integrate them into a game engine. For example to drive conversations with NPCs, or develop a visually-aware conversational game companion, search lore etc. See e.g. https://www.youtube.com/watch?v=DRVCkUN_Mq8 for inspiration See also https://www.youtube.com/watch?v=rlZfTHyu5wI&t=34s You will evaluate the usability and entertainment value of this system versus the baseline game. See e.g. http://www.voiceattack.com/ This project can be done in collaboration with an industrial partner, for example SpeechGraphics, and is also suitable for DataLab students. See https://arxiv.org/html/2402.18659v1 for a recent survey of this area

Collaborative AI: building AI systems capable of teamwork
ai teamwork generative ai
Oliver Lemon
Edinburgh
Consider the different scenarios where AI agents need to collaborate with unfamiliar teammates (other robots, AI systems, and humans) who possess varying knowledge, skills, and capabilities. This is the problem of `ad-hoc teamwork' (AHT), which requires agents with the ability to dynamically agree and coordinate on a `common-ground' understanding of the domain and tasks at hand. You will investigate the extent to which current generative AI systems (LLMs and VLMs) have such collaborative skills, and develop new methods to support AHT within generative AI systems. You will investigate tools such as AutoGen ( https://microsoft.github.io/autogen/ ) and LangChain Agents

Human-Robot Teamwork with Generative AI
generative ai llm human-robot interaction evaluation
Oliver Lemon
Edinburgh
This project will explore how humans, robots, and AIs can collaborate together in teams, to coordinate on shared tasks. This will involve generating conversational interaction to understand tasks, agree plans, resolve ambiguities, correct mistakes, and so on. You will use LLMs and/or VLMs (such as LLAMA, LLAVA, GPT, Gemini, Moshi, etc ...) to create a system which can meet and coordinate with a previously unseen other agent (a human or robot) and collaborate with them to complete a shared task -- for example to tidy up a room, make breakfast, or build a lego model. You will use real robots such as Tiago, ARI, Stretch, Furhat and/or simulations of them. You will evaluate the system's effectiveness and efficiency in completing different shared tasks with different people.

Formal specifications and verification of software requirements
formal verification model-checking automatic theorem-proving
Oleksandr Letychevskyi
Edinburgh
Starting from an example set of software requirements, create a formal specification and apply formal verification methods, such as model checking or automatic theorem proving. Identify the properties of the program that need to be proved or disproved.

Model-based testing of software systems.
model-based testing test coverage software formal model
Oleksandr Letychevskyi
Edinburgh
Write a program that generates test sequences based on a formal model of the system under test. Investigate the test coverage conditions of the model and conduct an experiment with the aim of achieving the best coverage of the code.
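One simple generation strategy can be sketched directly: for each transition of a small FSM model, emit a shortest input sequence from the initial state that exercises it (an all-transitions coverage criterion; the machine below is illustrative):

```python
from collections import deque

# (state, input) -> next state: a toy coin-operated turnstile
fsm = {("idle", "coin"): "ready",
       ("ready", "push"): "idle",
       ("ready", "coin"): "ready"}

def shortest_path(init, target_state):
    """BFS over the FSM graph; returns an input sequence reaching target_state."""
    queue = deque([(init, [])])
    seen = {init}
    while queue:
        state, path = queue.popleft()
        if state == target_state:
            return path
        for (s, inp), nxt in fsm.items():
            if s == state and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [inp]))
    return None

def all_transition_tests(init="idle"):
    """One test sequence per transition: prefix to its source, then the input."""
    tests = []
    for (s, inp), _ in fsm.items():
        prefix = shortest_path(init, s)
        tests.append(prefix + [inp])
    return tests

for t in all_transition_tests():
    print(t)
```

Stronger criteria (all-paths, transition-pairs) and measuring which criterion kills more faults are where the experimental part of the project comes in.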

Using AI methods in theorem proving
automatic theorem proving ai methods term rewriting system
Oleksandr Letychevskyi
Edinburgh
Consider examples of the use of AI in theorem-prover competitions. Create a neural network trained on scenarios of proving statements in a selected theory (polynomial algebra, first-order logic). Using an existing proof system or term rewriting system, integrate it with a neural network that provides a hint for each step of the proof.

Problems of simplifying Boolean formulas and AI
first-order logic algebraic expression simplifying ai methods
Oleksandr Letychevskyi
Edinburgh
On the basis of the axioms and theorems of predicate logic (propositional or first-order), and within the framework of an existing term rewriting system, create an inference machine. Based on the output scenarios, create a neural network that determines the most efficient simplification step. The result could be an integration of a neural network and a term rewriting system, or some other way of hinting at the simplification step.
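A toy version of the rewriting component might look like this (tuple-encoded formulas and a handful of simplification rules; a learned model would later rank which rule to apply at each step):

```python
T, F = "T", "F"   # truth constants

def simplify(t):
    """Bottom-up simplification; recursing first means one pass suffices
    here (a full rewriter would iterate to a fixpoint)."""
    if not isinstance(t, tuple):
        return t
    op, *args = t
    args = [simplify(a) for a in args]
    if op == "not":
        a = args[0]
        if a == T: return F
        if a == F: return T
        if isinstance(a, tuple) and a[0] == "not":
            return a[1]                      # double negation
        return ("not", a)
    a, b = args
    if op == "and":
        if F in (a, b): return F             # x AND False -> False
        if a == T: return b                  # True AND x -> x
        if b == T: return a
        return ("and", a, b)
    if op == "or":
        if T in (a, b): return T
        if a == F: return b
        if b == F: return a
        return ("or", a, b)
    return (op, *args)

expr = ("or", ("and", "x", ("not", ("not", T))), F)
print(simplify(expr))  # x
```

In the project, the fixed rule order above is exactly what the neural network would replace: given a term, it suggests which rewrite to try next.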

Neuro-symbolic approach in finding backdoors in the program code.
backdoor software vulnerability neuro-symbolic approach
Oleksandr Letychevskyi
Edinburgh
Investigate the types of backdoors in program code and create several examples on which to simulate the behaviour of the program. Based on the resulting scenarios, create a neural network for detecting backdoors. In parallel, create a formal description of the semantics of the behaviour of a program with a backdoor. Create a method that uses the formal semantics of backdoors to confirm or reject the neural network's classification.

Research on the use of formal methods in the verification of the property of resistance to attacks in the blockchain.
blockchain formal verification double spending attack
Oleksandr Letychevskyi
Edinburgh
Create a formal model of the selected consensus algorithm (Proof of Stake, Delegated Proof of Stake, etc.), represented as an automaton, a Petri net, or another type of transition system or process algebra. Consider attacks on the blockchain such as double spending, Sybil attacks, or others. Create a method, or use existing model-checking algorithms, to prove the possibility of an attack under certain conditions (for example, that more than 50% of the nodes in the network are attackers).

AI methods in detecting intrusions into software systems.
cybersecurity ai methods adversarial attacks
Oleksandr Letychevskyi
Edinburgh
Create a neural network that classifies cyberattacks based on existing datasets. Create a program (wrapper) that works as a firewall for a software system to detect and prevent intrusions, using the created neural network. Estimate accuracy and resistance to adversarial attacks. Consider the problem of verification of the neural network.

AI methods in biological research.
ai methods biological models
Oleksandr Letychevskyi
Edinburgh
Consider neural networks that work with Big Data in biological research (a database of experiments, substances with certain properties). Develop a technology for searching for a substance with given properties using AI methods.

A tool to monitor/inspect multi-threaded (concurrent) Java software
programming java concurrency static analysis runtime monitoring software engineering
Kostas Liaskos
Edinburgh
Writing correct Java multi-threaded code [1] is a hard problem, which introduces new challenges and common concurrency bug patterns (e.g. memory consistency errors [1]). The programmer has to ensure that all accesses to shared data are coordinated. The coordination is usually done with some sort of synchronisation, which in turn might lead to further problems (e.g. deadlocks, starvation and livelock [1]). The utilisation of supporting software monitoring/inspection tools is an important countermeasure for programmers to identify such concurrency issues in programs. The aim of this project is to develop a tool that monitors/inspects multi-threaded (concurrent) Java systems. The project is open-ended in terms of the techniques that can be utilised, and a few indicative examples are: - Techniques for static program analysis [2], e.g. software code quality metrics ([3]): [4] and [5] are examples of two popular metrics for non-concurrent systems; or - Runtime system monitoring techniques [6], e.g. the native task managers [7] of Microsoft Windows and macOS are two popular examples; or - A combination of the above. The target users will be other programmers, software engineers, and testers. Requirements gathering will mainly involve researching relevant literature (including learning the basics of Java concurrency and multi-threaded programming) and existing tools in order to adapt the selected techniques for concurrent systems. GUI implementation will be a “must-have” requirement. Some students may choose to investigate turning their tool into a plug-in for a popular Java IDE, e.g. Eclipse, IntelliJ etc.

A tool to support independent/self-directed learning within programming/computer science
computer science education independent learning self-directed learning programming visualisation of data structures visualisation of algorithms
Kostas Liaskos
Edinburgh
The project is open-ended in that you may choose the specific field(s) within programming/computer science. One example in the context of learning programming is data structure/algorithm visualisation [1]. Another popular (non- programming/computer science specific) tool is Duolingo in the context of learning foreign languages [2]. The end-product must include functionality on the following aspects [3]: - Assess readiness to learn; - Set learning goals; - Engage in the learning process; and - Evaluate learning. The target users will be learners and instructors within the field of programming/computer science. Requirements gathering and evaluation must involve users from this target audience. GUI implementation will be a “must-have” requirement.

Data-flow code coverage of unit tests: traditional vs. concurrent
java concurrency static analysis runtime analysis programming software engineering code coverage data flow analysis unit testing junit
Kostas Liaskos
Edinburgh
Data-flow coverage is a coverage metric that has been utilised successfully to measure the adequacy of test suites. Concurrency introduces new challenges in the context of software testing; hence, a variation of the traditional data-flow coverage metric has been proposed [1]. The aim of this project is to develop a tool that calculates both versions of data-flow coverage. The core of this project is to compare the two versions and further investigate the usefulness of the concurrent version. The key tasks and challenges in this project include: 1. Reviewing data-flow coverage metrics and understanding the challenges introduced by concurrency. 2. Investigation and implementation of both versions of data-flow coverage. 3. Evaluating the tool by applying it to a range of concurrent systems. Strong Java programming skills are essential. Furthermore, the student must familiarise themselves with the fundamentals of the Java concurrency package [2].

Automated mutation testing for concurrent software
java concurrency mutation testing mutation operators code parsing static code analysis programming software engineering unit testing junit
Kostas Liaskos
Edinburgh
Mutation testing is a simple technique utilised to evaluate the quality of existing software tests: faults (or mutations) are seeded into the code, then the tests are run. The quality of the tests can be calculated from the percentage of mutations killed. Tools that automate this process exist, e.g. PIT [1]. The aim of this project is to develop a tool that automates mutation testing in the context of concurrent software. The core of this project is to investigate common concurrent bug patterns [2] and suggest appropriate mutation operators. The key tasks and challenges in this project include: 1. Reviewing common concurrent bug patterns and suggesting appropriate mutation operators for mutation testing. 2. Investigation and implementation of the suggested mutation operators. 3. Evaluating the tool by applying it to a range of concurrent systems.
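The mutation-testing loop itself is easy to sketch in a toy setting (a real tool such as PIT operates on Java bytecode; the function, mutant operators, and tests below are illustrative):

```python
src = "def add(a, b):\n    return a + b"
mutant_ops = ["-", "*"]              # arithmetic-operator replacement

def run_tests(fn):
    try:
        assert fn(2, 3) == 5
        assert fn(0, 0) == 0
        return True
    except AssertionError:
        return False

killed = 0
for op in mutant_ops:
    ns = {}
    exec(src.replace("+", op), ns)   # seed one mutation into the code
    if not run_tests(ns["add"]):     # the mutant is "killed" if a test fails
        killed += 1

print(f"mutation score: {killed}/{len(mutant_ops)}")
```

For concurrent software the interesting part is the operator set: instead of swapping arithmetic operators, mutants would, for example, remove a synchronized block or shrink a lock's scope, mirroring the concurrency bug patterns the project reviews.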

Automated code quality metrics for concurrent software
java concurrency software quality static analysis programming software engineering code metrics
Kostas Liaskos
Edinburgh
Code quality metrics (e.g. lines of code (LOC) [1], cyclomatic complexity [2] etc.) are extensively utilised in the context of Software Quality Assurance. However, concurrency introduces new challenges in terms of the adequacy of such metrics. The aim of this project is to develop a tool that automates code quality metrics in the context of concurrent software. The core of this project is to investigate common code quality metrics utilised for non-concurrent software and suggest appropriate metrics for concurrent systems. The key tasks and challenges in this project include: 1. Reviewing common code quality metrics utilised for non-concurrent software and suggesting appropriate metrics for concurrent software. 2. Investigation and implementation of the suggested metrics for concurrent software. 3. Evaluating the tool by applying it to a range of concurrent systems.

An Abstract Cyber Security Strategy Game
cyber-security games
Hans Wolfgang Loidl
Edinburgh
Cyber security is currently one of the main priority areas of the UK government, with its UK Cyber Security Strategy [2] outlining the Government's plans for ensuring a secure and prosperous cyberspace for UK citizens and businesses. A recent boardgame, designed by Andreas Haggman, aims to model, on an abstract level, the threats and vulnerabilities of the UK and other technology-focused states in the context of on-line business and government. The main aim of this project is to implement Haggman's design for a cyber strategy game on mobile devices, preferably Android, and to evaluate the game by running a gaming session with testers, evaluating their experience as well as the game dynamics. The project will proceed in the following stages: - Literature review on cyber security, board game design, and programming mobile devices. - An initial implementation of the game as a multi-player game on mobile devices. - Test and evaluation of the initial implementation, identifying areas for improvement. - A revised implementation based on the above. - Test and evaluation of the revised implementation, in terms of software quality and gaming experience.

Continuous Compliance Validation Pipes for Autonomous Vehicles Safety Cases Using Bazel
fintech
Hans Wolfgang Loidl
Edinburgh
In safety-critical systems, the validation stage, that is, a retroactive analysis of the software to ensure the requirements are met, is time-consuming and expensive. Standards in various industries such as Automotive (ISO 26262), Industrial (IEC 61508), Robotics (IEC 61508), Medical Devices (IEC 62304), and Avionics (DO-178) all have stipulations for how software is treated and validated before it is safe to use. As expected, there are many concepts that are common across these standards which can be supported with software. Bazel is a build tool developed by Google that promises to build software quickly, reproducibly, and correctly, guaranteeing that the same input produces the same output now and forever. The applications to building safety-critical software are obvious. In the case of the automotive industry, the rise of self-driving cars has put new pressures on software validation: project complexity is increasing as the expectations around safety grow. A new approach from MIT named Systems-Theoretic Process Analysis (STPA) validates complex projects by analysing their control structures and working back from an accident scenario, providing traceability into the complex set of preconditions that triggered specific unsafe control actions; it is of particular use in the autonomous car industry due to the complexity of the software. Integrating reproducible builds and STPA validation into the software of the car's control system will make compliance validation cheaper, more secure, and less time-consuming. Objectives:
1. Make Baidu Apollo's build process reproducible using Bazel.
2. Integrate STPA validation into the project's tests.
3. Establish a continuous integration pipeline to run this validation.
Further Reading: <a href="http://psas.scripts.mit.edu/home/wp-content/uploads/2013/04/Basic_STPA_Tutorial1.pdf">Basic STPA Tutorial</a> <a href="https://bazel.build/">Bazel</a> <a href="http://apollo.auto/">Baidu Apollo</a>

Develop an AI agent for a multi-player on-line historical role-playing game
games ai
Hans Wolfgang Loidl
Edinburgh
Role-playing games, set in an accurate historical context and supported by a scalable, distributed game engine, can provide an engaging learning environment for both players and game developers: players can learn about the historical and societal context of the game, and game developers can exercise modular design of a complex system in order to achieve scalability for a large number of players. The goal of this project is to develop an AI agent that can act as an NPC or a PC in the previously developed core game engine (JominiEngine [1]). This involves interacting with the game engine through the same kind of API that is used by the separately developed game clients. The AI agent should be able to interact with the game world and perform basic activities in the three main areas of fief management, household management, and army management. The agent can initially be simple and rule-based, but should be extended to a version that draws on machine-learning techniques to demonstrate increased effectiveness in the game. The project will proceed in the following phases:
- Literature survey on game design, machine learning and AI techniques; review of the existing game model and code base
- Design of basic AI functionality (e.g. rule-based) and its interaction with the game model
- Implementation of the basic AI functionality
- Design of improved AI functionality, drawing on machine-learning techniques
- Implementation of improved AI functionality
- Evaluation of the effectiveness of the improved vs. the basic functionality
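To give a flavour of the initial rule-based phase, here is a minimal condition-action sketch in Python. The state keys (garrison, treasury, fields_sown) and action names are illustrative assumptions, not the real JominiEngine API; a real agent would read the state through the client API and send back the chosen action.

```python
# Ordered condition-action rules: the first matching condition wins.
# A later ML-based version could instead score every legal action
# and pick the highest-scoring one.
RULES = [
    (lambda s: s["garrison"] < 50 and s["treasury"] >= 100, "recruit_troops"),
    (lambda s: not s["fields_sown"], "sow_fields"),
    (lambda s: s["treasury"] < 20, "raise_taxes"),
]

def choose_action(state):
    for condition, action in RULES:
        if condition(state):
            return action
    return "idle"  # no rule fired: do nothing this turn

print(choose_action({"garrison": 30, "treasury": 150, "fields_sown": True}))
```

Keeping the rules as data (rather than hard-coded branches) makes it easy to compare the rule-based baseline against a learned policy in the evaluation phase.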

Efficient stream processing and machine learning for stock market data (industry project)
fintech industry project
Hans Wolfgang Loidl
Edinburgh
High volumes of data that are generated continuously are a challenge for many application domains. Efficient implementation of a streaming pipeline is crucial to make processing feasible, and opens the opportunity for applying machine learning techniques on the data stream. The goal in this project is to evaluate and extend an existing system for high-performance stream processing, and then to use simple machine learning techniques on the data stream. The main platform is a recently developed, open source library [1], [2]. In the first step, a systematic evaluation of the performance and throughput of this library should be performed, in comparison with alternatives such as Apache Kafka [3]. Possible extensions and enhancements to the performance of the library should be considered. In the second step, simple machine learning techniques should be employed to learn characteristics of the data stream. As underlying data streams, publicly available market data should be used [4], [5], [6], though commercial data integrations [7], [8] could also be considered. If you are interested in low-latency and high-throughput data pipelines and want to gain experience with advanced optimization techniques, this might be the project for you! [1] https://github.com/invesdwin/invesdwin-context-integration#synchronous-channels [2] https://github.com/invesdwin/invesdwin-context-persistence#timeseries-module [3] https://kafka.apache.org/ [4] https://github.com/fxcm/ForexConnectAPI [5] https://www.alphavantage.co/documentation/ [6] https://www.dukascopy.com/wiki/en/development/strategy-api/historical-data [7] http://www.iqfeed.net/daytradersetups/index.cfm?displayaction=developer&section=main [8] https://api.tradestation.com/docs/fundamentals/http-streaming
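A toy example of the kind of per-tick computation the pipeline would run: an incrementally maintained moving average over a price stream, sketched here with Python generators. The key property, O(1) work per tick, is what a low-latency pipeline must preserve regardless of the underlying transport (the actual project would use the Java library above, not Python).

```python
from collections import deque

def moving_average(stream, window=3):
    # Maintain a running total so each tick costs O(1) rather than
    # re-summing the whole window.
    buf, total = deque(), 0.0
    for price in stream:
        buf.append(price)
        total += price
        if len(buf) > window:
            total -= buf.popleft()
        yield total / len(buf)

ticks = [100.0, 101.0, 103.0, 102.0]
print(list(moving_average(ticks)))  # [100.0, 100.5, 101.333..., 102.0]
```

The same incremental pattern generalises to the "simple machine learning" step, e.g. online updates of means and variances for anomaly detection on the stream.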

Extend an AI agent for a simple Android-based boardgame
games ai
Hans Wolfgang Loidl
Edinburgh
Mobile devices increasingly attract attention as platforms for implementing boardgames. Such boardgames feature a principled game design that offers high replay value. An implementation on Android (or other mobile OSs) brings these games to a wide community of users looking for simple, solitaire or networked games for entertainment. The goal of this project is to extend an existing AI agent for an existing Android implementation of the game "Guerilla Checkers" [1] by Brian Train [2]. This is a checkers-like boardgame with an asymmetric player set-up and different victory conditions for each side. The current implementation [3], by Richard Gould, allows 2 players to play the game on an Android device. An existing AI, from a previous project, achieves basic game-play but is not competitive against a human player. The main goal is to enhance the AI to make it competitive, and (optionally) to compare it with alternative AI implementations for this game.
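A competitive board-game AI is usually built around game-tree search. The sketch below shows plain negamax on take-1-or-2 Nim (last stone wins), used as a tiny stand-in: the Nim rules are illustrative only, and a real Guerilla Checkers agent would swap in the game's move generator, add alpha-beta pruning, and use a heuristic evaluation function at a depth cut-off.

```python
def best_move(pile):
    # Negamax: the value of a position is the best of the negated
    # values of the positions reachable in one move.
    def negamax(n):
        if n == 0:
            return -1  # no stones left: the side to move has lost
        return max(-negamax(n - take) for take in (1, 2) if take <= n)
    moves = [t for t in (1, 2) if t <= pile]
    return max(moves, key=lambda t: -negamax(pile - t))

print(best_move(4), best_move(5))  # winning takes from piles 4 and 5: 1 2
```

For an asymmetric game like Guerilla Checkers, each side would need its own evaluation function, since the two factions pursue different victory conditions.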

Extending a massively multi-player on-line RPG
games
Hans Wolfgang Loidl
Edinburgh
Role-playing games, set in an accurate historical context and supported by a scalable, distributed game engine, can provide an engaging learning environment for both players and game developers: players can learn about the historical and societal context of the game, and game developers can exercise modular design of a complex system in order to achieve scalability for a large number of players. The goal of this project is to extend the implementation of a previously developed core game engine. This involves adding in-game functionality of the basic game model, such as enhanced player interaction or more accurate modelling of battles, performance improvements to the core game engine, such as faster database access, and assessing the extended game engine in terms of latency, performance and scalability. The project will proceed in the following phases. <ul> <li> Literature survey and review of game model <li> Design of extensions to the core game engine <li> Implementation of extensions to the core game engine <li> Evaluation of latency, performance and scalability </ul>

Extending a massively multi-player on-line RPG
games
Hans Wolfgang Loidl
Edinburgh
Role-playing games, set in an accurate historical context and supported by a scalable, distributed game engine, can provide an engaging learning environment for both players and game developers: players can learn about the historical and societal context of the game, and game developers can exercise modular design of a complex system in order to achieve scalability for a large number of players. The goal of this project is to complete the implementation of a <a href="MScCoreGame.html">previously developed core game engine</a>. This involves adding in-game functionality of the basic game model, implementing a client-server API to allow interaction of players with the core game engine, and assessing the extended game engine in terms of latency, performance and scalability. The project will proceed in the following phases. <ul> <li> Literature survey and review of game model <li> Design of extensions to the core game engine, completing the game model <li> Implementation of extensions to the core game engine <li> Evaluation of latency, performance and scalability </ul>

First Class Serialization for Distributed Haskells
distributed programming
Hans Wolfgang Loidl
Edinburgh
Follow <a href="http://www.macs.hw.ac.uk/~hwloidl/MScProjects/MScFirstClassSer.html#spec">the link below</a> for a detailed discussion.

High-performance graph algorithms for social networks
Hans Wolfgang Loidl
Edinburgh
Relationships in social networks such as Facebook are typically captured as graphs, with users as nodes and relationships as edges. Such graphs become huge when used in the context of social networks. Learning non-trivial relationships and trends in such networks is very time consuming and therefore needs efficient algorithms. In this project, application kernels should be developed for parallel graph algorithms on large graph structures, in order to learn new relationships. The core activity will be to implement parallel versions of graph traversal algorithms and to assess their performance. These application kernels should be implemented in an object-oriented language (e.g. Java or C#) and in a functional language (e.g. Haskell or ML). The performance of both implementations should be evaluated on a range of large input graphs.
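The canonical parallel traversal pattern is level-synchronous BFS, sketched below in Python: each frontier is expanded concurrently, then the next frontier is built. (In CPython the threads mainly illustrate the structure; the real speed-ups the project targets would come from Java/C# threads or Haskell sparks.)

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_bfs(graph, source):
    # Level-synchronous BFS: fetch all adjacency lists of the current
    # frontier in parallel, then sequentially mark newly found nodes.
    dist, frontier, level = {source: 0}, [source], 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        while frontier:
            level += 1
            nxt = []
            for nbrs in pool.map(lambda u: graph[u], frontier):
                for v in nbrs:
                    if v not in dist:
                        dist[v] = level
                        nxt.append(v)
            frontier = nxt
    return dist

friends = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(parallel_bfs(friends, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```

Shortest-path distances of this kind are the building block for friend-of-friend recommendations and centrality measures on social graphs.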

Extend a novel board-game and develop an AI for it
games ai
Hans Wolfgang Loidl
Edinburgh
The board-game "Empires of the Skies" is a novel, unpublished board-game currently in its design phase. Nowadays, playtesting of new board-games is often done online, using web-based platforms. A prototype implementation of this game exists, as a browser game using JavaScript and TypeScript. The goal of this project is to complete the implementation of the board-game "Empires of the Skies" to support the entire game, and to evaluate the implementation. As an optional goal, a simple AI should be developed for the game, to allow single-player usage and to facilitate playtesting.

Implement a simple boardgame and develop an AI for it
games ai
Hans Wolfgang Loidl
Edinburgh
The goal of this project is to first develop an implementation of a simple board game (several options below) for Android-based tablet devices, or on a desktop/laptop, and then to develop an AI in the game for one of the factions. The choice of technologies is flexible, and should meet the requirements of casual tablet/laptop usage. There are several options of games to implement:
- "Kashmir Crisis" (https://brtrain.wordpress.com/2019/08/29/new-game-kashmir-crisis)
- More Open Design games by Brian Train (https://brtrain.wordpress.com/free-games/)
- Spellcast (https://www.andrew.cmu.edu/user/gc00/reviews/spellcaster.html)
- Several games by game designer Neil McCormack
- "Origins of World War I" (https://boardgamegeek.com/boardgame/17967/origins-world-war-i)
- Schlieffen
- Fire&Move
- Agricola Express
- More board-games and card-games can be discussed for implementation
Whatever the game, the project will proceed in the following phases:
- Literature survey on game design, machine learning and AI techniques
- Review of the rules and game mechanisms of the board-game
- Implementation of the board-game as a (multi-player) tablet-based game
- Design of a basic AI for one of the factions
- Implementation of the basic AI functionality
- Design of improved AI functionality, drawing on machine-learning techniques
- Implementation of improved AI functionality
- Evaluation of the effectiveness of the improved vs. the basic functionality

Parallel programming on the Xeon Phi Many-core Coprocessor
parallel computing
Hans Wolfgang Loidl
Edinburgh
The new Intel Xeon Phi coprocessor is an accelerator card that promises to boost the performance of the host machine by offloading parallel code to a 61-core processor. It brings affordable many-core technology to standard desktop machines, and claims to be easier to program than other accelerators such as GPGPUs. The goal of this project is to take a set of common parallel benchmark programs and run them both on the Xeon Phi and on a departmental many-core server, in order to compare performance on both architectures. In a second phase, a simple parallel program should be developed from scratch, using the Xeon Phi's tool support, in order to assess the programmability of this new architecture.

Parallel symbolic computation on distributed memory machines
parallel computing
Hans Wolfgang Loidl
Edinburgh
Symbolic computation is characterised by performing compute-intensive operations on highly-structured, complex data. The parallelism in these applications is typically dynamic and irregular, i.e. it is generated throughout the computation and varies significantly in size. These characteristics make such applications difficult to handle for conventional parallel programming languages. As a high-level parallel programming language, the pure, non-strict functional language Haskell will be used. It provides extensions to support both shared-memory and distributed-memory parallelism; the focus in this project is on the latter. This project will use the SymGrid-Par infrastructure for parallel programming, together with the GAP system for computational algebra, to implement one concrete symbolic application. Candidate applications come from the area of computational algebra, and include parallel resultant computation, squarefree factorisation and solving polynomial systems of equations. The thesis will report on the process of parallelising the application, reflect on the sources of parallelism and the suitability of the language and infrastructure for parallelisation, and assess the performance of the parallelised application on our Beowulf cluster.

Software Systems for Autonomous Cars
autonomous cars machine learning
Hans Wolfgang Loidl
Edinburgh
F1Tenth is a miniature race car, on a 1/10 scale, which is available in the department. The F1Tenth simulator (https://github.com/f1tenth/f1tenth_simulator) is a virtual environment for controlling this car. Control of the car builds on the ROS software stack, which is identical to the software used on the real car. In a previous project, a neural-network-based AI for overtaking in the simulator was developed. The goal in this project is to develop this AI further, extend the simulator, and test this environment with multiple cars at the same time. Optionally, the control software can be tested on the physical car as well.

Unity-based client for a massively multi-player on-line historical RPG
games
Hans Wolfgang Loidl
Edinburgh
Role-playing games, set in an accurate historical context and supported by a scalable, distributed game engine, can provide an engaging learning environment for both players and game developers: players can learn about the historical and societal context of the game, and game developers can exercise modular design of a complex system in order to achieve scalability for a large number of players. The objective of this project is to enhance a Unity-based client for an existing server for a historical role-playing game. This graphical client should enhance the current functionality for interacting with the game, and introduce new features to improve the general user experience, building on features provided by the Unity framework. The development of the clients should be modular, to maximise the re-use of code between the clients. The usability of these clients should be assessed through user surveys. The project will proceed in the following phases:
- Literature survey on game mechanics and usage aspects of game clients
- Design of the software architecture for all clients and delineation of their differences
- Implementation of a text-based game client (for desktops)
- Implementation of a GUI game client (for desktops)
- Implementation of a handheld-based game client
- Assessment of all clients in terms of usability, flexibility and modularity

Advanced Docker usage
cloud services
Hans Wolfgang Loidl
Edinburgh
Docker [1] is a popular virtualisation technology, in particular in the context of DevOps. It allows developers to manage Linux containers, each running its own image, thereby isolating the software context in a software development project. In contrast to full virtualisation, as in VirtualBox or VMWare, docker images use para-virtualisation, sharing access to the same (host) OS kernel. Para-virtualisation achieves higher performance than full virtualisation, but also restricts usage to running Linux inside Linux. The goal of this project is to develop teaching material covering:
- underlying concepts of (para-)virtualisation
- basic docker usage information
- some advanced docker usage information
- a case study of docker usage in software development, or a case study of using docker in teaching

Kubernetes usage
cloud services
Hans Wolfgang Loidl
Edinburgh
Kubernetes [1] is a popular Cloud management infrastructure, specifically for the deployment of Cloud applications. It provides location transparency, avoiding tying the running of an application to one particular machine, as well as resilience and replication, to provide continuous, scalable execution of applications with high resource requirements. The goal of this master class is to develop teaching material for the practical use of Kubernetes, based on Google's web page below [1], and to develop a concrete case study that demonstrates the advantage of a web application running on this platform.

On-line board-game platforms as educational tools for web-development
games
Hans Wolfgang Loidl
Edinburgh
Several web-based platforms for implementing board-games have become very popular in particular during the lock-down period. These platforms typically use a mixture of web-based languages and infrastructures to provide an easy-to-use development platform. The goal of this project is to evaluate one or more of these platforms in terms of their provision of a web-based development platform, and the benefit the developer gets, in terms of a learning experience in web development, from using these tools. Some platforms are mentioned below

Using BoardGameArena for Serious Games
games
Hans Wolfgang Loidl
Edinburgh
Several game engines allow for easy development of computer games. One such platform is BoardGameArena [1], which specialises in the web-based development of digital browser-games using PHP and JavaScript as implementation languages. The goal of this project is to develop teaching material on how to use BoardGameArena to implement serious games, i.e. games that are designed and developed for a particular learning purpose. To this end a case study of implementing a simple board game should be performed. [1] BoardGameArena Studio https://studio.boardgamearena.com

Playful learning
gamification
Hans Wolfgang Loidl
Edinburgh
The concept of playful learning uses gamification concepts to make the learning process more engaging. These can be simple, such as awarding badges for achievements, or more advanced, such as space race functionality in quizzes. The main task for this master class is to develop teaching material that exemplifies the usage of platforms such as Kahoot or TopHat for playful learning, with a critical reflection on the advantages and disadvantages of these platforms.

The RISC-V architecture
computer architecture
Hans Wolfgang Loidl
Edinburgh
The RISC-V architecture is a new, open-hardware computer architecture that is increasingly popular and becoming a competitor to established architectures such as ARM. It provides advantages in terms of modularity, flexibility, and open design. The main task in this master class is to give an overview of the main characteristics of this new architecture, mainly for programmers (rather than electrical engineers). This should also critically reflect on its advantages and disadvantages. The practical part of the master class should provide a simple programming exercise that elaborates on the differences in architecture: this could use RISC-V vs ARM assembler.

Deep resource monitoring and machine-learning analysis
Hans Wolfgang Loidl
Edinburgh
The goal of the project is to develop tools for resource monitoring of a range of devices, to perform data collection based on these tools, and then to apply machine learning techniques to the collected resource-usage data to gain insight.
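A minimal sketch of the monitoring half, wrapping a workload and recording wall time, CPU time, and peak memory via Python's Unix-only `resource` module. A full tool would instead poll on a timer (e.g. via /proc or psutil) and stream samples to storage for the later ML analysis; the dictionary keys here are just illustrative.

```python
import resource
import time

def monitor(workload):
    # Sample usage counters before and after running the workload.
    t0 = time.perf_counter()
    r0 = resource.getrusage(resource.RUSAGE_SELF)
    result = workload()
    r1 = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "wall_s": time.perf_counter() - t0,
        "cpu_s": (r1.ru_utime + r1.ru_stime) - (r0.ru_utime + r0.ru_stime),
        "peak_rss": r1.ru_maxrss,  # kilobytes on Linux, bytes on macOS
        "result": result,
    }

stats = monitor(lambda: sum(i * i for i in range(100_000)))
print(stats["cpu_s"], stats["peak_rss"])
```

Collections of such samples, taken across devices and workloads, form the dataset on which clustering or anomaly-detection models could then be trained.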

Tools support to help people avoid machine learning pitfalls
machine learning large language models programming languages
Michael Lones
Edinburgh
Machine learning is great, but it’s really easy to make mistakes that invalidate the outcomes of the process. In science, this has contributed towards something known as the replicability crisis, where people publish papers based on the outcomes of applying machine learning, but due to mistakes they make in the machine learning process, the outcomes can’t then be reproduced by other people. This is a very common problem, even for top tier research published in journals such as Science and Nature. It also affects companies and people working in AI, who, after years of development, often find their models don’t work when they try to use them in the wild. This project aims to do something to reduce the number of mistakes that are being made. I’m open to ideas, but one approach might be to develop some kind of Python tool that keeps an eye on what a data scientist is doing and warns them when they’re doing something that might affect the validity of their results. This might be a tool that runs in the Python execution environment, or something that processes notebooks offline. It might use traditional approaches to parsing code and spotting errors, or could use a large language model to spot more subtle errors. A low-hanging fruit might be to look at preventing data leaks, which are a particularly common form of machine learning pitfall. Some reading to get started: How to avoid machine learning pitfalls: a guide for academic researchers - https://arxiv.org/abs/2108.02497 Leakage and the Reproducibility Crisis in ML-based Science - https://arxiv.org/abs/2207.07048 REFORMS: Reporting Standards for ML-based Science - https://reforms.cs.princeton.edu
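To show how small the first version of an offline checker could be, here is a toy AST-based sketch that flags one classic leak: fitting a scaler (or any estimator) before the train/test split, which leaks test-set statistics into training. The function names it looks for are assumptions about sklearn-style code; a serious tool would need real data-flow analysis, but simple line ordering already catches the easy cases.

```python
import ast

def find_leakage(source: str):
    # Record where the first fit/fit_transform and the first
    # train_test_split calls occur, and warn if fitting comes first.
    fit_line = split_line = None
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = getattr(node.func, "attr", getattr(node.func, "id", ""))
            if name in ("fit", "fit_transform") and fit_line is None:
                fit_line = node.lineno
            elif name == "train_test_split" and split_line is None:
                split_line = node.lineno
    if fit_line and split_line and fit_line < split_line:
        return [f"possible leakage: fit on line {fit_line} "
                f"before split on line {split_line}"]
    return []

leaky = "X = scaler.fit_transform(X)\nXtr, Xte = train_test_split(X)"
safe = "Xtr, Xte = train_test_split(X)\nXtr = scaler.fit_transform(Xtr)"
print(find_leakage(leaky), find_leakage(safe))
```

The same skeleton could grow into a notebook linter, or serve as the "traditional parsing" baseline against which an LLM-based checker is compared.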

Biologically-inspired computing projects
machine learning deep learning genetic algorithms cellular automata swarm computing
Michael Lones
Edinburgh
Biologically-inspired computing is any form of computing that attempts to mimic processes seen in biological systems in order to achieve some computational outcome. Biologically-inspired computing is interesting because it often solves problems in ways that are quite unintuitive to humans, and often in ways that are more efficient than more conventional computer science methods. Well known examples of bio-inspired computing are artificial neural networks (aka deep learning) and genetic algorithms, but in the 4th year and MSc bio-inspired computing course, we also cover other lesser-known techniques, such as swarm computing and cellular automata. These projects are for anyone who wants to explore the field of biologically-inspired computing and potentially do some original research or applied work in this area. Some examples of previous projects: - Using a genetic algorithm to optimise a pre-trained neural network. These days, fine-tuning pre-trained deep neural networks has become a pretty standard approach, but it’s still early days regarding how best to do the fine-tuning. Genetic algorithms are interesting in this respect because they can explore a very wide space of solutions. - Looking at whether cellular automata can be used as reservoirs within a reservoir computer. Reservoir computers are pretty interesting things - take a look at echo state networks, for example, which are a kind of neural network that can be trained really quickly. - Looking at whether cellular automata can be used for image classification. This project involved using a genetic algorithm to evolve the rules used by a cellular automaton. Cellular automata are pretty fascinating; they can solve complex problems using simple systems.
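Cellular automata are simple enough to sketch in a few lines, which is part of their appeal. Below is one synchronous update of an elementary (1D, 2-state) cellular automaton: the next state of each cell is read off the bits of the rule number, indexed by its 3-cell neighbourhood. Rule 110 is used as the example because it is famously capable of universal computation.

```python
def step(cells, rule=110):
    # Each 3-cell neighbourhood (left, centre, right) forms a number
    # 0-7; that bit of the rule number gives the cell's next state.
    # Boundaries wrap around.
    n = len(cells)
    return [(rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
            for i in range(n)]

row = [0, 0, 0, 1, 0, 0, 0]
for _ in range(3):
    row = step(row)
    print("".join(".#"[c] for c in row))
```

A GA-evolves-CA-rules project (as in the image classification example above) would simply treat the `rule` parameter, generalised to larger neighbourhoods or 2D grids, as the genome being evolved.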

Applied machine learning projects
machine learning deep learning
Michael Lones
Edinburgh
This is for anyone who wants to get their teeth into a real world machine learning problem. In 4th year and MSc courses, you have the opportunity to learn the theory behind machine learning and try it out on some simple datasets. However, if you really want to get a leg up in the job market, it can be useful to show you have experience of applying machine learning and data science techniques to a real problem. If you’re interested in doing this project, please think about where you might focus before contacting me. If you’re interested in a particular problem domain, have a look which datasets are openly available. Some examples of previous projects: - Using machine learning for fraud detection in financial transactions. This used a couple of open access datasets and involved investigating data preprocessing, model selection and hyperparameter optimisation, which are three key stages of any machine learning pipeline. Explainable AI methods were then used to gain insight into how the models worked. - Using deep learning models to count fibres in images. This project focused on developing a tool to help textiles researchers understand the ecological implications of textiles. It involved applying a range of different deep learning models to understand their applicability and limitations within this particular problem domain. - Applying machine learning to medical problems. I’ve supervised quite a few projects in this area, and they typically involve trying to come up with the best approach to solve some medical classification, regression or segmentation problem. Past examples include classifying Parkinson’s disease using drawing tablet data, using autoencoders to extract useful features from imaging data, and identifying abnormalities in images of tumours.
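The model-selection and hyperparameter-optimisation stages mentioned above can be illustrated in a few lines. This sketch uses a deliberately tiny pure-Python k-NN and a 1D toy dataset so it runs without any libraries; in a real project the same grid-search-with-validation loop would wrap a library model and a proper dataset.

```python
def knn_predict(train, query, k):
    # Majority vote among the k nearest training points (1D features).
    neigh = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    labels = [lab for _, lab in neigh]
    return max(set(labels), key=labels.count)

def grid_search(data, ks):
    # Leave-one-out accuracy as the model-selection criterion.
    def loo_acc(k):
        hits = sum(knn_predict(data[:i] + data[i + 1:], x, k) == y
                   for i, (x, y) in enumerate(data))
        return hits / len(data)
    return max(ks, key=loo_acc)

data = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (1.1, "b"), (1.2, "b"), (1.3, "b")]
best_k = grid_search(data, [1, 3, 5])
print(best_k)
```

Keeping preprocessing, model selection and evaluation as separable stages like this is what makes it possible to bolt on explainable-AI analysis afterwards, as in the fraud-detection example.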

Developing demonstration apps for bio-inspired computing
genetic algorithms swarm computing cellular automata deep learning software engineering
Michael Lones
Edinburgh
We teach a variety of algorithms in the biologically-inspired computation course. It really helps to be able to visually show an algorithm in action by means of a good demonstrator application. However, most implementations of algorithms are not designed with demonstration purposes in mind. To address this, these projects focus on developing an application that implements an algorithm (or family of algorithms), shows it solving some problems, and gives plenty of information about what's going on behind the scenes. For a good example, see this demonstration of ant colony optimisation that was put together by a project student in a previous year: https://emilietavernier.github.io/MSc_ACO_Project/#/ There are various algorithms that it would be good to have demonstrators for, including (but not limited to) particle swarm optimisation, cellular automata, and evolutionary algorithms.
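To show how little core logic a demonstrator needs to wrap, here is a compact particle swarm optimisation sketch; the coefficients (0.7 inertia, 1.5 cognitive/social) are conventional defaults, not prescribed values. A demonstrator would draw the positions and velocities after each iteration instead of only returning the final best.

```python
import random

def pso(f, dim=2, n=20, iters=100, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]               # each particle's best-so-far
    gbest = min(pbest, key=f)[:]              # swarm's best-so-far
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pos[i]) < f(gbest):
                    gbest = pos[i][:]
    return gbest

sphere = lambda p: sum(x * x for x in p)
best = pso(sphere)
```

Exposing `pbest`/`gbest` per iteration is exactly the "what's going on behind the scenes" information a good demonstrator should surface.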

LLMs for machine learning
llms generative ai machine learning
Michael Lones
Edinburgh
LLMs - and generative AI more generally - are gradually permeating many fields. However, although they are a product of machine learning, they are not typically used to do machine learning. This project would involve researching how LLMs can be used for machine learning, and practically demonstrating this on one or more machine learning tasks, to understand their benefits and limitations within this domain. It will likely involve using a locally-deployed open source model (e.g. Google's Gemma) and programmatically interacting with this using a language of your choice, though could also be done using a commercial model such as ChatGPT if you already subscribe to one of these.
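A starting point for the "programmatically interacting" part, sketched with only the standard library: build a machine-learning-assistant prompt and send it to a locally deployed model. The endpoint and payload below assume an Ollama-style local server and a model tag of "gemma"; both are assumptions to adjust for whatever runtime you actually deploy.

```python
import json
import urllib.request

def build_ml_prompt(task, columns):
    # Frame the LLM as a data-science assistant for one concrete task.
    return (f"You are assisting with a machine learning task: {task}.\n"
            f"The dataset has columns: {', '.join(columns)}.\n"
            "Suggest a model family and a preprocessing plan, as JSON.")

def query_local_llm(prompt, url="http://localhost:11434/api/generate"):
    # Assumes an Ollama-style endpoint; not called here since it
    # requires a running local server.
    body = json.dumps({"model": "gemma", "prompt": prompt, "stream": False})
    req = urllib.request.Request(body=None, url=url, data=body.encode(),
                                 headers={"Content-Type": "application/json"}) \
        if False else urllib.request.Request(url, body.encode(),
                                             {"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

prompt = build_ml_prompt("churn prediction", ["age", "tenure", "plan"])
print(prompt)
```

Evaluating the quality of the returned suggestions against a human-designed pipeline would be one concrete way to assess "benefits and limitations" for this project.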

Application or verification of Attack-Defense Trees
Manuel Maarek
Edinburgh
This project is to develop a tool for automatic generation of Attack-Defense Trees, or an analysis tool for Attack-Defense Trees.

Code-based in-class engagement app
Manuel Maarek
Edinburgh
This project is to design and develop an in-class engagement mobile application around programming code. A Socrative-like app for code.

Compiler for Continuous Integration
Manuel Maarek
Edinburgh
Design and implement a compiler for complex development and code-based operations. Continuous integration systems such as GitLab CI would be the runtime target.

Covid-19 Tracking Apps: Investigating Development Practices
Manuel Maarek
Edinburgh
Covid-19 contact tracing apps have been developed with great urgency. Most of these apps rely on the Google/Apple Exposure Notification (GAEN) API. This project is to investigate these processes of development in relation to the API.

GitHub/GitLab programming game
Manuel Maarek
Edinburgh
This project is to develop a GitHub-based or GitLab-based programming game for users to improve their programming skills. The project is to use the GitHub API in the building of the game engine.

GitLab for security analysis (DevSecOps)
Manuel Maarek
Edinburgh
This project is to build a DevSecOps workflow extension to GitLab for security analysis. Such extension could take the form of adding Attack-Defence Tree modeling for security code review, or integrating a code static analyser such as Infer https://fbinfer.com/ for secure code analysis. This project is also a Master Class on existing features of security analysis existing within GitLab.

GitLab integration for code peer-testing
Manuel Maarek
Edinburgh
This project is to build a GitLab extension for peer-testing and peer-feedback of programming code. The project aims at providing a user-friendly solution for giving and receiving feedback on programming artifacts.

Implementing a Secure Application in Rust
Manuel Maarek
Edinburgh
Rust is a recent systems programming language that claims to run blazingly fast, prevent segfaults, and guarantee thread safety. This project is to implement a secure application in Rust to evaluate its effectiveness for secure software development.

Investigate Terms and Conditions of Open APIs
Manuel Maarek
Edinburgh
Mobile and Web applications make use of Open APIs to access services. These APIs come with developer terms of service. This project is to investigate and compare the terms of service of existing APIs.

Cyber Security Cards
Manuel Maarek
Edinburgh
This project is about the use of a deck of Cyber Security Cards. The current deck covers software security and is based on the CyBOK (Cybersecurity Body of Knowledge) https://www.cybok.org/ . The card deck is both physical and digital. This project could take different directions:
- Evaluate the deck of cards in a training or education context.
- Extend the deck of cards beyond its current scope.
- Link or extend the deck towards other security knowledge bases.
- Expand the uses of the deck with knowledge extension, linkage or augmented reality.

Security and Programming Languages
Manuel Maarek
Edinburgh
The choice of a programming language has major implications, including on security. This project is to investigate how secure or insecure a programming language is with regard to some known weaknesses (CWE, CAPEC, OWASP) by implementing a secure application. This project is also a Master Class on security features of programming languages.

Security Benchmark of XML Libraries
Manuel Maarek
Edinburgh
There exist a number of XML libraries for various programming languages and platforms. Comparisons of these XML libraries exist but focus on speed and features. The aim of this project is to compare various XML parser implementations with regard to security issues (XML entity attacks, cyclic references, remote access, encoding-based attacks, ...).

Serious Games for Cyber Security or Software Engineering
Manuel Maarek
Edinburgh
This project is to design, develop or evaluate serious games targeting cyber security or software engineering practices. The target audience (novice/expert), the purpose (educational, training, awareness), and the type (digital, board) of the game are to be determined. This project is also a Master Class on existing serious games for teaching cyber security.

Static Security Analyser for OCaml
Manuel Maarek
Edinburgh
Typed functional programming languages offer great benefits in terms of code safety and security. However, some implementation details need careful handling. This project involves implementing a verifier of programming rules for the OCaml language.

Robotics Security and Safety
Manuel Maarek
Edinburgh
This project is to explore the security and safety aspects of robotic system development and analysis. This project is also a Master Class on safety/security in Robotics.

CI/CD, DevOps, Cloud Orchestration languages
Manuel Maarek
Edinburgh
This project is to explore the languages of CI/CD, DevOps, Cloud Orchestration.

Static Application Security Testing (SAST)
Manuel Maarek
Edinburgh
The project is to study and develop strategies for Static Application Security Testing (SAST) using GitLab's Security Dashboard and SemGrep.

Evaluating GitLab-based Programming Education Workflows
Manuel Maarek
Edinburgh
A number of courses use GitLab-Student for the assessment of programming-based courses. This project is to evaluate current or new education workflows. This could include feedback on code, test (CI) outcomes, group or individual assessments, metrics, interactions with Canvas, and interactions with other tools.

Teaching tool for studying Finite Automata and Regular Languages
finite state automata regular languages models of computation
Radu Mardare
Edinburgh
Finite automata (FA), in the form of deterministic finite automata (DFA) and non-deterministic finite automata (NFA), are a class of fundamental computational devices that recognise Regular Languages. They are used intensively in the practice of computer science, and they have been studied repeatedly in our courses. The aim of this project is to produce a teaching app that helps students study and manipulate FAs. The app should provide an appropriate interface where students can design and work with FAs. There are many operations on FAs that can be automated, such as: - The graphical construction of a DFA or NFA and its mathematical specification - Given a DFA or NFA, the simulation of computations on particular inputs - The determinization of an NFA - Computing operations on FAs – Union, Concatenation, Star, Intersection, Complement - Computing the regular expression that characterises the language recognised by an FA - Constructing an FA that recognises a given regular expression - Constructing an FA for the reversal of the language recognised by a given FA - Minimizing a DFA
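One of the listed operations, the determinization of an NFA, can be sketched with the standard subset construction. The encoding below (states as ints, a transition map from (state, symbol) pairs to sets of states) is just one illustrative choice for the app's internal representation:

```python
# Subset construction: each DFA state is a frozenset of NFA states.
# The NFA encoding here is illustrative, not a fixed design for the app.
from itertools import chain

def determinize(nfa_delta, start, accepting, alphabet):
    start_set = frozenset([start])
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # the DFA successor of S on a is the union of all NFA successors
            T = frozenset(chain.from_iterable(
                nfa_delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_accepting = {S for S in seen if S & accepting}
    return dfa_delta, start_set, dfa_accepting

# Example NFA over {0,1} accepting exactly the strings ending in "01".
delta = {(0, '0'): {0, 1}, (0, '1'): {0}, (1, '1'): {2}}
dfa, q0, final = determinize(delta, 0, {2}, ['0', '1'])
```

The same worklist style extends naturally to several of the other listed operations (product constructions for intersection, reversal, minimisation).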

The equivalence of Epistemic Systems
epistemic systems epistemic games epistemic logic transition systems modal logic
Radu Mardare
Edinburgh
Epistemic systems are systems of agents witnessing a certain reality and computing knowledge about it. They are used intensively in modelling, for instance, security systems where one wants to understand and control the information accessed by certain agents active on a network. Due to its relevance to applications, the field of epistemic logic has evolved considerably in recent decades. This project aims at developing a couple of fundamental concepts of equivalence that might be relevant for epistemic systems. For instance, (i) what does it mean for two agents to have equivalent knowledge? or (ii) what does it mean for two societies of agents to have equivalent knowledge? or even (iii) what makes two epistemic systems equivalent? To argue for possible answers to these questions one can make use of fragments of epistemic logic. This project can be developed either as a theoretical work, or it can focus on implementing some dedicated algorithms related to the aforementioned problems.

Simulating and predicting the knowledge of agents in Epistemic Systems
epistemic systems epistemic games epistemic logic transition systems modal logic
Radu Mardare
Edinburgh
Epistemic systems are systems of agents witnessing a certain reality and computing knowledge about it. They are used intensively in modelling, for instance, security systems where one wants to understand and control the information accessed by certain agents active on a network. Due to its relevance to applications, the field of epistemic logic has evolved considerably in recent decades. This project aims at developing concepts of simulation that can be useful in applications. For instance, what does it mean that an agent can simulate another agent? This can be a useful concept in security, where dissimulated behaviours can be used to avoid security checks. This project can be developed either as a theoretical work, or it can focus on implementing some dedicated algorithms related to the aforementioned problems.

Using a Brain-Computer Interface with a Human Support Robot for Object Grasping
eeg bci robotics
Alistair Mcconnell
Edinburgh
Using a Brain-Computer Interface (BCI) with a robotic platform for object grasping. For example, the HSR robot is a capable assistive robot and has been used for basic object grasping and fetching. Using an EEG headset, you could trigger the HSR to grasp a specific object.

Evolutionary Approach To Soft Robotic Design
evolutionary algorithms soft robotics
Alistair Mcconnell
Edinburgh
Soft robotics is a relatively novel field of robotic design and development and due to its bio-inspired nature there are a vast number of permutations of simplistic soft robots. This project would involve using evolutionary algorithms to evolve a Voxel-based soft robot design.

Path Planning Optimisation of a Robotic Platform in Simulated Radiation Contaminated Environment
path planning optimisation ros gazebo robotics
Alistair Mcconnell
Edinburgh
Robots have the potential to remove humans from hazardous environments; one such environment is a radiation-contaminated building. Radiation is still a danger to robots, and overexposure can lead to damaged equipment. This project involves working on path planning optimisation, using a mobile robotic platform to navigate a room with simulated radioactive hot spots. Suggested background reading: "vPlanSim: An Open Source Graphical Interface for the Visualisation and Simulation of AI Systems by Jamie O. Roberts, Georgios Mastorakis, Brad Lazaruk, Santiago Franco, Adam A. Stokes, Sara Bernardini" and "Modular Robots for Enabling Operations in Unstructured Extreme Environments by Mohammed E Sayed, Jamie O Roberts, Karen Donaldson, Stephen T Mahon, Faiz Iqbal, Boyang Li, Santiago Franco Aixela, Georgios Mastorakis, Emil T Jonasson, Markus P Nemitz, Sara Bernardini, Adam A Stokes". A willingness to work with ROS and Gazebo is required.

Environmental Monitoring Sensor Network using LoRaWAN
lorawan environmental sensors tinyml
Alistair Mcconnell
Edinburgh
The objective of this project is to help Heriot-Watt University create an environmental sensor network at the Edinburgh Campus through the use of a LoRaWAN network. The data should be displayed using an intuitive UI for users to monitor. Work will be done on using edge processing on the sensors to maximise battery life and compensate for LoRa bandwidth restrictions.

Road/Infrastructure Quality Monitoring Using IMU Crowdsourced Data Gathering
android app crowdsourcing data data visualisation
Alistair Mcconnell
Edinburgh
There are approximately 247,800 miles of road in the UK [1], varying in quality from brand new to close to destruction. There are also approximately 72,000 bridges in the UK, and recent studies put around 4.4% of those as substandard [2]. It is close to impossible to accurately monitor all of this infrastructure, and the cost of embedding sensors can be prohibitive; it has been suggested that crowdsourced IMU data could be used to passively monitor infrastructure as it is traversed by cars. This project would involve creating an application that uses a phone's IMU and GPS to monitor vehicle journeys, and creating a database and visualisation that could show road quality.
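As a sketch of what the backend's quality metric could look like (the grid-cell keying and the RMS-of-vertical-acceleration index below are illustrative assumptions, not a standard roughness measure):

```python
# Hypothetical roughness index: group crowdsourced (lat, lon, accel_z)
# readings into lat/lon grid cells and score each cell by the RMS of the
# vertical acceleration's deviation from gravity. Cell size is illustrative.
from collections import defaultdict
from math import sqrt

def roughness_by_segment(readings, cell=0.001):
    cells = defaultdict(list)
    for lat, lon, az in readings:
        key = (round(lat / cell), round(lon / cell))
        cells[key].append(az - 9.81)  # deviation from ~1 g, in m/s^2
    return {k: sqrt(sum(d * d for d in v) / len(v)) for k, v in cells.items()}

readings = [
    (55.9110, -3.3220, 9.81), (55.9110, -3.3221, 9.83),  # smooth stretch
    (55.9200, -3.3300, 12.5), (55.9200, -3.3301, 7.0),   # potholed stretch
]
index = roughness_by_segment(readings)
```

A real deployment would also need to compensate for phone orientation and vehicle suspension, which is part of what makes the project interesting.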

Robotic Arms for Everyone!
robotics remote labs user evaluation
Alistair Mcconnell
Edinburgh
Space is expensive and hard to get, and so are robots. Both of these factors make running robotic labs a complex and difficult venture. One way to tackle this problem is through the implementation of Remote Labs. The project would involve: 1) The integration of a small robotic arm with the Practable.io backend 2) The testing and evaluation of the system with a suitable group of participants

Development of Software for Customised 3D Printed Prosthesis
prosthetics parametric cad
Alistair Mcconnell
Edinburgh
Development of a parametric algorithm that can automatically scale and generate a new prosthetic hand for a user. To make it more challenging, this could also incorporate the motors, power, etc., required for an active prosthetic.

Synchronised Swarm Behaviour Using MiRo-E Robots
swarm robotics
Alistair Mcconnell
Edinburgh
MiRo-E robots are bioinspired robots created in part to be companion robots and inspired by many animals. This project would involve the programming of a swarm of MiRo-E robots to perform a set of routines.

Optimise Your Exercise: Workout and Nutrition Planner
Alistair Mcconnell
Edinburgh
The exercise market is ever-increasing, and multiple applications exist already for fitness tracking and planning. But often, these are incomplete or have a poor interface. This project is to create a novel Workout and Nutrition Planner.

Böhm trees
James McKinna
Edinburgh
Böhm trees are infinite objects corresponding to the successive 'unfolding' of a term in the lambda calculus; they can be regarded either as a possible semantics of lambda terms, or as an extended language of terms with extended notions of the usual reduction relations on lambda terms. This project, which can be appropriately scoped according to the level and skills of the student, is to study formal representations of lambda terms in a system such as the Agda implementation of type theory. Aspects of the theory of Böhm trees include: * continuity properties (trees can be given a topology for which application and lambda abstraction are continuous) * extending the Standardisation theorem for the ordinary lambda calculus to the extended reduction system on trees; * other topics as they may arise. The interested student should have a good mathematical background; an interest in (the foundations of) programming languages; and preferably direct experience in the form of our CS Foundations I and II courses.

Gradual typing in Python
James McKinna
Edinburgh
Python is a dynamically typed language with wide application in many data-intensive areas. As such, it has no built-in support for type checking, although users may document their code with the intended types; at present these are unchecked. The aim of this project is to investigate adding types and type checking to an existing python implementation, with a view to evaluating the behaviour and performance of existing python code under a (more strongly) typed discipline. There is considerable scope for theoretical and practical investigations in this area; interested students should discuss with me how they would like to approach the project. Existing work on 'Reticulated Python' may be of use as background material.
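As a toy illustration of the gradual discipline the project would investigate (an assumed design, loosely in the spirit of Reticulated Python's transient checks, not an actual Reticulated API): parameters that carry annotations are checked at call time, while unannotated ones stay fully dynamic.

```python
# Hypothetical gradual-typing shim: a decorator that enforces parameter
# annotations at runtime. Unannotated parameters remain dynamic.
import inspect

def checked(fn):
    sig = inspect.signature(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is not inspect.Parameter.empty and not isinstance(value, ann):
                raise TypeError(
                    f"{name}: expected {ann.__name__}, got {type(value).__name__}")
        return fn(*args, **kwargs)
    return wrapper

@checked
def repeat(s: str, n):   # s is typed and checked; n is left dynamic
    return s * n

ok = repeat("ab", 2)     # passes the check
```

A real implementation would move as many of these checks as possible to static analysis, inserting runtime casts only at the typed/untyped boundary.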

Interactive Theorem Proving in Agda
James McKinna
Edinburgh
The Fundamental Theorem of Arithmetic (FTA) is, as its name implies, one of the elementary cornerstones of number theory. Agda is an interactive theorem prover based on intuitionistic type theory, with a rich and highly developed library of elementary mathematical and computational structures. A glaring omission from the library is a complete formalisation of the FTA. A concrete objective would be a complete proof of this result, and its successful incorporation into the library.

programming languages in the K framework
James McKinna
Edinburgh
The K framework https://kframework.org/ (with additional tooling at https://github.com/runtimeverification/k) is, in the words of its developers, "a rewrite-based executable semantic framework in which programming languages, type systems and formal analysis tools can be defined using configurations and rules. " This project concerns doing experiments with representation of, and perhaps reasoning with, programming languages not already supported by K. We also have the offer of services from runtimeverification.com for technical advice and support. Further details of the scope and difficulty of this project available on request. There is room for >1 student to work with me on this.

Simplicial Objects in Dependent Type Theory
James McKinna
Edinburgh
There has been much recent interest from the mathematics community in so-called *cubical type theory*, a development of intuitionistic type theory in which homotopy-theoretic structure and results may be developed synthetically. Classical structures in homotopy theory include the so-called simplicial category $\Delta$, and simplicial structures, considered as presheaves on $\Delta$ with values in suitable categories of interest. Mac Lane's "Categories for the Working Mathematician" contains the essential material with which to begin the project. The aim of this project is to develop the general theory of such structures in the Agda theorem prover, an implementation of intuitionistic type theory, and specifically to do so on top of the existing standard library module Data.Fin of the dependent family `Fin n` (for `n : Nat`) of finite types. This project is suitable for mathematics students, and for mathematically inclined computer science students interested in developing a chapter of classical mathematics in a formalised setting.

XAI: explainable search procedures
James McKinna
Edinburgh
Some classical AI logic/'puzzle solving' problems were among the earliest search problems for which heuristic-guided search procedures were developed; some such procedures may be described in terms of (variously elaborate) proof systems for derivation in suitable logics (as, indeed, can classical proof-search procedures). But most puzzle-solving algorithms merely compute solutions, rather than explanations to the human consumer of how those solutions were arrived at. The aim of this project is to take an existing solver (or else write one of your own) for a given puzzle game (such as Sudoku etc.) and instrument it in such a way as to produce interactive explanations of how to solve a problem instance. There are lots of avenues in which such a project could then be taken: user studies of the appropriateness/intelligibility of the computed explanations; various aspects of additional learning/increase in the expressive power of explanations; etc. More than one student could do this project, but they would need to work on different puzzles/solvers.
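The kind of output such an instrumented solver might emit can be sketched for a 4x4 Sudoku with a single "naked single" deduction rule; the grid encoding (0 = empty) and the explanation wording below are illustrative choices, not a prescribed design:

```python
# One explainable deduction step for a 4x4 Sudoku: find a cell with
# exactly one remaining candidate and say why, instead of just filling it.

def candidates(grid, r, c):
    """Values 1..4 not already used in row r, column c, or the 2x2 box."""
    used = set(grid[r]) | {grid[i][c] for i in range(4)}
    br, bc = 2 * (r // 2), 2 * (c // 2)
    used |= {grid[i][j] for i in range(br, br + 2) for j in range(bc, bc + 2)}
    return {v for v in range(1, 5) if v not in used}

def naked_single(grid):
    """Return (row, col, value, explanation) for the first forced cell."""
    for r in range(4):
        for c in range(4):
            if grid[r][c] == 0:
                cs = candidates(grid, r, c)
                if len(cs) == 1:
                    v = cs.pop()
                    return (r, c, v,
                            f"Cell ({r},{c}) must be {v}: every other value "
                            f"already appears in its row, column, or box.")
    return None

grid = [[1, 2, 3, 0],
        [3, 4, 0, 2],
        [2, 0, 4, 3],
        [0, 3, 2, 1]]
move = naked_single(grid)
```

Richer rules (hidden singles, pairs, chains) would each carry their own explanation template, which is where the expressive-power questions above come in.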

Computer-assisted financial trading
Radu-Casian Mihailescu
Dubai
The goal of this project is to leverage Machine Learning approaches to create an intelligent software agent that can teach itself and adapt to market conditions in order to generate successful trading strategies in the stock market. The developed algorithms will be evaluated in real-world set-ups.

Context-aware object detection in computer vision
Radu-Casian Mihailescu
Dubai
Surveillance cameras are typically placed in contexts that are not known beforehand and that may change dynamically. For example, cameras should be able to cope with background variations such as light changes, weather and seasons in the specific scene, and to improve performance over time as they adapt to the user's scene. Thus, in order to contribute optimally to realising the system goals, they should be able to adapt to the context they are placed in and its current state. In this work we are going to look into various contextual data from the environment and design algorithms that can leverage this information in order to improve classification performance.

Fake news detection - a machine learning approach
Radu-Casian Mihailescu
Dubai
The digital media landscape has been exposed in recent years to an increasing number of deliberately misleading news items and disinformation campaigns, a phenomenon popularly referred to as fake news. In an effort to combat the dissemination of fake news, designing machine learning models that can classify text as fake or not has become an active line of research. In this work we will investigate viable machine learning approaches for the task of fake news detection.

Intelligent decision-making model for energy consumption in a smart building
Radu-Casian Mihailescu
Dubai
Within the worldwide perspective of energy efficiency, it is important to highlight that buildings are responsible for 40% of total European energy consumption, contributing 36% of greenhouse gas emissions. Since buildings are large contributors of greenhouse gases, it is critical to develop solutions that improve energy savings and achieve sustainability goals in the development of cities. In this project, an Internet of Energy (IoE) management system for a smart building will be developed to support intelligent decision-making. By collecting sensory data and analysing it with machine learning techniques, the IoE management system will make intelligent decisions to improve the energy consumption of appliances in one or several rooms in a smart building. The aim of this project is to propose a model based on the preferences and behavioural habits of the people that live in households and the interdependency of the appliances that are active at the moment. The dataset or collected data will be analysed in order to provide suggestions for optimising electricity consumption per appliance.

Synthetic dataset generation
Radu-Casian Mihailescu
Dubai
Synthetic data generation has the potential to become an excellent source of ground truth for many computer vision applications. However, the gap between real and synthetic data remains a problem that we are going to address in this task. The goal is to use procedurally-generated data models to synthesize datasets with minimal domain gap.

Image analysis for satellite data
Radu-Casian Mihailescu
Dubai
In this project we will investigate various deep learning neural network architectures for image analysis on satellite data. The aim is to provide actionable insights from analysing the remote sensing data. Use cases may include one or more of the following: Qualitative analysis 1. Vegetation quality 2. Soil quality 3. Sand dune movement patterns 4. Oil spills near the sea 5. Monitoring temperature of seawater near power stations/cooling purposes/Independent sensors 6. Monitoring heat leakage in residential/commercial buildings 7. Monitoring of solar panel conditions 8. Monitoring coastal changes 9. Land use/Land cover Quantitative analysis 1. Detecting Number of Buildings 2. Cars, People etc. 3. Monitoring above the ground high voltage installation 4. Mapping of Geotechnical Investigation 5. Building permit verification 6. Base map updating 7. Disaster mitigation planning 8. Counting Palm Trees 9. Monitoring city night lights 10. Monitoring of building usage 11. Monitoring construction progress This project may involve collaboration with Eaglei71.

Domain adaptation in computer vision
Radu-Casian Mihailescu
Dubai
Perform domain adaptation from photometric images (visible imager) to radiometric images (thermal imager) using a neural network. Proposed methods include (but are not limited to): a. self-supervision with an attention network b. supervised learning with a GAN The goal of the thesis is a network with good feature-representation capability. This work will be carried out in collaboration with the Technology Innovation Institute (Masdar City).

Forecasting Energy Consumption using Machine Learning
Radu-Casian Mihailescu
Dubai
Forecasting electricity demand accurately is a critical part of ensuring optimised and cost-effective operation, especially in the context of smart buildings (office, commercial or household). Various machine learning techniques will be implemented and evaluated comparatively within this project. The project will investigate consumption prediction for different time horizons and at various levels of aggregation, customer profiling and segmentation, as well as work on exploratory data analysis and different visualisation techniques. Possible collaboration and internship with RTA (details to be communicated later). This could entail working with real data and might require meetings with external stakeholders.

Evaluation of ChatGPT and related LLMs
Radu-Casian Mihailescu
Dubai
The goal of this project is to conduct a comprehensive quantitative evaluation of ChatGPT and related LLMs using publicly available datasets on various NLP tasks such as question-answering, summarisation, information extraction, natural language generation, etc.

Deep Reinforcement Learning Applications
Radu-Casian Mihailescu
Dubai
In (Deep) Reinforcement Learning (RL), agents are trained on a reward and punishment mechanism. The agent is rewarded for correct moves and punished for wrong ones. In doing so, the agent tries to minimise wrong moves and maximise right ones. The general framework of RL makes it suitable for a large number of applications including autonomous driving, financial services, healthcare, natural language processing, etc. The aim of this project is to develop and evaluate an RL agent for a specific application domain.

Machine Learning for Biomedical Data and Personalised Healthcare
Radu-Casian Mihailescu
Dubai
Machine Learning and AI are gradually transforming healthcare services by offering diagnosis and information tools that enable individualised patient management. Examples include providing personalised treatment recommendations for patients about the right drug and the right dose, as well as customised recommendations based on the person’s lifestyle and behaviour, in order to prevent disease and get ahead of problems that could be troublesome down the road. In this context, machine learning approaches based on medical datasets are proving to be a highly effective strategy in addressing these challenges. The thesis work will focus either on providing a comprehensive study of the application of machine learning to personalised healthcare, or on addressing more in-depth a specific topic such as drug discovery or image processing for automated diagnosis.

Federated learning systems for computer vision
Radu-Casian Mihailescu
Dubai
The implementation specifications for Federated Learning (FL) frameworks can vary over a significantly large design space, based on the intended properties and parameterisation of the models. In this task we are going to systematically analyse the importance of different quality characteristics as they pertain to key computer vision tasks.

Thermal Image Stitching - Collaboration with TII
Radu-Casian Mihailescu
Dubai
This master’s thesis proposal focuses on normalising individual thermal images obtained from a drone mapping process. The images are then stitched using third-party software such as Agisoft’s Metashape. The research aims to address the training of a GAN for normalising the thermal (long-wave infrared) images. The challenge is to propose a suitable training procedure that minimises errors in GAN-generated images and produces a homogeneous photorealistic map. Data is already available.

Visual Representation Learning - Collaboration with TII
Radu-Casian Mihailescu
Dubai
This master’s thesis proposal explores the use of transfer learning to seamlessly and efficiently combine two (or more) different camera modalities. Concretely, given a dataset containing RGB images and a dataset containing the correlated thermal images, the goal is to train a representation embedding for the RGB images (e.g. training an encoder-decoder architecture and then only considering the encoder module), and a representation embedding for the thermal images. The latent spaces containing the RGB and thermal images are then linked together (see “ASIF: Coupled Data Turns Unimodal Model to Multimodal Without Training” [1]). The outcome of the proposal is to use these bridged latent spaces to link any previously unseen RGB image to a subset of previously unseen thermal images. The project includes collecting datasets from relevant sources, performing the necessary literature review, and using the aforementioned pipeline to provide simulated and experimental results with (simulated and/or) real data.
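The linking step can be sketched in the relative-representation style of ASIF [1]: each embedding is re-described by its similarities to a small set of paired anchor examples, after which RGB and thermal vectors live in a directly comparable space. All embeddings and anchors below are toy 2-D stand-ins for real encoder outputs:

```python
# Toy ASIF-style linking: re-express each embedding by its cosine
# similarity to paired anchors, then match across modalities in that
# shared relative space. Vectors are illustrative, not real encodings.
from math import sqrt

def cos(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def relative(v, anchors):
    """Represent v by its similarity to each anchor embedding."""
    return [cos(v, a) for a in anchors]

# Anchor i is the same scene seen by the RGB and the thermal encoder.
rgb_anchors = [[1.0, 0.0], [0.0, 1.0]]
thermal_anchors = [[0.0, 2.0], [2.0, 0.0]]

def link(rgb_query, thermal_candidates):
    """Index of the thermal candidate closest to the RGB query in the
    shared relative space."""
    q = relative(rgb_query, rgb_anchors)
    scores = [cos(q, relative(t, thermal_anchors)) for t in thermal_candidates]
    return max(range(len(scores)), key=scores.__getitem__)

best = link([0.9, 0.1], [[0.1, 1.9], [1.9, 0.2]])
```

No joint training is needed; only the anchor pairing carries the cross-modal correspondence, which is the point of the ASIF construction.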

Neural Depth Estimation - Collaboration with TII
Radu-Casian Mihailescu
Dubai
This master’s thesis proposal explores the use of a combination of analytical and deep learning-based solutions to obtain accurate monocular metric depth estimation from a sequence of frames. The research aims to address the need for a reliable and robust depth estimator for on-board integration on computationally constrained platforms for online metric depth estimation. The project includes collecting datasets from relevant sources, performing the necessary literature review, and using established computer vision and machine learning techniques to implement a working demo. The expected architecture will be a hybrid analytical/deep-learning depth estimation pipeline, combining traditional well-established techniques with state-of-the-art deep learning approaches to improve the performance and accuracy of metric depth estimation. More specifically, the first stage will be tasked with computing a sparse depth matrix D* using the inverse pinhole camera model with established sparse optical flow techniques (e.g. Kanade-Shi-Tomasi, or Deep Patch optical flow). The second stage will be a neural network (e.g. ViT, TCN) that takes as input a sequence of RGB-D* frames and outputs the current metric depth map. The final architecture will be tested in a fixed-wing drone scenario flying at ~200 metres of altitude.
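The first pipeline stage can be illustrated in a deliberately simplified setting: assuming pure sideways camera translation t_x between the two frames (so they behave like a rectified stereo pair), the inverse pinhole model gives depth Z = f * t_x / disparity for each tracked feature. Focal length, baseline and the tracked points below are all illustrative numbers:

```python
# Sparse metric depth from tracked features under an assumed pure
# sideways translation (two-view stereo special case of the pipeline).

def sparse_depth(tracks, f_px, t_x):
    """tracks: list of ((u1, v1), (u2, v2)) pixel matches between frames.
    Returns (u1, v1, Z) triples; points with ~zero disparity are dropped,
    since their depth is unobservable from this pair."""
    out = []
    for (u1, v1), (u2, v2) in tracks:
        disparity = u1 - u2
        if abs(disparity) > 1e-6:
            out.append((u1, v1, f_px * t_x / disparity))  # Z = f * t_x / d
    return out

# f = 800 px, camera moved 0.1 m sideways: a 4 px disparity gives 20 m depth.
pts = sparse_depth([((320.0, 240.0), (316.0, 240.0))], f_px=800.0, t_x=0.1)
```

The real system would estimate the full camera motion (not just t_x) and feed the resulting sparse D* into the learned second stage.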

Solar Photovoltaic Characterisation & Yield Prediction
Radu-Casian Mihailescu
Dubai
Renewable energy is generally capital-intensive relative to its cost of operation and maintenance. The cost of raising capital therefore has a significant impact on the Levelised Cost of Energy (LCOE). Higher confidence in yield prediction throughout the expected life of a renewable energy plant can therefore drive down the cost of energy. This project will train artificial neural networks to better predict energy yield in varying environmental conditions (irradiance, air mass, ambient and panel temperatures, etc.) and over long periods of time, to account for cleaning schedules and degradation rates.

Influencing User Behaviour to “Be Green”
Radu-Casian Mihailescu
Dubai
What initiatives and technologies can influence occupant behaviour to reduce energy consumption and carbon footprint, and improve overall sustainability? If technology is to bridge the gap between renewable energy resource and service level need, then what behaviour changes will help to narrow the gap? And can this be achieved without substantially reducing service standards? This project is likely to assess the service level needs of HWU Dubai staff and students and identify opportunities to “behave” more sustainably. The role of technology in informing sustainable choices and the potential to incentivise sustainable behaviour will also be assessed.

Explainable AI for Deep Learning Models: Better understanding of their decision-making processes
Radu-Casian Mihailescu
Dubai
Deep neural networks are very complex, and their decisions can be hard to interpret. In this project we want to use Explainable AI (XAI) to understand why a deep neural network makes a classification decision or prediction. XAI can be used to determine the importance of features of the input data, as a proxy for the importance of the features to the deep neural network. It has a wide range of applications across various domains where the transparency, interpretability, and accountability of AI systems are essential such as health, finance, or autonomous systems.

Federated Learning for IoT
Radu-Casian Mihailescu
Dubai
Federated Learning (FL) is a decentralised technique for training Machine Learning (ML) models that are distributed at the edge of the network. FL aims to enable multiple actors to build a common and robust ML model over local datasets (i.e., without sharing data). A number of FL frameworks exist, showing different characteristics. This project aims to conduct experiments and evaluate some of the most popular open-source FL frameworks against criteria like: performance (tasks completed per time unit, i.e. throughput), resource consumption (CPU, memory, GPU), convergence, deployment effort, flexibility, accuracy, and scalability.
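A useful common reference point when comparing such frameworks is the FedAvg aggregation rule most of them implement: a weighted average of client model parameters. A sketch over plain Python lists (standing in for weight tensors) might look like this:

```python
# FedAvg aggregation step: average client parameter vectors, weighted by
# each client's local dataset size. Lists stand in for weight tensors.

def fed_avg(client_weights, client_sizes):
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients with unequal data: the larger client dominates the average.
global_w = fed_avg([[1.0, 2.0], [3.0, 6.0]], [10, 30])
```

Where the frameworks differ is in everything around this step: client selection, communication, secure aggregation, and fault handling, which is what the evaluation criteria above probe.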

Interactive learning with LLMs
Radu-Casian Mihailescu
Dubai
The project aims to investigate the deployment of LLMs via efficient interactive learning strategies in order to support and enhance human learning environments. Sub-topics that will be investigated throughout the project include: - Adaptive learning experience, by providing personalised feedback to learners and adapting to different learning styles, question & answer scenarios, etc. - Educational content generation: investigate the capabilities of LLMs to produce (customised) material in the form of practice exercises and relevant content for specific domains The outcome of the project is to design and implement an interactive assistant/tutor and to provide an analysis of the potential use of LLMs in educational setups.

An investigation of Bias and Fairness in LLM generated content
Radu-Casian Mihailescu
Dubai
It has been well documented that LLMs have a propensity towards hallucinations, i.e. generating non-factual data, as well as towards bias in their output due to the nature of their training data and/or training procedures. The goal of this project is two-fold: - to develop methods and tools aimed at detecting the occurrence of biased output/hallucinations in LLMs - to propose and experiment with different techniques and methodologies designed to mitigate bias

Enhancing LLM reasoning by integrating formal planners and theorem provers
Radu-Casian Mihailescu
Dubai
On the one hand, LLMs are currently the state-of-the-art method for understanding and generating natural language; however, they are inherently limited for tasks that require rigorous formal reasoning or planning. On the other hand, tools such as theorem provers and formal planners are specialised in handling precise logical operations or sequential decision-making based on formal rules. The scope of this project is to combine the strengths of both approaches into a hybrid model that brings together LLMs with formal reasoning tools. The proposed approach will be evaluated in the context of complex problem-solving tasks.

Efficient training techniques for fine-tuning LLMs to domain-specific applications
Radu-Casian Mihailescu
Dubai
Fine-tuning LLMs involves adapting a pre-trained model for a specific downstream task, with applications to various domains such as medical, financial, educational, or legal. Several parameter-efficient methods have been proposed in order to reduce the computational resources required during training, such as: - Adapters: small neural network modules are inserted into each layer of a pre-trained model - Low-Rank Adaptation: training smaller low-rank matrices that are added to the existing weights - Prefix-tuning: adjusting the prefix embeddings without altering the model's weights - Sparse Fine-tuning: training only a subset of the model's parameters based on their relevance for a specific task The goal of the project is to conduct a comprehensive comparative study of the different approaches, and provide insights into the strengths and drawbacks of such methods in different contexts.
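Of the methods listed, Low-Rank Adaptation admits a compact illustration: the frozen weight matrix W is perturbed by a trainable product of two small matrices A and B, so only r*(d_in + d_out) parameters train instead of d_in*d_out. The pure-Python sketch below (toy shapes, illustrative scaling factor) shows only the forward pass, not the training loop:

```python
# Toy LoRA forward pass: y = x @ (W + alpha * (A @ B)), where A @ B is
# a rank-r update to the frozen pre-trained weight W.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_forward(x, W, A, B, alpha=1.0):
    delta = matmul(A, B)                       # (d_in x r) @ (r x d_out)
    W_eff = [[w + alpha * d for w, d in zip(wr, dr)]
             for wr, dr in zip(W, delta)]
    return matmul(x, W_eff)

# d_in = d_out = 2, rank r = 1: the adapter adds only 4 trainable numbers.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pre-trained weight
A = [[1.0], [1.0]]             # d_in x r
B = [[0.5, 0.5]]               # r x d_out
y = lora_forward([[2.0, 0.0]], W, A, B)
```

In a real implementation the update is applied per layer inside the model and merged into W after training, so inference cost is unchanged.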

Evaluating Computational Creativity in Large Language Models (LLMs)
Radu-Casian Mihailescu
Dubai
Over the past years, LLMs have demonstrated impressive performance across various Natural Language Processing tasks. However, evaluating the creativity of LLMs remains challenging due to the subjective nature of creativity itself, as well as the lack of clear-cut metrics and benchmark datasets. The aim of this project is to propose a framework for assessing the creative abilities of LLMs, drawing attention to the capabilities and limitations of a number of state-of-the-art LLMs.

Machine Learning for 3D Tooth Segmentation on Cone-beam Computed Tomography (CBCT) Image Data
Radu-Casian Mihailescu
Dubai
The aim of the study is to utilise Deep Learning image processing techniques on a unique CBCT dataset to evaluate adverse effects on the bone tissue surrounding the teeth following orthodontic treatment. In the study, AI-assisted interpretation of CBCT images and clinical data will be employed to identify biomarkers capable of predicting which patients are suitable for various orthodontic treatments and which are at increased risk of adverse effects and relapse after orthodontic treatment. The project involves collaboration with medical specialists. The outcome of the project involves developing an AI-driven tool for automated tooth segmentation on CBCT imaging.

IoT-based Healthcare Wearables
assisted technology smart living
Chit Su Mon
Malaysia
Creating an assisted living project involves careful research, thoughtful design, and effective implementation to address the needs of the population and improve their quality of life. Develop wearable devices that collect and transmit health-related data, such as heart rate and body temperature, to a central monitoring system.

AI driven Interactive Coding Tutor
ai interactive coding tutor
Chit Su Mon
Malaysia
Design a user interface using React or Vue.js, or integrate an existing code editor. The backend requires familiarity with AI models, databases and APIs. AI for code assistance: implement AI models to provide real-time code suggestions, auto-completion, and error correction (basics of coding). Interactive coding exercises: create a variety of coding exercises, integrate real-time error checking, and provide hints or suggestions for improvement.

Deep Learning-based Image Recognition for Medical Diagnostics
deep learning cnn medical diagnosing image recognition
Mahmoud Mousa
Dubai
This project focuses on leveraging the power of deep learning techniques for medical image recognition and diagnosis. The project will involve collecting or utilizing existing medical image datasets, preprocessing the images, and training a CNN architecture, for example, to learn meaningful representations and features from the images. The model will then be evaluated on a separate test set to assess its performance in detecting and classifying diseases or abnormalities accurately.

Proposing approximation algorithms for NP-complete problems such as Knapsack Problem.
optimal algorithms approximation algorithms np-complete knapsack problem optimisation
Mahmoud Mousa
Dubai
The goal of this project is to study an NP-complete problem, such as the Knapsack problem, and to propose algorithms that find optimal and approximate solutions for it. Optimal algorithms for NP-complete problems usually take a long time (possibly exponential in complexity) to generate optimal solutions, especially for hard problem instances. The aim is to design approximation algorithms that find near-optimal answers in polynomial time and to compare the performance of the suggested approximation algorithm(s) against the optimal one(s).
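As a concrete starting point, the classic greedy heuristic with a best-single-item fallback is a 1/2-approximation for 0/1 knapsack; a minimal sketch comparing it against an exact dynamic-programming baseline (the item values below are illustrative):

```python
def knapsack_greedy(items, capacity):
    """1/2-approximation: take items in decreasing value/weight order,
    then compare the result against the single best item that fits."""
    by_ratio = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total_v = total_w = 0
    for value, weight in by_ratio:
        if total_w + weight <= capacity:
            total_v += value
            total_w += weight
    best_single = max((v for v, w in items if w <= capacity), default=0)
    return max(total_v, best_single)

def knapsack_exact(items, capacity):
    """Exact DP over capacities: O(n * capacity), i.e. pseudo-polynomial."""
    dp = [0] * (capacity + 1)
    for value, weight in items:
        for c in range(capacity, weight - 1, -1):  # reverse: 0/1, not unbounded
            dp[c] = max(dp[c], dp[c - weight] + value)
    return dp[capacity]

items = [(60, 10), (100, 20), (120, 30)]  # (value, weight)
approx = knapsack_greedy(items, 50)
exact = knapsack_exact(items, 50)
print(approx, exact)  # 160 220 -- greedy is within a factor of 2
```

The gap between 160 and 220 is exactly the kind of measurement the comparative part of the project would systematise across harder instances.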

Building Trust and Reputation System using Blockchain and Deep learning Approaches
trust and reputation systems blockchain artificial intelligence smart contracts
Mahmoud Mousa
Dubai
The goal is to use the blockchain as a trustless platform to create a trust and reputation system that assesses several services based on questionnaires. These questionnaires will be filled in by different users. The aim is to calculate a digital trust value for each service, reflecting the users' satisfaction levels. You can use subjective logic or deep learning approaches, implemented in smart contracts on the blockchain, to extract the users' opinions, which update the trust and reputation system you build.
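As one illustration of the subjective-logic route, an opinion can be formed directly from counts of positive and negative questionnaire responses. This is a plain-Python sketch (on-chain, the same arithmetic would live in a smart contract); the prior weight W=2 and base rate 0.5 follow the usual subjective-logic convention and are assumptions here:

```python
def sl_opinion(positive, negative, prior=0.5, W=2):
    """Map positive/negative response counts to a subjective-logic opinion
    (belief, disbelief, uncertainty) and an expected trust value.
    W is the non-informative prior weight; prior is the base rate."""
    total = positive + negative + W
    belief = positive / total
    disbelief = negative / total
    uncertainty = W / total          # shrinks as more evidence accumulates
    expected = belief + prior * uncertainty
    return belief, disbelief, uncertainty, expected

# A service with 8 satisfied and 2 unsatisfied responses:
b, d, u, trust = sl_opinion(8, 2)
print(round(trust, 3))  # 0.75
```

Note how the uncertainty component lets a service with few ratings be distinguished from one with many ratings at the same satisfaction ratio.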

Sign language Recognizer for English language
sign language recognizer computer vision deep learning
Mahmoud Mousa
Dubai
You will need to collect videos of people signing and their corresponding text transcriptions, or use an open-source dataset from the web. You can then use this data to train a machine learning model to recognize sign language. This project will be carried out in two stages: the first stage uses images to train your model to recognize the sign represented by each picture. Next, you will need to consider video input, selecting frames, processing them, and outputting the recognized English description. Furthermore, the output description could be translated into speech using known APIs.

Handwritten Text Recognition
handwritten text recognition deep-learning image recognition cnn
Mahmoud Mousa
Dubai
Handwritten text recognition is a challenging task due to the complexity of the script and the diversity of handwriting. However, deep learning techniques have enabled significant progress recently. To develop an HTR model, researchers collect a large dataset of handwritten text images written in a specific language, together with the corresponding transcripts. You preprocess the images to improve accuracy and extract features from them. Then you train a deep learning model, such as a CNN or RNN, on the features. The trained model is evaluated on a test set to measure accuracy. Once developed, the HTR model can be deployed in real-world applications like mobile apps or web services to recognize handwritten text.

Generic Computer Vision Project
Mahmoud Mousa
Dubai
Problems similar to the following could be considered: - Emotion detection for masked/unmasked faces - Object recognition - Face recognition for masked faces

A comparative study on Signature Recognition
Mahmoud Mousa
Dubai
The problem involves analyzing a person's handwritten signature from a dataset to verify their identity. It could be solved using feature extraction, template creation, and matching approaches.

Machine Learning Based Malware Detection
Ali Muzaffar
Dubai

Android Application Analysis Tool
Ali Muzaffar
Dubai

Online Learning Based Malware Detection
Ali Muzaffar
Dubai
The project will be based on Online (stream) learning-based malware detection. Focus may be on Windows, Linux, or Android-based malware detection.

Customer Profiling and Sentiment Analysis for E-commerce customers
Ali Muzaffar
Dubai

LLM based malware detection
Ali Muzaffar
Dubai
The project will explore the use of LLM in malware detection. The focus can be on Windows, Android or Linux based malware detection.

Android application dataset exploration
Ali Muzaffar
Dubai

Faster LTL to Parity Automata Translation for Faster Rational Verification
formal verification model checking multiagent systems game theory
Muhammad Najib
Edinburgh
Rational verification is the problem of checking whether a given temporal logic formula ϕ is satisfied in some or all game-theoretic equilibria of a multi-agent system. EVE (Equilibrium Verification Environment) is a tool for rational verification. In this project, you will modify EVE (https://github.com/eve-mas/eve-parity) to work with a faster LTL to Parity Automata translator, e.g., Owl (https://owl.model.in.tum.de/).

Faster Parity Games Solver for Faster Rational Verification
formal verification model checking multiagent systems game theory
Muhammad Najib
Edinburgh
Rational verification is the problem of checking whether a given temporal logic formula ϕ is satisfied in some or all game-theoretic equilibria of a multi-agent system. EVE (Equilibrium Verification Environment) is a tool for rational verification. In this project, you will modify EVE (https://github.com/eve-mas/eve-parity) to work with a faster parity games solver, e.g., Oink (https://github.com/trolando/oink). EVE currently uses PGSolver (https://github.com/tcsprojects/pgsolver).

Rational Verification for Mean-Payoff Games
formal verification model checking multiagent systems game theory
Muhammad Najib
Edinburgh
Rational verification is the problem of checking whether a given temporal logic formula ϕ is satisfied in some or all game-theoretic equilibria of a multi-agent system. EVE (Equilibrium Verification Environment) is a tool for rational verification. Currently, EVE only supports games with LTL objectives. In this project, you will extend EVE to support games with mean-payoff objectives. In particular, you will: - Extend EVE (https://github.com/eve-mas/eve-parity) to multi-player mean-payoff games - Implement the algorithm in [1] to solve the relevant decision problems - Integrate a mean-payoff solver into EVE, e.g., https://github.com/romainbrenguier/MeanPayoffSolver Reference: [1] Gutierrez, Julian, et al. "On computational tractability for rational verification." International Joint Conference on Artificial Intelligence, 2019.

Game-theoretical Analysis of Multi-Agent Systems with PRISM-games
Muhammad Najib
Edinburgh
PRISM-games is an extension of PRISM, designed for the verification of probabilistic systems. These systems can incorporate either competitive or collaborative behaviour, modelled as stochastic multi-player games. In this project, you will explore a realistic scenario of a multi-agent system and model it using PRISM-games. Subsequently, you will conduct an analysis of its game-theoretical properties, such as Nash equilibria.

XAI for Explainable Rational Synthesis
Muhammad Najib
Edinburgh
EVE (https://github.com/eve-mas/eve-parity/) is a tool that can be used to synthesise strategies in multi-agent systems, which are modelled as concurrent games. These synthesised strategies can be viewed as conditional plans for the agents. In this project, you will employ approaches from explainable planning (XAIP) to elucidate the synthesised strategies, thereby enhancing their comprehensibility.

Experimenting quantum algorithms using IBM quantum devices
quantum computing qiskit
Kai Lin Ong
Malaysia
The student will have the opportunity to demonstrate mastery of the following: - Fundamental concepts of quantum computing and selected quantum algorithms - The quantum workflow of running IBM devices - Quantum circuit design for simulating quantum algorithms efficiently on IBM devices - Post-processing and result visualisation

Quantum walks and their applications
quantum walks search algorithms quantum computing
Kai Lin Ong
Malaysia
Random walks are stochastic processes defined on some mathematical state space, consisting of a sequence of steps described by independent, identically distributed random variables. Numerous algorithms based on classical random walks have been developed, each with its own areas of importance, contributing to a variety of applications in computing and technology. Quantum walks are quantum analogues of classical random walks. The core difference is that, unlike classical random walks, where randomness arises from the transition probability between different states, quantum walks exhibit randomness via quantum mechanical properties such as superposition and the measurement postulate. This gives them advantages over their classical counterparts, for instance exponentially faster hitting times. This project is devoted to studying quantum walks comprehensively, together with their applications in a selected field, supported by simulations using Qiskit (Python) or other suitable tools. Some well-known applications are in search algorithms and quantum cryptography.
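The qualitative difference between the two walks is easy to see numerically. Below is a dependency-free numpy simulation of the discrete-time Hadamard walk on a line (a Qiskit circuit would encode the same dynamics; numpy keeps the sketch self-contained). The step count is an arbitrary choice:

```python
import numpy as np

steps = 50
n = 2 * steps + 1                      # positions -steps .. +steps
psi = np.zeros((n, 2), dtype=complex)  # amplitude per (position, coin)
psi[steps, 0] = 1.0                    # walker starts at the origin, coin |0>

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard coin operator

for _ in range(steps):
    psi = psi @ H.T                    # coin flip at every position
    shifted = np.zeros_like(psi)
    shifted[1:, 0] = psi[:-1, 0]       # coin |0>: move right
    shifted[:-1, 1] = psi[1:, 1]       # coin |1>: move left
    psi = shifted

positions = np.arange(n) - steps
prob = (np.abs(psi) ** 2).sum(axis=1)  # measurement distribution
mean = (prob * positions).sum()
std = np.sqrt((prob * positions ** 2).sum() - mean ** 2)
print(round(std, 2))
```

A classical random walk's standard deviation grows like sqrt(steps) (about 7 here), while the quantum walk spreads ballistically, roughly linearly in the number of steps; the printed value makes that contrast visible.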

Artificial Immune Systems
bio-inspired computing machine learning deep learning
Wei Pang
Edinburgh
The human immune system can effectively protect us from viruses and other invaders. Algorithms inspired by this have been developed: the so-called artificial immune systems (AIS). You are free to explore any ideas or existing immune-inspired algorithms to solve any problems you like. Previous projects have used AIS for the following problems (you can do something similar but not the same, or select any problems that interest you): 1. Searching for the best neural network architecture 2. Playing games 3. Making machine learning fair 4. Anomaly detection 5. Cybersecurity

Fair Machine Learning
machine learning trustworthy ai fairness responsible ai
Wei Pang
Edinburgh
Machine Learning may make unfair decisions, for example, it may have bias toward ethnic minorities and/or vulnerable groups. In this research, we would like to investigate how to mitigate the discrimination of machine learning algorithms.

Machine Learning related project
machine learning real-world applications
Wei Pang
Edinburgh
AI and ML-related project.

Population-based explainable Machine Learning
xai explainable ai
Wei Pang
Edinburgh
We would like to improve LIME (https://github.com/marcotcr/lime), a framework for explaining any classifier, using population-based approaches such as swarm and genetic algorithms. You can also explore ways of improving other explainable AI approaches through the use of population-based approaches.

Swarm Intelligence for Unsupervised Learning
Wei Pang
Edinburgh
You can propose any novel ideas on using swarm intelligence to perform data clustering, including clustering mixed data, dynamic data, or networked data. Examples: 1. J. Ji, W. Pang, Z. Li, F. He, G. Feng and X. Zhao, "Clustering Mixed Numeric and Categorical Data With Cuckoo Search," in IEEE Access, vol. 8, pp. 30988-31003, 2020, doi: 10.1109/ACCESS.2020.2973216. https://ieeexplore.ieee.org/document/8993805 2. Albalawi H., Pang W., Coghill G.M. (2020) Swarm Inspired Approaches for K-prototypes Clustering, Advances in Computational Intelligence Systems. UKCI 2019. Advances in Intelligent Systems and Computing, vol 1043. Springer, Cham. https://doi.org/10.1007/978-3-030-29933-0_17 https://link.springer.com/chapter/10.1007%2F978-3-030-29933-0_17

Machine Learning for Understanding Evolution of Topics and Public Attentions
Wei Pang
Edinburgh
We try to understand topics and public concerns over time from news media or other related documents (such as patent data) and gain insights into the evolution of topics and public attention on the circular economy or green innovations. So we need to use machine learning models (e.g. dynamic topic modeling and dynamic clustering) to understand how the public's opinions are changing over time. Specifically, this is related to an EPSRC project DCEE (https://dcee.org.uk/) with Imperial and Loughborough (https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=EP/V042432/1). One task is to understand public perception of the electrochemical circular economy through social media or large amounts of online texts. For example, what are people's views and concerns on sustainable chemical products?

Immune-inspired Algorithm for robust and secure machine learning
machine learning deep learning artificial immune systems robust ml safe ml
Wei Pang
Edinburgh
Have you heard of the one-pixel attack? https://arxiv.org/abs/1710.08864 It can fool advanced deep learning algorithms by changing only one pixel of an image. Recently, sparse attacks have become more and more popular: https://openaccess.thecvf.com/content/CVPR2023/papers/Williams_Black-Box_Sparse_Adversarial_Attack_via_Multi-Objective_Optimisation_CVPR_2023_paper.pdf In this project, we will use immune-inspired algorithms to develop attack methods that can fool existing deep learning algorithms, raising awareness of security and safety issues in machine learning. We will also develop protection methods to defend machine learning algorithms against potential attacks; these will likewise be inspired by immune systems or evolutionary algorithms.
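To see why a single pixel can suffice, here is a toy sketch of the attack idea. The "classifier" is a random linear model over a flattened 8x8 image (a stand-in assumption, not a trained network), and the published attack's differential-evolution search over valid pixel values is replaced by a brute-force search with unconstrained values to keep the sketch tiny:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in black-box classifier: two class scores from a fixed random
# linear model over the flattened image. The attack only ever calls
# predict(), so it would work unchanged against a real network.
W = rng.standard_normal((2, 64))

def predict(img):
    return int(np.argmax(W @ img.ravel()))

img = rng.random((8, 8))
original = predict(img)

# One-pixel attack by exhaustive search: push each single pixel to an
# extreme value and keep the first change that flips the predicted label.
adversarial = None
for k in range(64):
    for new_val in (-100.0, 100.0):
        candidate = img.copy()
        candidate[k // 8, k % 8] = new_val
        if predict(candidate) != original:
            adversarial = candidate
            break
    if adversarial is not None:
        break

flipped = adversarial is not None
print(flipped)
```

An immune-inspired algorithm would replace the brute-force loop with a population of candidate perturbations evolved under clonal selection, but the fitness signal (did the label flip?) is the same.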

Swarm Intelligence and Its applications
Wei Pang
Edinburgh
This is an open research question; you can use swarm intelligence to solve any problems that interest you.

Graph Diffusion Models
graph neural networks diffusion models molecular property prediction
Wei Pang
Edinburgh
This project will investigate cutting-edge diffusion models on graphs and their applications to molecular property prediction, graph classification, etc.

Study of co-existence and handover issues for cellular-connected drones
Andres Barajas Paz
Dubai
The goal of this project is to study, understand and validate the research issues of cellular-connected drones with respect to handover and interference when supported by the existing terrestrial cellular infrastructure. The students will be involved in reading research papers, analysing mathematical models of interference for drones, estimating handover criteria, and using existing real-time frameworks to validate the research issues.

Attitudes towards artificial intelligence in society
Ron Petrick
Edinburgh
Investigate current attitudes towards artificial intelligence (AI) in different parts of society, including the general public, media, and AI practitioners. This project will involve a survey of recent research literature on AI trends as well as the portrayal of AI in the popular media. Building on recent studies in other parts of the world, interviews will be conducted with members of the public to understand concerns about recent trends towards wider deployment of AI. This project may also involve the creation of small AI artefacts (e.g., such as chatbots) to help guide the study.

Combined machine learning and automated planning
Ron Petrick
Edinburgh
This project will explore the use of modern machine learning techniques (e.g., deep learning, reinforcement learning, etc.) for different problems in automated planning. While automated planners are good at making goal-directed plans of action under many challenging conditions, the addition of machine learning tools to the process could lead to optimisations in terms of more efficient planning or higher quality plans. Also, some symbolic aspects of the planning problem (e.g., action specification in PDDL) could be learnt by using machine learning techniques. These techniques will be applied in planning scenarios such as robot control or human-machine interaction.

Large language models for automated planning
Ron Petrick
Edinburgh
This project will explore the use of large language models (LLMs, e.g., ChatGPT) for different problems in automated planning. While automated planners are good at making goal-directed plans of action under many challenging conditions, the addition of large language models could help automate parts of the planning process in terms of more efficient planning or higher quality plans by providing commonsense guidance. Some symbolic aspects of the planning problem (e.g., action specification in PDDL) could be built using such techniques. These techniques will be applied in planning scenarios such as robot control or human-machine interaction.

Creativity in artificial intelligence
Ron Petrick
Edinburgh
This project will explore the state of the art in generating creative content (e.g., artwork, poetry, prose, music, etc.) using AI techniques. The project will choose a subset of mediums and attempt to generate new creative content by implementing generative AI techniques (e.g., machine learning). An evaluation will be performed to compare AI-generated content against traditional human-generated content.

Plan-based artificial intelligence and games
Ron Petrick
Edinburgh
This project will explore the use of artificial intelligence techniques in games, with a particular focus on automated planning and related approaches. The student will survey the state-of-the-art in some aspect of AI game playing and focus on applying new AI algorithms to a game environment. Such techniques could be used to control artificial game players or automate particular aspects of the game environment (e.g., board layout, puzzle creation, move suggestion, etc.). This project will involve a significant amount of software development for AI techniques and the game environment.

Plan-based explainability in artificial intelligence systems
Ron Petrick
Edinburgh
This project will explore the problem of explainability in AI systems (XAI) using automated planning tools (XAIP). Automated planners provide a causal model of states, actions, and plans which will serve as the underlying framework for explaining agent behaviour in particular circumstances. New approaches to plan explainability will be explored and implemented using existing planners that may be augmented and tested using representative planning domains.

Human-in-the-loop automated planning
Ron Petrick
Edinburgh
Automated planning systems are good at generating goal-based plans of action but typically require a fixed model. Human-in-the-loop planning enables a user to introduce constraints into the planning process which affects the plans that are generated. This project will build a suitable interface for an off-the-shelf automated planner to enable certain types of constraints to be specified by the user (e.g., preferences, temporal, goal ordering constraints) and control the planning/replanning process.

Simulation and visualisation environments for planning
Ron Petrick
Edinburgh
This project will explore the problem of producing or extending simulation or visualisation environments for automated planning. Automated planners provide a causal model of states, actions, and plans which will serve as the underlying framework for the environment to be simulated/visualised. This project may explore new approaches to plan simulation/visualisation or extend existing systems (e.g., PDSim). Testing will be done using representative planning domains.

A Security Framework for BitTorrent
Hani Ragab
Dubai
BitTorrent is the de-facto peer-to-peer (P2P) standard. Unfortunately, its lack of security has led to it being mainly used for illegally exchanging gray content (e.g. copyrighted material). The objective of this project is to add authentication, authorisation, confidentiality and data integrity to BitTorrent. MonoTorrent, which has a much simpler source code than Vuze, will be used for the implementation.

AI for Video Games
Hani Ragab
Dubai
To be discussed face-to-face. You can get inspired by reading Saarah's dissertation, which was ranked second at an international British Computer Society competition http://www.macs.hw.ac.uk/cs/project-system/projectdata/archive/2019/ugcse/sw36_full_text.pdf

AI-based Dominoes Player
Hani Ragab
Dubai
The project covers a few topics, including: - Possible use of deep reinforcement learning to control the player - Potential to interface with the real world, e.g., by acquiring images of the tiles in a player's hand and the dominoes already played, and feeding them into the AI.

Anti-Malware Software
Hani Ragab
Dubai
Existing anti-malware products struggle to keep up with the hundreds of thousands of malware samples that appear on a daily basis. Previous research at our University has obtained interesting results by applying machine learning techniques. In this project, you will design and build a prototype of an anti-malware tool that uses those results. The anti-malware can be built either by writing it from scratch, or by adding rules/signatures to an existing anti-malware product (e.g. ClamAV). Arjun Rajeev implemented a first functional version of this software. You will add features and possibly re-implement some.

Attendance Monitoring System
Hani Ragab
Dubai
The objective of this project is to provide an easy-to-deploy, secure and scalable student attendance monitoring system. The new system should be able to automatically collect student attendance information (e.g. by using NFC, fingerprint readers, or face recognition) without (major) intervention from the lecturer. The lecturer will still have a fail-over interface where they can enter/edit attendance information. The system will be able to generate automatic attendance notification emails to the school administration office (e.g. when a student misses 3 lectures in a row).

Blockchain and Bitcoin Applications
Hani Ragab
Dubai
Blockchain can be defined as an online distributed ledger, with the possibility to store virtually any information on it. Cryptocurrencies, such as Bitcoin, are based on blockchains. We aim to investigate the use of blockchains in different sectors of activity, e.g.: - Medical records and other health-related applications - Real-estate property management (and similar applications, such as car ownership management) - Data Science students might want to investigate available datasets about blockchains and the different cryptocurrencies (e.g. Bitcoin) - Surprise me!

Building Ethical Hacking Tools (with AI?)
Hani Ragab
Dubai
The ethical hacking tool could perform one or more attack types, including attacks on TCP/IP and common networking protocols, websites, Windows, Android, Ubuntu and any other software/system. It could also use any technique, such as (D)DoS, injection (e.g., SQL), overflow (e.g., buffer and heap), etc. The tool could serve any purpose related to ethical hacking, including reconnaissance (e.g., identifying open ports and available services), deceiving end-users for all sorts of social engineering, building exploits and using them, or maintaining access to a system (e.g., through backdoors, remote admin tools, steganography). It would be interesting to use AI to automate or improve the hacking tool in general. For example, an AI system could be used to automatically find vulnerabilities, generate obfuscated malware, or determine which attack to carry out depending on the target. I am also interested in upgrading existing tools with new capabilities; most of them are open source and available on GitHub. Potential challenges (depending on the attack and target): parallelisation of the attack (e.g., port enumeration), training machine learning models, identifying vulnerabilities. Notes: - Multiple students could work on different hacking tools in parallel. - Programming languages: C, Python, Ruby, assembly, ... (but not Java!) - The project's exact level of difficulty will depend on the agreed aim and objectives.
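The reconnaissance side mentioned above (identifying open ports) can be sketched in a few lines of standard-library Python. To stay within ethical bounds, the demo scans only a listener it opens itself on localhost; scanning hosts you are not authorised to test is off-limits:

```python
import socket

def scan_ports(host, ports, timeout=0.2):
    """Return the subset of `ports` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports

# Demo against ourselves: open a listener on an ephemeral port, then find it.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.listen(1)
target_port = server.getsockname()[1]
found = scan_ports("127.0.0.1", [target_port, 1])  # port 1 is almost surely closed
server.close()
print(found)
```

The parallelisation challenge noted in the proposal would replace the sequential loop with a thread pool or asyncio, since each probe spends most of its time waiting on the network.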

Data Synchronisation using BitTorrent
Hani Ragab
Dubai
The objective of this project is to produce a file synchronisation system based on BitTorrent. The system will provide functionality for saving and retrieving files, as well as synchronising them across the devices of the files' owners.

Design and Implementation of an Authorisation (Access Control) System
Hani Ragab
Dubai
There are two types of certificates: X.509 and PGP. Public key certificates are used to identify users. Privilege certificates, on the other hand, are used to define access rights and privileges. X.509 PMI is an example implementation of privilege certificates. Privilege certificates can reliably be used in authorisation systems to grant/deny access to resources. Permis (https://en.wikipedia.org/wiki/PERMIS) is an example of an authorisation system. Our objective is to create an authorisation system. Implementation: Programming language: C, C++ or Python. OS: Linux (preferably CentOS), Unix. F21CN Computer Network Security (or an equivalent course) is a pre-requisite for MSc students and a co-requisite for Hons students.

Exoplanet Detection
Hani Ragab
Dubai
This project builds on previous projects I supervised, using several ML techniques (mostly deep learning) to detect exoplanets.

Exoplanet detection using machine learning
Hani Ragab
Dubai
Exoplanets are planets that orbit other stars. 3000+ exoplanets have been detected so far, and there are more to come! The most successful exoplanet detection technique so far (with 2000+ detected planets) is transit detection. This technique looks for drops in the light intensity of a star when one of its planets passes between the star and observers on Earth. The challenge here is that there are 150K+ stars being observed, which is too many to be processed by humans. Machine learning can be applied to efficiently detect such drops in a star's brightness without human intervention. This project builds on a previous MSc project done under my supervision in 2016/17.
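The shape of the problem can be seen on synthetic data: a light curve is a noisy, nearly constant flux series, and a transit is a small periodic box-shaped dip. This sketch fabricates such a curve (all parameters are illustrative) and flags dips with a robust threshold; a real pipeline would feed windows of the curve into a trained classifier instead:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic light curve: constant stellar flux plus Gaussian noise, with a
# 1%-deep, 5-sample box dip every 100 samples (the transit signature).
n, period, duration, depth = 1000, 100, 5, 0.01
flux = 1.0 + rng.normal(0.0, 0.001, n)
for start in range(20, n, period):
    flux[start:start + duration] -= depth

# Robust dip detection: flag samples far below the median. The MAD is used
# instead of the standard deviation so the scatter estimate is not itself
# skewed by the transits.
median = np.median(flux)
mad = np.median(np.abs(flux - median))
in_transit = flux < median - 8 * mad
print(in_transit.sum())   # ~50 flagged samples: 5 per transit, 10 transits
```

The ML angle of the project starts exactly where this heuristic ends: shallow or grazing transits sit inside the noise band, and that is where learned classifiers outperform fixed thresholds.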

Fake News Detection
Hani Ragab
Dubai
To be discussed in a face-to-face meeting.

Machine Learning API
Hani Ragab
Dubai
Several machine learning algorithms exist and can be used, e.g., to predict the value of an output based on inputs. The input is a matrix. Sparse matrices are matrices whose elements are mostly zeroes. Not taking that fact into account results in sub-optimal manipulation of the matrix and a waste of CPU time, RAM and storage. Our objective is to: - Build a library that implements one or more feature selection mechanisms - Implement one or more machine learning algorithms - (Optionally) parallelise computations - (Optionally) integrate our library into R. Programming Language: - C, C++ - (Optionally) Assembly - Certainly not Java :) Some Wikipedia reading: - Sparse matrices: https://en.wikipedia.org/wiki/Sparse_matrix - Feature selection: https://en.wikipedia.org/wiki/Feature_selection - Machine learning: https://en.wikipedia.org/wiki/Machine_learning The project can be taken by a group of students, where each will work on a particular component of the API.
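To illustrate why exploiting sparsity matters, here is a minimal sketch of the CSR (compressed sparse row) layout and a sparse matrix-vector product. It is written in Python purely to show the data structure; the project itself targets C/C++, where the same three arrays map directly onto plain buffers:

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR: only nonzeros are stored.
    `indptr[r]:indptr[r+1]` delimits row r's entries in data/indices."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr

def csr_matvec(data, indices, indptr, x):
    """y = A @ x touching only the stored nonzeros: O(nnz), not O(rows*cols)."""
    y = []
    for r in range(len(indptr) - 1):
        y.append(sum(data[k] * x[indices[k]]
                     for k in range(indptr[r], indptr[r + 1])))
    return y

dense = [[0, 0, 3],
         [1, 0, 0],
         [0, 2, 0]]
data, indices, indptr = to_csr(dense)
print(csr_matvec(data, indices, indptr, [1, 1, 1]))  # [3, 1, 2]
```

In the C/C++ library, `data`, `indices` and `indptr` become three contiguous arrays, which is also the layout used by standard sparse BLAS routines.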

Machine Learning for Android Malware Analysis
Hani Ragab
Dubai
This project will review existing works for android malware detection. It will then investigate how to apply machine learning techniques to it. This will include identifying possible features that can characterise malware (e.g. subsets of binary code) then applying suitable techniques to them. The following books are in the library, you might want to have a look at them beforehand: - https://www.nostarch.com/malware, - https://www.nostarch.com/androidsecurity, - https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python-second-edition

Machine Learning for Linux Malware Detection
Hani Ragab
Dubai
This project will review existing works for Linux malware detection. It will then investigate how to apply machine learning techniques to it. This will include identifying possible features that can characterise malware (e.g. subsets of binary code) then applying suitable techniques to them. The following books are in the library, you might want to have a look at them beforehand: - https://www.nostarch.com/malware, - https://www.packtpub.com/networking-and-servers/learning-linux-binary-analysis, - https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python-second-edition

Machine Learning for Windows Malware Analysis
Hani Ragab
Dubai
This project will review several existing machine learning-based malware analysis approaches and critically appraise them. A comparative study will allow us to draw conclusions on the suitability of the different techniques for malware analysis. The following books are in the library; you might want to have a look at them beforehand: - https://www.nostarch.com/malware, - https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python-second-edition

Machine Learning/Data Mining Applications
Hani Ragab
Dubai
The project will be about a dataset(s) of your choice (e.g. from your work, kaggle.com) with an objective to uncover hidden patterns in it. Specific details to be discussed.

Pandemic Data Analysis
Hani Ragab
Dubai
Details to be discussed.

Reputation Systems
Hani Ragab
Dubai
Reputation systems are used to assign, share and maintain a score for entities in a system. Those entities can be files or users, for example. Reputation systems are commonplace, and can be found on commercial websites, such as souq.com or eBay, and in distributed systems such as BitTorrent. In the case of BitTorrent (or other P2P networks), scores can be assigned to either: 1- Users: e.g. a score of 0 means the user shares malicious content (for security applications), or is a free-rider, i.e. someone with selfish behaviour (for QoS) 2- Files: e.g. a score of 0 means the file contains malware (for security applications), or has a misleading title (for QoS). We have published several papers in the field and have supervised several projects on this topic; these can be shared with interested students on demand.

Self-driving Drones/Cars
Hani Ragab
Dubai
Required hardware will be provided. Details of the project to be discussed in a face-to-face meeting.

Smart Homes
Hani Ragab
Dubai
Several topics are possible here. Just as an example: enabling mobility of users, at both intra-house and inter-house levels. The location of the user inside the house would allow the system to adjust the house's parameters to the user. For example, when the user moves from room 1 to room 2 (intra-house movement), the music in room 1 is switched off and switched on in room 2. When the user visits their neighbours (inter-house movement), non-confidential parts of their profile can be automatically uploaded to the destination to personalise environment parameters, e.g. A/C temperature (if the host's policy allows it).

The next communication tool for Academia
Hani Ragab
Dubai
While existing communication tools such as Microsoft tools, Collaborate and Zoom can be used to carry out academic tasks such as teaching, supervision meetings and academic boards, they all present some limitations and/or constraints. These include price and licensing limitations, privacy issues, and the inability to personalise them to the local context. This project aims to create, from FOSS components, an alternative to the above tools that can be used by academic institutions. Furthermore, Covid-19 pushed several academic institutions, each with its own IT infrastructure, to make a sudden move to online delivery, and many struggled to find an adequate tool, or at least a product they could customise. We will provide such a tool. In addition to the obvious ability to establish audio and video calls, interesting features, which might require more than one student project, include: - Ability to share documents, a whiteboard or the entire screen with participants - Integration with the organisation, e.g. Active Directory or the Learning Management System (LMS) - Role management: presenters, attendees, moderators, breakout managers, etc. - Creation and management of breakout rooms - Integration with (open) calendar software - Possibility for individual organisations to host their own instances of our software - Surprise me! By doing this project, you agree to the following: the software is to be released under the GNU Affero General Public License (https://www.gnu.org/licenses/agpl-3.0.html). Changes to the choice of license must be agreed with Dr. Hani Ragab Hassen.

[New] Cybersecurity in Robotics
Hani Ragab
Dubai
There are several cybersecurity challenges in robotics, and we can work on one of them. Secure communication: communication could be either with other robots (e.g., in a swarm scenario) or with controllers or base stations; compromising those communications can lead to disasters. Intrusion Detection Systems (IDS): traditional IDSs are designed for standard TCP/IP networks and expect a different type of traffic; developing suitable IDSs using AI would add an excellent layer of protection. Data privacy and ethics: robots can get access to sensitive data, especially those used in medical contexts or for personal assistance; it is crucial that this data is only accessible to authorised individuals and according to a pre-defined policy. AI and machine learning security: many robots are controlled (partially or entirely) by AI systems; it is important to protect those systems against adversarial attacks that could manipulate the AI controller of a robot.

Socially assistive robots - digitalising cognitive tests
Marta Romeo
Edinburgh
Socially Assistive Robots (SARs) represent a promising technology to provide assistance outside of clinical and controlled environments. In particular, their deployment in the house of prospective users can provide effective opportunities for early detection of Mild Cognitive Impairment, a condition of increasing impact in our ageing society, by means of digitalised cognitive tests. The aim of this project is to turn classical paper and pencil tests into SAR-based applications. The final outcome will be a ROS-based application that integrates a speech module, a graphical user interface and a dialogue manager.

Wizard-of-Oz for robotic experiments
Marta Romeo
Edinburgh
In most human-robot interaction experiments, the robot is not fully autonomous. In these cases, the investigator becomes the puppeteer, or Wizard, and controls the robot without letting the participants in the experiment realise that the robot is merely remote-controlled. The wizard can control all the functionalities of the robot, or take over from the robot only in specific circumstances. This is usually done for safety reasons, or because, when the variables under observation are all human-dependent, it is much more important that the experiment carries on without any interruption caused by a robot malfunction. To remote-control the robot, the wizard needs a full picture of what is happening in the experiment room: the flow of interaction between the robot and the participant, the inner state of the robot, the possible next action to take, etc. The aim of the project will be to create a modular wizard interface, possibly integrating ROS, that could easily be used under different experimental conditions.

Projects on human-robot interaction, ethics and trust
Marta Romeo
Edinburgh
Contact me if you want to discuss a possible project on human-robot-interaction; trust in human-robot rapport; ethical implications of robotics and AI applications.

The role of expectations in the face of robots’ failures
Marta Romeo
Edinburgh
Adoption of robotic solutions is still a challenging problem. This is driven by the fact that the general public is exposed to robots and their applications mainly through the media, which often exaggerate the capabilities of the platforms. This is even more true of social robots, which are expected to be our companions and everyday helpers. What happens when robots fail? Most users get discouraged and dismiss the technology. This project will look at how the perceived gravity of a robot failure changes with respect to the user's level of knowledge about the robot's capabilities. This will be tested with a user study. Within the project you will have to: 1) define the different levels of information to give to participants before the study begins; 2) develop a task for a social robot (platform to be decided as part of the study design) to complete together with the participants; 3) program the robot to fail within the task; 4) collect and analyse data on participants' perception of the robot following the interaction.

Trust modelling for human-robot interaction
Marta Romeo
Edinburgh
Trust is essential for successful human-human interactions and plays a major role in human-robot interactions, influencing the human's willingness to accept information from a robot and to cooperate with it. Modelling trust could therefore be a vehicle for building more intelligent social robots. For this reason, much work has gone into identifying the factors defining trust and a computational model that could encapsulate the concept. Bayesian models and reinforcement learning have shown promise in this respect. Although trust evolves as the interaction evolves, many works do not fully take time into consideration. The objective of this project will be to dive into the literature on trust modelling to develop and test (through simulations) a cognitive architecture for the evolution of trust in human-robot interaction.
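To give a flavour of the Bayesian approach with time factored in, trust can be tracked as discounted evidence counts. The decay constant and update rule below are illustrative assumptions of mine, not a model from the literature:

```python
def update_trust(successes: float, failures: float, outcome: bool,
                 decay: float = 0.9):
    """Discount old evidence so trust tracks recent robot behaviour, then
    fold in the latest interaction outcome. Returns updated counts + trust."""
    successes, failures = successes * decay, failures * decay
    if outcome:
        successes += 1.0
    else:
        failures += 1.0
    trust = (successes + 1.0) / (successes + failures + 2.0)  # beta-mean score
    return successes, failures, trust

# Simulate a robot that succeeds five times, then fails twice
s = f = 0.0
for outcome in [True] * 5 + [False] * 2:
    s, f, trust = update_trust(s, f, outcome)
```

Because of the decay, recent failures pull trust down faster than old successes can hold it up - exactly the temporal dynamic the project would investigate properly.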

Anything contributing to a free/open source software project
open source free software foss
Adam Sampson
Edinburgh
I have 20+ years experience in running and contributing to free/open source software projects. If you have an idea for a project in any area that will make a substantial code contribution to an existing FOSS project, I'd potentially be interested in supervising it - I can advise on FOSS licensing, working with a FOSS community, tools and techniques normally used in FOSS development, and so on. (I'm not interested in projects related to generative AI or machine learning - please speak to other supervisors for those.)

Anything related to analogue video decoding
video signal processing dsp open source history
Adam Sampson
Edinburgh
I work on the ld-decode FOSS project, which captures high-quality digital video from analogue video sources such as LaserDisc and videotape. The ld-chroma-decoder tool is part of this: it extracts colour information from the video, converting the original "composite" signal into an RGB representation that can be shown on modern displays. This is a complex and computationally-expensive signal processing task. There are various ways ld-chroma-decoder could be improved: you could speed it up by making parts of it run on the GPU, or you could improve the PAL decoder by training it more effectively, or you could add support for the French SECAM video standard, or or or... there are more ideas on the ld-decode Wiki. I'd be interested in supervising anything related to this project; if working with analogue video sounds interesting to you then please talk to me.

Anything related to interactive fiction games
games interactive fiction modelling
Adam Sampson
Edinburgh
Interactive fiction is one of the oldest genres of computer game. These days it includes text and graphic adventures, hypertext games, visual novels and other forms of game based on storytelling within a modelled world. You can build games from scratch in this genre (you might even have built a text adventure as a beginner's programming task at some point), but most creators use game engines such as Inform, Twine and RenPy. I'm interested in supervising projects that want to build or analyse this kind of game, to work on tools for creators to use, or to resurrect classic games through emulation or reimplementation. Please talk to me if you'd like to work in this space.

Automatic RAID bit flip correction
linux kernel raid storage
Adam Sampson
Edinburgh
Modern hard disks and SSDs corrupt data at a fairly predictable rate. Integrity-checking filesystems and software RAID schemes protect against this by using various checksumming approaches to detect corrupt blocks; if a corrupt block is detected, it must be obtained from another copy of the data. However, since the most common type of corruption is a single bit being flipped, it should also be possible to try to repair a corrupt block by flipping individual bits and seeing whether the checksum is correct. Implement and evaluate this scheme inside Linux software RAID or btrfs.
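The core search is simple enough to sketch in a few lines. Here CRC-32 stands in for whatever checksum the filesystem or RAID layer actually uses - the in-kernel implementation would of course differ:

```python
import zlib

def repair_single_bitflip(block: bytearray, expected_crc: int):
    """Try flipping each bit in turn; return the repaired block if its
    CRC-32 then matches, or None if no single flip explains the corruption."""
    for i in range(len(block) * 8):
        block[i // 8] ^= 1 << (i % 8)           # flip one bit
        if zlib.crc32(bytes(block)) == expected_crc:
            return bytes(block)                  # repaired
        block[i // 8] ^= 1 << (i % 8)            # undo and try the next bit
    return None

good = b"some block contents"
crc = zlib.crc32(good)
corrupt = bytearray(good)
corrupt[3] ^= 0x10                               # simulate one flipped bit
repaired = repair_single_bitflip(corrupt, crc)
```

The brute-force version recomputes the checksum once per bit; for CRCs, their linearity should allow precomputing the effect of each flip, which would be part of the evaluation.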

Better folk music metadata for MusicBrainz
music metadata audio analysis
Adam Sampson
Edinburgh
If you don't like folk music, this is not the project for you. MusicBrainz is an openly-licensed public database of metadata about recorded music - it's used by the BBC, for example, to provide various kinds of information about music on their web site. I'd like it to have better metadata for traditional/folk music - for example, cross-referencing songs to catalogues such as the Roud catalogue (songs) or thesession.org (tunes). Some of this information has been added manually, but you should be able to identify likely recordings of particular pieces based on their names and performers... and perhaps based on audio fingerprinting? (I am not interested in generative AI approaches to this project, but data mining may be worth investigating.)
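For the name-based matching, a crude string-similarity baseline - nothing MusicBrainz-specific, and deliberately naive - might look like this:

```python
import difflib

def title_similarity(a: str, b: str) -> float:
    """Rough case-insensitive similarity between two titles -- a naive
    baseline before trying anything smarter (tokenising, performers...)."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(title: str, catalogue: list) -> str:
    """Catalogue entry most similar to the given recording title."""
    return max(catalogue, key=lambda entry: title_similarity(title, entry))
```

A real matcher would need to handle "The"-inversion, spelling variants and alternative titles, which is where the interesting data-mining work starts.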

Certificate-encoding names for TLS web sites
tls security cryptography
Adam Sampson
Edinburgh
"Ugly names" for TLS web sites. As an alternative to traditional CA infrastructure, encode cryptographic identifiers in DNS names as a mechanism for verifying certificates. This is how Tor hidden services work already - you end up with a long, awkward name, but you are no longer dependent on a fragile, expensive (and often corrupt/fraudulent) certificate authority. Implement this in OpenSSL or Firefox. This is a complex and technically challenging project, and you shouldn't choose it unless you've got some understanding of cryptography already.

Customisable pointer decorator syntax for C++
c++ syntax compiler
Adam Sampson
Edinburgh
Programming in modern C++ is made considerably safer by the existence of smart pointer classes such as std::shared_ptr; in most cases, these can be used as drop-in replacements for C-style pointers, avoiding many of the security and correctness faults common with pointer use. However, these are library features - there's no affordance in the language to make using them more convenient. Add support to a C++ compiler for a more convenient syntax for smart pointers, like the existing syntaxes for C-style pointers (*) and C++ references (&). For example, you might allow a shared_ptr argument to be written as "Foo% ptr". You'd need to design an appropriate syntax, modify a compiler to understand it, and evaluate whether it measurably simplifies real-world code. The "cpp2" syntax (https://hsutter.github.io/cppfront/) would be worth looking at for ideas, but the idea here is to maintain compatibility with existing C++ programs rather than designing a new, incompatible syntax.

Deterministic filesystem/archive format
filesystem security forensics
Adam Sampson
Edinburgh
In a typical filesystem, the contents of the disk depend not just on the files being stored, but on other factors such as the order they were written in, previously-deleted files, the size of the disk, and so on. This makes forensic analysis of disk images possible - you can extract deleted files, or tell information about how the filesystem was built. Instead, I'm proposing that for any given collection of files, there should be exactly one valid representation of them on disk - guaranteeing that no information is being accidentally leaked. You would need to design the filesystem layout and build a tool to construct and verify the filesystem. Ideally you would then build a Linux kernel filesystem to read (and maybe modify) it. The real challenge comes in making it efficient to update later on...

Encrypted Git storage
git version control encryption security
Adam Sampson
Edinburgh
The Git version control system is widely used, and has been extended over the years to serve various purposes - for example, it's possible to cryptographically sign a commit. It would be useful to be able to encrypt some files within a repository - e.g. if you have files containing secret keys within a project that only some contributors should have access to. You could draw ideas for this from the Git-LFS large file extension, and from encryption extensions in Linux filesystems.

Family tree rendering
genealogy family tree constraints graphics
Adam Sampson
Edinburgh
Given a genealogical database from a system like GRAMPS, generate a high-quality vector-graphics family tree rendering - not just ancestors or descendants of a single person, but using a constraints system to represent as much of the tree as possible in a single rendering. This is something I've prototyped before and have some ideas about, but needs redoing properly using modern graphics technologies. I can provide sample data.

Fix camera faults in analogue video
video signal processing history
Adam Sampson
Edinburgh
I work on the ld-decode FOSS project, which produces high-quality digital captures of analogue video sources (such as LaserDisc or videotape). It's fairly common to find examples of video where faults in the original source are visible - for example, the red/green/blue sensors in the camera are misaligned, or the image is disturbed by loud noises near the camera (microphony), or the original video was played back on a misaligned video recorder so lines are offset. These show up in commercially-released video as well. Given an understanding of the structure of the video signal, it should be possible to correct for these kinds of faults in software to substantially improve picture quality.
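As a toy illustration of the sensor-misalignment case: if one colour channel is offset horizontally by a known number of samples, correction is just a compensating shift per scanline. Real alignment would be estimated from the image and applied with sub-sample filtering; this sketch assumes a known whole-pixel offset and wraps at the edges:

```python
def shift_scanlines(channel, offset):
    """Cyclically shift every scanline of one colour channel by `offset`
    samples -- a whole-pixel stand-in for proper sub-pixel re-registration."""
    shifted = []
    for row in channel:
        k = offset % len(row)
        shifted.append(row[k:] + row[:k])
    return shifted

# A 2x4 "red channel" captured one pixel to the right of where it should be
red = [[10, 20, 30, 40],
       [50, 60, 70, 80]]
corrected = shift_scanlines(red, 1)
```

The interesting faults (microphony, recorder misalignment) vary line by line, so the real work is in estimating a per-line correction from the signal itself.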

GPU malware
security malware gpu
Adam Sampson
Edinburgh
The modern GPU is a high-powered general-purpose computer system, with large quantities of memory and the ability to access parts of the CPU memory space. On APU systems, it can potentially access hardware devices too. Investigate what malicious software running on the GPU might be capable of. I'd be particularly interested in the implications for APU systems such as AMD Ryzen - can you do stealth network communication from the GPU, for example?

GPU operating system
gpu operating system kernel security
Adam Sampson
Edinburgh
A modern GPU is a highly capable, multicore, general-purpose computer system, that just happens to be particularly good at vector arithmetic. But they're generally used for graphics or for offloading maths-intensive tasks from the main CPU. What would a proper operating system designed to take advantage of a GPU's architecture look like? There's plenty of existing work in operating systems consisting of communicating parallel tasks - you could look at microkernel systems like Minix, or OSs for loosely-coupled parallel architectures like HeliOS. See what you can do to enable efficient, secure (if possible!) general-purpose computing on the GPU. This is a complex project that will require low-level understanding of GPU architecture and some experience of operating system programming.

Identify reused elements in folk tunes
music audio analysis music theory
Adam Sampson
Edinburgh
If you don't like folk music, this is probably not the project for you. Thousands of folk tunes (short instrumental pieces), from many different traditions, are available in easily machine-readable ABC format in online databases like thesession.org and folktunefinder.com. As folk tune authors tend to "borrow" elements of existing tunes, it should be possible to take a collection like this and identify common elements between tunes - for example, showing how a tune has moved between different traditions (e.g. Scotland/Ireland/US) and been adapted to different instruments, or how different versions of the same tune have diverged over time. You could potentially use this to build a tool for exploring a collection of tunes, by showing links between tunes that share elements of melody.
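A first-cut similarity measure over ABC might compare pitch n-grams, ignoring rhythm and ornamentation entirely. The reduction to bare pitch letters is my simplification - real ABC parsing is considerably messier:

```python
def pitch_ngrams(abc: str, n: int = 4):
    """N-grams over the pitch letters of an ABC fragment, discarding note
    lengths, bar lines and decorations -- a deliberately crude view."""
    pitches = [c.upper() for c in abc if c.isalpha()]
    return {tuple(pitches[i:i + n]) for i in range(len(pitches) - n + 1)}

def tune_overlap(a: str, b: str, n: int = 4) -> float:
    """Jaccard overlap of the two tunes' pitch n-grams, in [0, 1]."""
    ga, gb = pitch_ngrams(a, n), pitch_ngrams(b, n)
    return len(ga & gb) / max(1, len(ga | gb))
```

Computed pairwise over a collection such as thesession.org, scores like this could seed the link graph for the tune-exploration tool - with plenty of scope for smarter, transposition-aware measures.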

Implement the game Dazzle Dart
games history retrocomputing
Adam Sampson
Edinburgh
Harold Abelson's Dazzle Dart (https://dl.acm.org/doi/10.1145/1216479.1216482) was one of the earliest multiplayer video games, created at MIT in the early 1970s. It's well overdue for a remake using a modern engine! The original was constrained by the hardware it ran on and had a very abstract 2D display; you'd need to think about whether to adapt it to 3D and how to take advantage of modern controls. (A previous student did a basic 3D version using Unity, which worked pretty well, but I'd like to see a more polished version without the dependency on a proprietary game engine.)

Introduce randomness into kernel compilation
linux security aslr kernel
Adam Sampson
Edinburgh
The Linux kernel, like Linux userspace, takes advantage of address space layout randomisation (ASLR) to make it harder for an attacker to predict memory addresses within the kernel. But we could go further than this with some help from the compiler - you could also randomise the layout of the stack frame, the layout of structs in the kernel, and so on. This would mean compiling a new kernel each time you upgrade the kernel (or even each time you reboot), but that may be a price worth paying - and Fabrice Bellard's tccboot project showed that this can be done with relatively low overhead.

Linux kernel NFS over TLS or NoiseSocket
nfs filesystem linux kernel security cryptography tls noise
Adam Sampson
Edinburgh
NFS is the standard network filesystem on Unix-like systems. Traditionally it's unencrypted, relying on the security of the network; it can be run over Kerberos, but that's complex, difficult to set up in small networks, and does not support modern cryptography. The Linux kernel now has good built-in support for TLS and other modern cryptographic primitives; in particular, the Wireguard VPN system uses a protocol based on the Noise framework. In this project, you would add support to Linux for running NFS over a TLS or NoiseSocket transport, making it easy to set up secure network filesystems.

Make sudo less awful
linux security open source
Adam Sampson
Edinburgh
The sudo tool is sadly nearly ubiquitous on modern Linux systems - sadly, because it has a long and inglorious history of appallingly bad security holes, through being written in C and doing a complex, security-critical job. Find ways to improve this! You might look at re-engineering it in a more secure language (or language subset), or redesigning it to take advantage of privilege separation or operating system sandboxing, or...?

Make X work for high-DPI displays
x graphics linux unix
Adam Sampson
Edinburgh
The X11 graphics system has been widely used on Unix-like operating systems since the 1980s. It was originally designed to be resolution-independent, supporting high-DPI output devices such as printers in addition to regular displays. However, if you try using it on a high-DPI display these days, you will find that some of the libraries and server behaviour make assumptions about display resolution that are not appropriate for a modern 200+ DPI display (e.g. requiring low-resolution fonts or not computing spacing correctly in GUI layouts). As a result, people resort to ugly, inefficient Windows-style hacks such as pixel scaling - rather than using the display at its native resolution. Fix this - disable pixel scaling, configure X to run at the native DPI of a modern 4k display, try a range of applications and work out what's broken.

Model-check Linux's BPF verifier
model checking bpf linux kernel security
Adam Sampson
Edinburgh
BPF is a virtual machine architecture that is used for various "programmability" tasks inside the Linux kernel - for example, you can use it to specify custom firewall rules or custom scheduling conditions. It's important that BPF is *not* a general-purpose architecture, since BPF programs must execute within a fixed amount of time and resources - the BPF verifier is responsible for checking BPF programs to make sure they meet these rules. Since the BPF verifier is just a bit of code written in C, it's had several bugs where harmful BPF programs are incorrectly validated. This seems like an ideal application for some formal reasoning - can you come up with a way of making the BPF verifier itself verifiably safe, so you can prove that it can't validate an unsafe program? I'm imagining using model-checking techniques for this, but I'm sure there are other ways you could attack this problem as well.
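To make the problem concrete, here is a toy verifier over an invented three-instruction bytecode that enforces one of the real verifier's key properties - no backward jumps, so every accepted program terminates. A model-checking approach would aim to prove properties like this about the verifier itself:

```python
def verify(program, max_len: int = 4096) -> bool:
    """Accept a program only if it is within the resource bound and every
    jump goes strictly forwards and stays in bounds.
    Ops: ('mov', v), ('jmp', offset), ('ret', 0). Invented encoding --
    real BPF is far richer, which is exactly why its verifier has bugs."""
    if not program or len(program) > max_len:
        return False
    for pc, (op, arg) in enumerate(program):
        if op == "jmp":
            if arg <= 0:                      # backward or self jump: loops possible
                return False
            if pc + arg >= len(program):      # jump past the end of the program
                return False
    return True
```

The project would then ask: can you state "verify() accepts only terminating programs" formally, and check it mechanically rather than by inspection?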

Physically modelled classic drum machines
audio synthesis signal processing dsp music watch ya bass bins
Adam Sampson
Edinburgh
A musical project - 1980s drum machines like the Roland TR-808 and TR-909 are still widely used and cloned today, both in hardware and software. Software implementations are often based on samples, though, rather than on a physical simulation of the circuitry - making for less variation and flexibility. In this project, you'd use an existing FOSS electronics simulation system to build a model of a drum machine (or some parts of it), and wrap it in a LV2 software synthesiser so it could be played within a digital audio workstation such as Ardour.
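For orientation, a purely behavioural (non-circuit) 808-style kick is only a few lines: a decaying sine with a downward pitch sweep. The circuit-simulation approach this project proposes would replace exactly this kind of shortcut - all the constants below are ear-tuned guesses:

```python
import math

def kick(sample_rate: int = 44100, duration: float = 0.25,
         f_start: float = 120.0, f_end: float = 45.0, decay: float = 18.0):
    """808-ish kick drum: sine oscillator whose frequency sweeps down fast
    and whose amplitude decays exponentially. Returns samples in [-1, 1]."""
    samples, phase = [], 0.0
    for i in range(int(sample_rate * duration)):
        t = i / sample_rate
        freq = f_end + (f_start - f_end) * math.exp(-60.0 * t)  # pitch drop
        phase += 2.0 * math.pi * freq / sample_rate
        samples.append(math.exp(-decay * t) * math.sin(phase))
    return samples
```

These samples could be written out with the stdlib wave module for listening, or generated inside an LV2 plugin's run callback; the physical model would instead emerge from simulating the bridged-T oscillator circuit.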

Programming language based on Wadler's CP calculus
language design process calculus cp concurrency
Adam Sampson
Edinburgh
A process calculus is a mathematical model of the behaviour of a concurrent program. Designing a programming language's facilities to correspond to a particular process calculus is interesting because it allows you to reason mathematically about the behaviour of programs written in that language - for example, proving that they can't deadlock or livelock. The Communicating Sequential Processes (CSP) calculus has been particularly successful, with languages like occam, Go and Rust using it as the basis of their concurrency facilities. But CSP dates from the 1970s, and there have been advances in process calculi since then! Philip Wadler's Classical Processes (CP) calculus is a particularly interesting example - it makes use of ideas from the theory of session types, which has traditionally been used to reason about the safety of things like network protocols and cryptographic procedures. It would be interesting to experiment with designing and implementing a simple programming language based on CP, in the same way that occam is based on CSP.

Ransomware-resistant filesystem
filesystem linux kernel security
Adam Sampson
Edinburgh
Ransomware-resistant filesystem or storage device. Revisit the ideas behind log-structured filesystems in order to maintain the filesystem so that it can always be rolled back to previous states. This could even be done at the physical device level (e.g. build a device that filters SATA commands), so you can't actually destroy anything permanently without physical intervention. Implement this, either within the Linux kernel, as a FUSE userspace filesystem, or as a prototype in userspace. (A previous student did the last of these, so I'd rather see a working implementation.)
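The rollback property can be illustrated with a toy append-only store where overwriting never destroys history - a userspace sketch of the log-structured idea, nothing like a real block layer:

```python
class VersionedStore:
    """Append-only store: every write is a new log record, so any earlier
    state can be reconstructed even after a ransomware-style overwrite."""

    def __init__(self):
        self.log = []                          # (path, data) records, in order

    def write(self, path, data):
        self.log.append((path, data))
        return len(self.log)                   # log position = snapshot id

    def read(self, path, at=None):
        """Latest version of `path`, or its version as of log position `at`."""
        entries = self.log if at is None else self.log[:at]
        for p, d in reversed(entries):
            if p == path:
                return d
        return None
```

The hard engineering problem the project points at is reclaiming space: a real implementation must garbage-collect old segments without giving an attacker a way to force destruction of the history they just encrypted.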

Retrocomputing support for Radare
reverse engineering security cpu history
Adam Sampson
Edinburgh
Radare is a suite of tools for reverse-engineering software - for example, automatically extracting a structured disassembly from a binary. It supports a range of modern architectures, but it would also be useful to apply it to code used on historical computer architectures - for example, when understanding code as part of a computing history project, or when porting it to a new platform. I'd be particularly interested in support for the Motorola 68000 architecture (a 16/32-bit architecture widely used in the 1980s) and the DEC PDP-10 architecture (a 36-bit architecture commonly used in the 1970s), but other architectures - e.g. various IBM mainframes - would also be interesting.

Secure video conferencing support for Jamulus
audio real-time conferencing music security privacy
Adam Sampson
Edinburgh
Jamulus is a FOSS system that allows musicians (like me) to play together in real time over the Internet. It has good support for high-quality audio, but it doesn't support video, so many groups that meet on Jamulus also have to use a separate video conferencing system such as Zoom or Jitsi to see each other. Add support for simple video conferencing to Jamulus. Since the Jamulus protocol is highly latency-sensitive, I suspect this would be best done by integrating a separate video-conferencing protocol into the Jamulus client (ideally an existing one for interoperability). There are some pretty substantial privacy concerns around this so it would make an interesting project in terms of security usability engineering.

Security extensions for RISC-V
risc-v security cpu architecture cheri
Adam Sampson
Edinburgh
RISC-V is a modern RISC architecture based on open-source principles; there's a core of instructions and a collection of extensions that provide additional facilities (e.g. vector maths). There are existing high-quality toolchains and software emulators for it - you don't need RISC-V hardware to work with it. Projects like CHERI have experimented with extending existing computer architectures to provide better security facilities - CHERI adds pointer bounds to the ARMv8 architecture. Design an extension to RISC-V to provide similar facilities, or to improve software security in other ways (e.g. bounds checking, pointer authentication, untrusted data tracking...). Implement this in an emulator to demonstrate that it can be used to detect errors in programs. (A previous student experimented with pointer authentication successfully, so maybe try a different approach.)

TV Studio Simulator game
games history tv broadcasting social history
Adam Sampson
Edinburgh
There are plenty of silly "XYZ simulator" games out there - how about one that simulates the staff of a busy 1960s/1970s TV studio? You could play as a camera operator, vision mixer, producer, boom operator, etc., or play together in multiplayer mode as a team of people trying to make a complex drama or news show work. Disasters would include broken equipment, misbehaving actors, unreasonable time pressure and invasions by visiting schoolchildren. For inspiration, have a look at the ADAPT project (https://www.adapttvhistory.org.uk/) or the stories on the BBC Tech Ops site (http://www.tech-ops.co.uk/next/).

Use fuzzing to automatically test narrative games
fuzzing testing games interactive fiction
Adam Sampson
Edinburgh
Coverage-directed fuzzing is a highly effective technique for testing software - it combines random input with feedback from software coverage measurement to generate input that explores all the possible paths of execution through a piece of software. Apply this technique to a story-based game - I was thinking interactive fiction games such as those written in Inform, but you could also do it with other kinds of games - to ensure that all the possible routes through a game world are explored automatically during testing. A variant of this would be to take an arbitrary interactive fiction game (there are plenty of these in IFDB) and attempt to automatically generate a walkthrough for it.
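The feedback loop can be demonstrated on a toy story graph: keep any choice sequence that reached a previously unseen scene, and mutate from there. Both the graph and the use of "scenes visited" as the coverage signal are simplifying assumptions:

```python
import random

# Toy branching story: scene -> list of reachable next scenes (invented game)
STORY = {"start": ["cave", "forest"],
         "forest": ["cave", "end"],
         "cave": ["end"],
         "end": []}

def explore(story, start="start", iterations=200, seed=0):
    """Coverage-guided exploration: replay a saved path, extend it with
    random choices, and keep it in the corpus whenever it reaches a new
    scene -- the AFL idea applied to game choices."""
    rng = random.Random(seed)
    seen, corpus = {start}, [[start]]
    for _ in range(iterations):
        path = list(rng.choice(corpus))        # pick a known-interesting path
        scene = path[-1]
        while story[scene]:                    # play on with random choices
            scene = rng.choice(story[scene])
            path.append(scene)
            if scene not in seen:              # new coverage: keep this input
                seen.add(scene)
                corpus.append(list(path))
    return seen

covered = explore(STORY)
```

Against a real Inform game the "graph" is implicit, so the coverage signal would come from instrumenting the interpreter; the kept corpus then doubles as raw material for an automatic walkthrough.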

Use fuzzing to identify faults in emulators
fuzzing testing emulation cpu security
Adam Sampson
Edinburgh
Coverage-directed fuzzing is a highly effective technique for testing software - it combines random input with feedback from software coverage measurement to generate input that explores all the possible paths of execution through a piece of software. An emulator such as qemu, simh or MAME executes software written for a different architecture by simulating the CPU and peripherals in software. Faults in emulation are common - either producing incorrect results, or worse, producing security holes. However, if you have two emulators for a given architecture - or an emulator and a real CPU - then you could detect faults by using fuzzing to generate code, running it on both, and comparing the results; if they don't match, or the emulator crashes, you've found a problem. I would suggest picking a simple, common architecture with lots of different emulators available (Z80, 6502...) to maximise the chance of finding an interesting problem. (A previous student had a good attempt at this with a custom emulator, so I'd like the focus to be on analysing faults in existing emulators.)
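The harness itself is tiny. Below, two toy interpreters of an invented one-instruction "architecture" stand in for real emulators, with a deliberate wrap-around bug planted in one of them - everything here is made up purely to show the shape of differential fuzzing:

```python
import random

def run_reference(program):
    """Reference 8-bit accumulator machine: 'add' wraps at 256."""
    acc = 0
    for op, val in program:
        if op == "add":
            acc = (acc + val) & 0xFF
    return acc

def run_buggy(program):
    """Faulty 'emulator': forgets the 8-bit wrap -- the kind of divergence
    a differential fuzzer should surface."""
    acc = 0
    for op, val in program:
        if op == "add":
            acc = acc + val
    return acc

def differential_fuzz(rounds=500, seed=1):
    """Feed random programs to both implementations; return the first
    program on which they disagree, or None."""
    rng = random.Random(seed)
    for _ in range(rounds):
        program = [("add", rng.randrange(256))
                   for _ in range(rng.randrange(1, 8))]
        if run_reference(program) != run_buggy(program):
            return program                     # divergence witness
    return None

witness = differential_fuzz()
```

With real emulators, "run the program" means loading generated machine code into qemu/MAME/etc. and comparing register and memory state afterwards - plus minimising the witness so the fault is analysable.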

X or Wayland server in a safe language
graphics linux unix x wayland security
Adam Sampson
Edinburgh
The X graphics system is widely used on Unix-like systems; its successor, Wayland, is starting to come into wide usage. Both of these have existing good-quality implementations that are written in C, and thus suffer from the usual security problems of unsafe languages. Implement a new X or Wayland server using a modern, safe programming language such as Rust, Go, Nim, Haskell or OCaml (I'm not interested in doing this with Java or C#). Alternatively, take one of the existing implementations and find a way of making it safe - for example by adding annotations to the C code to allow better safety analysis.

Anything related to emulation or the history of computing
history emulation retrocomputing games
Adam Sampson
Edinburgh
I work on some open source projects that aim to preserve the history of computing. For example, I've written software to rescue data from failing floppy disks, and to reconstruct historical operating systems from the 1970s and early games engines. Previous student projects in this area have included emulation of 1980s processors and a software-hardware setup to explore the behaviour of historical microchips. I'd be interested in supervising any projects in this area.

Port a classic operating system to a modern platform
operating system portability history arm risc-v
Adam Sampson
Edinburgh
There are a range of older operating systems that have been released under FOSS licenses, including: - EmuTOS and MiNT (originally 68000) - https://emutos.sourceforge.io/ and https://freemint.github.io/ - RISC OS (originally ARM) - https://www.riscosopen.org/ - Coherent (originally x86 and others) - https://gunkies.org/wiki/Coherent As these were intended for use on computers of the 1980s, with processors running at a few MHz and at most a few MiB of memory, they would be a good fit in terms of resources for modern middle-spec microcontrollers. Take one of these systems and make it run on, say, an embedded RISC-V chip. (EmuTOS would be the simplest; MiNT the most capable; both written in C and compilable with modern toolchains. RISC OS is mostly in ARM assembler so porting some of it to ARMv8 would make for an interesting project. Coherent is 1980s-style C but was intended to be portable originally.)

Anything related to audio or music
music audio music theory signal processing
Adam Sampson
Edinburgh
Projects I've supervised before in this space have included: - Using a Raspberry Pi board to implement a guitar effects pedal - Simulating the Hammond Novachord synthesiser as a software plugin - Generative music for an RPG game - Software to teach electric guitar by analysing chords played in real time - Automated analysis of the use of musical modes in game soundtracks If you're interested in doing something in this area please get in touch with me. (I'm not interested in anything related to generative AI.)

Investigating prominent factors affecting E-commerce development in Africa
e-commerce business intelligence data science digital marketing
Usman S Sanusi
Edinburgh
African e-commerce users are expected to surpass half a billion by 2025, a record 40%, compared to about 140 million users (13%) in 2017, representing a compound annual growth rate of nearly 17%. Additionally, Africa leads mobile internet usage, at over 13 percentage points above the global average and nearly 5 points above Asian mobile usage. This indicates that a mobile-first approach is both indispensable and promising for online businesses targeting African markets. However, African e-commerce is far from maturity in terms of profitability: recent reports show that less than 30% of e-commerce start-ups on the continent are profitable, while most of the bigger companies have yet to record a profit in more than a decade. Meanwhile, various studies have identified challenges including underdeveloped infrastructure, logistical constraints and limited payment gateways. This project would study the factors influencing the successes and challenges of e-commerce development in Africa, building regression models of e-commerce development and its turnover index as a contribution to national GDP. The resulting models would be validated on relevant time series data, while business intelligence tools, including Google Analytics, would be employed for preliminary investigations of the leading African e-commerce platforms. To derive insights and potentially suggest how to advance e-commerce, the Weka machine learning toolkit and the SPSS software package would be used for modelling and statistical analysis of the data.

Sentiment Analysis: Social Media influence on stock market prediction in the developing economies
forecasting data science artificial intelligence e-commerce digital economy
Usman S Sanusi
Edinburgh

State-of-the-art Machine learning in Energy demands and supplies
forecasting artificial intelligence machine learning renewable and electrical energy
Usman S Sanusi
Edinburgh

State-of-the-art Machine learning in Finance
data science artificial intelligence machine learning finance and stocks
Usman S Sanusi
Edinburgh

State-of-the-art Machine learning in Healthcare
data science artificial intelligence machine learning healthcare management
Usman S Sanusi
Edinburgh

Multicultural Inheritance Application Software
Usman S Sanusi
Edinburgh
An application that provides inheritance or estate sharing recommendations to different people according to their different customs or beliefs, as well as sharing based on a few national jurisdictions.

Investigating prominent factors affecting E-commerce development in Developing Nations
Usman S Sanusi
Edinburgh
This project would study the factors influencing the successes and challenges of e-commerce development in a number of developing nations, building regression models of e-commerce development and its turnover index as a contribution to national GDP. Business intelligence tools, including Google Analytics (GA), could be employed for preliminary investigations of the leading e-commerce platforms, subject to timely agreement. Leveraging linear models' capability to capture the relationship between a set of independent variables and a dependent variable, the project will build robust influence-factor regression models of e-commerce development, exploring different implementations of algorithms. These include Classification and Regression Trees (CART), multivariate linear regression, and partial least squares regression (PLS). The regression models would be developed from relevant time series data, including indices on access to computers, Internet penetration, mobile phone ownership, the size of the middle class and levels of financial inclusion, amongst others. These indices would be derived primarily from publicly accessible data, including data from the United States Trade Department, the World Trade Organization (WTO) and global data platforms such as Statista. To derive insights and potentially suggest how to advance e-commerce, the Weka machine learning toolkit and the SPSS software package would be used for modelling and statistical analysis of the data, while GA would provide quick and easy indications of likely usability problems using non-identifiable, aggregate data.
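As a minimal sketch of the regression component, assuming entirely made-up index values (the real project would use the public data sources listed above), a closed-form ordinary least squares fit of turnover against Internet penetration might look like:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x, via the closed form
    b = cov(x, y) / var(x), a = mean(y) - b * mean(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    b = cov / var
    a = mean_y - b * mean_x
    return a, b

# Hypothetical yearly indices: Internet penetration (%) against
# e-commerce turnover as a share of GDP (%) -- illustrative only.
penetration = [10, 18, 25, 33, 40, 48]
turnover    = [0.5, 0.9, 1.2, 1.6, 1.9, 2.3]

a, b = fit_linear(penetration, turnover)
predicted_2025 = a + b * 55  # extrapolate to 55% penetration
```

Weka and SPSS provide multivariate versions of this (plus CART and PLS) out of the box; the sketch just shows the shape of the model being fitted.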

Talk to the Museum: Multimodal Retrieval Augmented Generation for Museums
multimodal llms retrieval augmented generation generative ai multimedia for heritage
John See
Malaysia
Exploring new interactions between visitors and the museum may be vital to increase the appeal of museums (and heritage sites) for the new generation. Traditionally, museums tend to depend a lot on manually disseminated information such as site experts and tour guides, as well as using specific types of technologies like sensor-based audio/visual guides and on-site augmented reality to make things interesting. Recent advances in AI-based chatbots now present new possibilities to museums. Wouldn't it be fascinating if visitors could ask questions to the chatbot about the historical facts of a specific artifact in the museum? Can visitors ask the chatbot about a certain picture of a specific pattern on the exhibit? This project should explore the use of retrieval augmented generation (RAG), which combines the strengths of traditional information retrieval systems with the capabilities of generative large language models (LLMs), build a demonstrable prototype, and have it evaluated by early users.
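A minimal sketch of the retrieval half of RAG, using bag-of-words cosine similarity over hypothetical exhibit descriptions (a real prototype would use embedding models and an actual LLM call, both omitted here):

```python
import math
from collections import Counter

def vectorise(text):
    """Crude bag-of-words vector; real systems would use embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented exhibit descriptions acting as the retrieval corpus.
corpus = {
    "bronze_drum": "A bronze drum cast in the 3rd century, used in rain rituals.",
    "silk_banner": "A painted silk banner depicting a funeral procession.",
    "stone_seal":  "A carved stone seal bearing the name of a provincial governor.",
}

def retrieve(query, k=1):
    """Return the k exhibit descriptions most similar to the query."""
    qv = vectorise(query)
    ranked = sorted(corpus, key=lambda d: cosine(qv, vectorise(corpus[d])),
                    reverse=True)
    return ranked[:k]

def build_prompt(query):
    """Stitch retrieved passages into a prompt for the generative model
    (the LLM call itself is out of scope for this sketch)."""
    context = "\n".join(corpus[d] for d in retrieve(query))
    return f"Context:\n{context}\n\nVisitor question: {query}"

prompt = build_prompt("What was the bronze drum used for?")
```

The multimodal extension replaces `vectorise` with an image-and-text embedding model so visitors can query with a photo of an exhibit as well as with words.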

Spotting and Recognising Subtle Facial Expressions from RGB-D Data
subtle facial expressions micro-expressions deep learning affective computing
John See
Malaysia
The human face is often the gateway to understanding a person's emotional state. More often than not, humans also possess the sheer ability to conceal their emotions when required but the leakage of such emotions also occurs in the form of subtle expressions (or "micro-expressions"). Research in computing micro-expressions has gained significant interest in the last 10 years due to the availability of public datasets captured from carefully elicited setups. Recently, two newly established large-scale multimodal databases: the CAS(ME)³ and 4DME have presented exciting opportunities for studying the computation of micro-expressions from a new perspective — one that aims to utilise additional depth information on top of the usual RGB data from videos. The aim of computing these micro-expressions is to find viable algorithms to locate the occurrence ("spotting") of the micro-expression in a video sequence, and thereafter, identify its emotion class ("recognition"). This is a research-centric project. The project scope could include designing algorithms for the 'spotting' task, or the 'recognition' task, or both.

HoloWand: A Gesture-based Air Pen for Multiple Device Control
hardware-software interface embedded systems gesture recognition gesture control
John See
Malaysia
This project aims to develop a "magic pen" or "air pen" that incorporates a gyroscope and accelerometer to recognise gestures and/or handwriting "in the air". The wand allows users to control various simulated electronic devices wirelessly through gestures. To scale up the recognition capability, this project will also explore composite gestures, i.e. combinations of simple gestures, which allow specific devices with similar actions to be controlled more intuitively, without the need to register a large, diverse set of specific gestures. The project will be assessed on the effectiveness of the gesture recognition module (corresponding to the intended control or action) and the overall usability of the designed wand. *[This is a student-proposed project, and it will be co-supervised with Dr Rosalind Deena Kumari]

AI In Digital Health
Talal Shaikh
Dubai
Healthcare is one of the notable industries influenced by the Fourth Industrial Revolution, and Healthcare 4.0 is a term that has emerged to describe this revolution. Healthcare 4.0 is a collective term for data-driven digital health technologies such as smart health, mobile health, wireless health, eHealth, online health, medical IT, telehealth/telemedicine, digital medicine, health informatics, pervasive health, and health information systems. The revolution in the healthcare industry is already underway; yet, because healthcare insiders adopt technology at a more conservative and slower pace than other industries, digitalization in this sector has not been so evident (Pace et al., 2018; Manogaran et al., 2017). Multiple projects can be done in this space. Please get in touch with me for further discussion.

AR / VR Project for Smart Spaces
Talal Shaikh
Dubai
To be discussed

Authentication via intrabody communication
Talal Shaikh
Dubai
Biometric authentication is simply the process of verifying your identity using measurements or other unique characteristics of your body, then logging you into a service, an app, a device and so on. The human-machine interface (HMI) is the main communication channel between humans and computers. Through current HMIs, a machine receives and accurately responds to the commands given by its users. In the next generation of HMI, machines will be required to deal with more challenging problems and decisions (such as affective evaluations, ethical quandaries, and other innovations) in a self-governing manner. We can use galvanic intrabody communication to transfer data through the human body.

Embodied Agents that Chat
Talal Shaikh
Dubai
Details to be discussed with the student.

Emotion AI
Talal Shaikh
Dubai
Artificial emotional intelligence, or Emotion AI, is also known as emotion recognition or emotion detection technology. Humans use a lot of non-verbal cues, such as facial expressions, gestures, body language and tone of voice, to communicate their emotions. The vision is to develop Emotion AI that can detect emotion just the way humans do, from multiple channels.

Emotion Recognition in Chats / Videos
Talal Shaikh
Dubai
To be discussed.

Energy Aware Software Development
Talal Shaikh
Dubai
Software plays an important role in battery life. The OS, firmware, drivers, and all small components are typically optimized to give better performance and energy efficiency. As the notebook PC (and smaller form-factor devices, including tablets and smart phones) becomes a pervasive compute platform, battery life is becoming increasingly important, particularly with regard to standby or idle time. In addition, as hardware power states become more sensitive, software must be well behaved at idle so it doesn't needlessly wake components, which would limit battery life. Several case studies show how software "idle" behavior can have a negative impact in this area on Windows-based systems.

Energy Efficient Programming
Talal Shaikh
Dubai
Software can influence the energy efficiency of hardware significantly, since all hardware is controlled by software.

Hidden Web Databases
Talal Shaikh
Dubai
To be Discussed with the Student.

Human Activity Detection
Talal Shaikh
Dubai
Being able to detect and recognize human activities is essential for several applications, including smart homes and personal assistive robotics. We perform detection and recognition of unstructured human activity in unstructured environments. There are many different ways this can be done: we use an RGB-D sensor (Microsoft Kinect) or radio waves (Wi-Fi) as the input sensor and compute a set of features based on human pose and motion, as well as on image and point-cloud information.

Indoor Drone Assistant
Talal Shaikh
Dubai
With the rapid advance of sophisticated control algorithms, the capabilities of drones to stabilise, fly and manoeuvre autonomously have dramatically improved, enabling us to pay greater attention to entire missions and to the interaction of a drone with humans and with its environment during the course of such a mission. I would be interested in considering a project dealing with drones. For further discussion, please do get in touch with me.

IOT on Blockchain
Talal Shaikh
Dubai
To be discussed

IOT Systems planner
Talal Shaikh
Dubai
A suggestion system which selects the devices required based on the needs of the user, and provides prices, possible network maps, etc.

ML for finance
Talal Shaikh
Dubai
This project aims to use machine learning techniques such as ensemble learning, convolutional neural networks etc. to predict spot prices for a variety of industries. Machine learning is increasingly used in finance to make predictions as well as to aggregate among existing strategies for making investments over time. We will use various free as well as proprietary data sets to assess the value of our newly developed methods in terms of both profit and risk, and compare them with state of the art techniques. This will also involve developing new “lucky factors” (features) that can be extracted from the data to inform and improve existing and new investment strategies. The expectation is that the work will lead to a conference publication.

Pervasive Authentication
Talal Shaikh
Dubai
Description to be added later.

Projects in Computer Vision
Talal Shaikh
Dubai
I am interested in Computer Vision projects that could be used for applications like real-time object identification with semantic analysis, self-driving cars, spatial data analysis, etc.

Robotics Based Projects
Talal Shaikh
Dubai
I am interested in any robotics-based projects or applications, such as: self-driving cars, drone navigation, robot mapping, and robot interaction. Please get in touch with me if you would like to discuss any of these topics.

Text To Image Synthesis
Talal Shaikh
Dubai
Generating photo-realistic images from text is an important problem and has tremendous applications, including photo-editing, computer-aided design, etc. More details to be discussed with the student

User Authentication Via Wifi Signals
Talal Shaikh
Dubai
There has been growing interest in building smart indoor environment solutions, such as the smart home or office, capable of sensing and responding to users using Wi-Fi signals alone. In this project, I would like to investigate the use of Wi-Fi for robust authentication of the user in different situations.

Visual Attendance Monitoring System
Talal Shaikh
Dubai
To create a student attendance monitoring system that would use face recognition as one of its major features.

Visual Food Log
Talal Shaikh
Dubai
To be discussed.

Voice User Interface for a Smart House.
Talal Shaikh
Dubai
To be discussed with the student

Parsing with Algebraic Effects and Handlers
Filip Sieczkowski
Edinburgh
Algebraic effects and their handlers are a modern approach to structuring the computational effects of programs, including interaction with the outside world, but also effects internal to the program. In addition to various libraries that provide the programmer with the ability to use algebraic effects, several experimental programming languages, including Helium (https://bitbucket.org/pl-uwr/helium/src/master/), Frank (https://github.com/frank-lang), Koka (https://koka-lang.github.io/koka/doc/book.html), etc. have recently been developed. These can be used as a means to study the impact of the new programming idiom on software development. This project aims to study the impact of programming with algebraic effects on parsing technology. It would require the student to a) investigate the programming idiom to be used, b) investigate parsing techniques, and decide on a family of techniques suited to implementation using algebraic effects, c) build a tool/library for a chosen language that uses algebraic effects to provide support for parsing. The prerequisites for this project include a background in functional programming and a strong interest in cutting-edge language technology in this area, as well as a background in language technology that would allow the student to efficiently review and adapt approaches to parsing (around the level given by the Language Processors course). Understanding of formal semantics of programming languages is not strictly necessary, but even limited exposure may be helpful in understanding the research papers that will need to be studied.

Applying statistical and machine learning techniques to study the joint modelling of insurance claims and lapsation for insureds who subscribed to both automobile and homeowners insurance.
Karamjeet Singh K.Ranthir Singh
Malaysia
This project proposes to study the joint modelling of insurance claims and lapsation related to insureds who subscribed to both automobile and homeowners insurance. This is useful to insurers as it may provide valuable insights into the area of rate making hence improving both pricing and underwriting. In the first semester we will look to replicate the study of Guillén et al. (2021). Data is available for policyholders in the Spanish market. In the second semester we shall try to get data from another country (student’s home country, if available) and do a similar study perhaps with some machine learning. We will then make comparisons with the study by Guillén et al. (2021).

An Exploration of Machine Learning Algorithms in Healthcare sector (e.g. Heart Failure, Cancer classification, COPD prediction)
Drishty Sobnath
Dubai
According to the World Health Organisation (WHO), cardiovascular diseases cause around 17.9 million deaths every year, due to an increasing number of heart attacks and strokes. Therefore, this study aims to utilize data science and machine learning techniques to explore the relationships among contributing factors that may have a bearing on the risk of suffering from heart disease. It aims to accurately predict whether a patient is likely to suffer from heart disease based on existing public health datasets.
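As a toy illustration of the prediction task (the feature values below are invented, not taken from any real dataset), a k-nearest-neighbours classifier over a handful of patient records could be sketched as:

```python
import math

def knn_predict(train, features, k=3):
    """Classify `features` by majority vote among the k nearest training
    points under Euclidean distance. `train` is a list of
    (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], features))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Invented records: (age, resting blood pressure, cholesterol) -> disease?
train = [
    ((63, 145, 233), 1), ((41, 130, 204), 0), ((67, 160, 286), 1),
    ((37, 120, 180), 0), ((56, 140, 294), 1), ((44, 118, 172), 0),
]
risk = knn_predict(train, (60, 150, 250))
```

The project itself would use real public datasets, proper feature scaling and a comparison of several model families rather than this single unscaled baseline.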

PPE Detection with Computer Vision AI
Drishty Sobnath
Dubai
Industry and manufacturing are two of the highest-risk sectors for workers. According to the U.S. Bureau of Labor Statistics, there were 2.8 million nonfatal workplace injuries and illnesses in 2019. This includes over 400,000 nonfatal injuries and illnesses in the manufacturing sector. Safety is of paramount concern for both employees and employers at any workplace, and the wearing of Personal Protective Equipment (PPE) like helmets, gloves, masks, vests, etc., by workers is a cornerstone of workplace safety. By analyzing images/videos, the project will explore how computer vision can be used to create safer workplaces by ensuring PPE compliance and detecting any violations in real time.

Data-Enabled Mental Health: From Patterns to Interventions
Drishty Sobnath
Dubai
According to a new report from a project carried out by Harvard Graduate School of Education, young adults in the U.S. report twice the rates of anxiety and depression as teens. The report identifies a variety of stressors that may be driving young adults’ high rates of anxiety and depression. The proposed project can utilize a mixed-methods approach, including surveys, focus groups, and interviews, and use of machine learning and data science tools to predict or visualize patterns in young people's perceptions and experiences surrounding their mental health.

Artificial Intelligence of Things (AIoT) in Smart Cities for indoor air pollution prediction
Drishty Sobnath
Dubai
The accelerating convergence of artificial intelligence (AI) and the Internet of Things (IoT) has sparked a recent wave of interest in the Artificial Intelligence of Things (AIoT). At this point, most of society understands the issue of air pollution and its repercussions not only on the climate but on human health as well. However, not many of us seem to realise that indoor air quality is just as important. Unfortunately, indoor air is also susceptible to pollution and, as studies show, its pollution levels can be up to 8 times higher than those of outdoor air, while most people spend around 80 to 90% of their time indoors. By analysing historical data sets and the physical, chemical and biological characteristics of indoor air, different models can be evaluated to predict indoor air quality.

Serious Games
serious games
Mario Soflano
Edinburgh
Serious games are a category of games designed with a primary purpose other than pure entertainment. They leverage the engaging and interactive nature of gaming to achieve specific educational, training, health, or social objectives. Here are some key aspects of serious games: 1. Educational Content: Serious games incorporate educational material to teach specific knowledge or skills. This can range from academic subjects to professional training. 2. Engagement and Motivation: By using game mechanics such as points, levels, and rewards, serious games keep users motivated and engaged in the learning process. 3. Interactive Learning: Players actively participate in the learning process, which can enhance understanding and retention of information. 4. Real-World Applications: These games often simulate real-world scenarios, allowing players to practice and apply what they have learned in a safe environment. Serious games offer several benefits: • Enhanced Learning: The interactive nature of games can make learning more effective and enjoyable. • Skill Development: Players can develop a range of skills, from cognitive abilities to social and emotional skills. • Immediate Feedback: Games provide instant feedback, helping players understand their progress and areas for improvement. • Accessibility: Serious games can be accessed on various platforms, making them available to a wide audience. Serious games can be applied in: • Education: Used in schools and universities to teach subjects like math, science, history, and languages. • Healthcare: Helping patients manage chronic diseases, undergo rehabilitation, or learn about health and wellness. • Corporate Training: Providing employees with training in areas such as leadership, safety, and technical skills. • Social Change: Raising awareness about social issues and promoting positive behavior change. Serious games can be implemented for PC/console, mobile, virtual reality and augmented reality.

Location-based System
serious games
Mario Soflano
Edinburgh
Location-based systems (LBS) leverage geographic information to provide services and information tailored to a user’s location. These systems have transformative potential in both educational and commercial sectors, enhancing user experiences and operational efficiency. As technology continues to evolve, the potential applications of LBS will expand, offering even more innovative solutions for various sectors.

Digital Health
Mario Soflano
Edinburgh
Digital health encompasses the use of digital technologies to improve health outcomes, healthcare services, and overall well-being. With the aim of enhancing the efficiency and accessibility of healthcare, digital health integrates various technological advancements, including mobile health (for example, remote health monitoring), health information technology (such as electronic health records and health data analytics), wearable devices (such as fitness trackers and medical wearables) and telehealth (such as virtual consultations and remote diagnostics).

Autoformalization with LLMs and Proof Assistants
Kathrin Stark
Edinburgh
Are you interested in the intersection of machine learning, mathematics, and verification? Do you want to explore how machine learning models and proof assistants can be combined to automatically translate natural language mathematics into formal specifications? Then this honours thesis project is for you! In this project, you will work on testing how well LLMs perform on autoformalization, i.e. translating natural language mathematics into formal language. You will explore the capabilities and limitations, as well as evaluate its performance in comparison to existing methods for autoformalization. You will also have the opportunity to learn about proof assistants and their role in formal verification. If you are interested in this project, please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.

Develop solution strategies for a game or riddle
Kathrin Stark
Edinburgh
The goal of this project is to develop a base solution, and then iteratively improve on this solution while reasoning on the soundness of the improved solution. See for example https://www.cs.tufts.edu/~nr/cs257/archive/richard-bird/sudoku.pdf for a solution of Sudoku in Haskell. I have some ideas, but you are very welcome to bring your own ideas. If you are interested in this project or would like to discuss your own project ideas, please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.
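The cited Haskell solution is a good model; as a language-neutral starting point, a base Sudoku solution via naive backtracking (the kind of baseline the project would then iteratively refine and reason about) can be sketched in Python:

```python
def valid(grid, r, c, v):
    """Check whether value v may be placed at (r, c) without clashing
    with the row, column, or 3x3 box."""
    if v in grid[r]:
        return False
    if any(grid[i][c] == v for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[br + i][bc + j] != v for i in range(3) for j in range(3))

def solve(grid):
    """Backtracking Sudoku solver; grid is a 9x9 list of lists with
    0 marking empty cells. Fills the grid in place, returning True if
    a solution exists."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for v in range(1, 10):
                    if valid(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo and try the next value
                return False
    return True  # no empty cells left

grid = [[0] * 9 for _ in range(9)]
solved = solve(grid)
```

An iterative improvement, as in Bird's development, would replace brute-force backtracking with constraint propagation (pruning each cell's candidate set before searching) and argue that each refinement preserves the solver's correctness.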

How to ensure a program never has bugs? (Program Verification)
Kathrin Stark
Edinburgh
From security protocols used in online banking, through embedded control systems, to e-mail and disk encryption: every day, we use software that we rely on to be safe and secure. Many things can go wrong: there might be bugs in the program itself, or the compiler can produce incorrect machine code. Formal verification of a program allows us to prove indisputably – using only a small set of assumptions and deduction rules – that all inputs lead to the desired behaviour. This guarantee is particularly important if faulty software would lead to a significant loss or if the software has to withstand a determined attacker. For realistic programs, verifying rich specifications beyond shallow properties such as the absence of out-of-bounds array subscripts in a fully automated way is challenging due to the immense search space. Interactive proof assistants such as Coq or Isabelle allow humans to fill in where fully automated methods fail, by allowing such proofs to be developed in an interaction between humans and computers. For this reason, proof assistants have recently gained importance in industrial applications and are used by companies like Microsoft, Amazon, Apple, and Google. I'm looking for students who are interested in learning about the verification of programs. I also have ideas of variable difficulty from current research projects. Let's talk! If you are interested in this project or would like to discuss your own project ideas, please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.

Optimizing an OCaml Compiler
Kathrin Stark
Edinburgh
Are you interested in compilers and programming languages? Do you want to extend your knowledge in OCaml and optimize a compiler for a real course project? Then this honours thesis project is for you! In this project, you will work on extending the existing OCaml compiler for the F29LP course with several optimizations, such as live variable analysis and dead code removal. You will learn about compiler design and implementation, as well as gain experience with OCaml programming. Your work will involve researching and implementing the optimizations in the compiler, evaluating their effectiveness, and documenting your findings. If you are interested in this project, please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.
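The real work would happen inside the OCaml compiler for F29LP, but the idea behind liveness-based dead code removal can be sketched independently. The Python fragment below (with an invented three-address representation, purely for illustration) runs one backward liveness pass over a straight-line block and drops assignments whose results are never used:

```python
def dead_code_eliminate(instrs, live_out):
    """One backward liveness pass over a straight-line three-address
    block. Each instruction is (dest, op, args); an instruction is kept
    only if its destination is live after it. `live_out` is the set of
    variables live at the block exit."""
    live = set(live_out)
    kept = []
    for dest, op, args in reversed(instrs):
        if dest in live:
            kept.append((dest, op, args))
            live.discard(dest)                     # dest is (re)defined here
            live.update(a for a in args if isinstance(a, str))  # uses
        # otherwise the result is never read: drop the instruction
    return list(reversed(kept))

# a = 1; b = 2; c = a + b; d = a * 9  -- only c is live at block exit
block = [
    ("a", "const", (1,)),
    ("b", "const", (2,)),
    ("c", "add", ("a", "b")),
    ("d", "mul", ("a", 9)),
]
optimised = dead_code_eliminate(block, live_out={"c"})
```

Extending this to whole programs requires a fixpoint over the control-flow graph (each block's `live_out` is the union of its successors' live-in sets), which is exactly the analysis the project would implement in OCaml.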

Verification of Compiler Optimizations
Kathrin Stark
Edinburgh
Compiler optimization is the process of improving the performance of compiler-generated code by applying a series of transformations to the code. For realistic programs, verifying rich specifications beyond shallow properties such as the absence of out-of-bounds array subscripts in a fully automated way is challenging due to the immense search space. Interactive proof assistants such as Coq or Isabelle allow humans to fill in where fully automated methods fail, by allowing such proofs to be developed in an interaction between humans and computers. For this reason, proof assistants have recently gained importance in industrial applications and are used by companies like Microsoft, Amazon, Apple, and Google. In this project, you will work on proving the correctness of optimization steps in a compiler using the Coq proof assistant. Your work will involve implementing and verifying the optimization steps in the proof assistant, formalizing their correctness proofs, and documenting your findings. If you are interested in this project or would like to discuss your own project ideas, please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.

Writing a Lexer Generator
Kathrin Stark
Edinburgh
In this project, you will work on writing a lexer generator in a programming language of your choice. Lexer generators are tools that automatically generate the lexical analyzer component of a compiler based on a set of regular expressions. You will learn about compiler design and implementation, as well as gain experience with programming language theory and lexing. Your work will involve researching and implementing the lexer generator, evaluating its effectiveness, and documenting your findings. You will also have the opportunity to explore different programming languages and lexing techniques. If you are interested in this project or would like to discuss your own project ideas, please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.
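As a sketch of the core idea (Python's `re` module stands in for the regex-to-DFA compilation a real generator like lex/flex performs), a lexer generator can be as small as one alternation of named groups:

```python
import re

def make_lexer(token_spec):
    """Build a lexer from (token_name, regex) pairs, in priority order,
    by compiling them into a single alternation of named groups. A real
    lexer generator would instead compile the regexes to a DFA."""
    pattern = "|".join(f"(?P<{name}>{rx})" for name, rx in token_spec)
    regex = re.compile(pattern)
    def lex(text):
        tokens = []
        pos = 0
        while pos < len(text):
            m = regex.match(text, pos)
            if not m:
                raise SyntaxError(f"unexpected character {text[pos]!r}")
            if m.lastgroup != "SKIP":   # whitespace is discarded
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        return tokens
    return lex

# A tiny token specification for arithmetic expressions.
lex = make_lexer([
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
])
tokens = lex("x = 42 + y")
```

The project proper would replace the `re` dependency by implementing the pipeline itself: regex parsing, Thompson's construction to an NFA, subset construction to a DFA, and code generation for the resulting table.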

Your own project around compilers
Kathrin Stark
Edinburgh
If you have a project idea around compilers/F29LP, I am happy to meet and discuss it. I also have several ideas for suitable projects. Let's talk! Feel free to contact me at k.stark@hw.ac.uk or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/ so that we can discuss a possible project together.

Your own project around functional programming
Kathrin Stark
Edinburgh
If you have a project idea around functional programming, I am happy to meet and discuss it. I also have several ideas for suitable projects. Let's talk! If you are interested in this project please contact me at [k.stark@hw.ac.uk](mailto:k.stark@hw.ac.uk) or make a short appointment at https://outlook.office365.com/owa/calendar/KathrinsMeetingCalendar@heriotwatt.onmicrosoft.com/bookings/.

Master Class in anything around Verification/Interactive Theorem Proving/Language Processors
itp; verification
Kathrin Stark
Edinburgh
I'm happy to supervise any student in a master class on any of the above topics. There are many topics available; talk to me if you have an idea or are just interested in one of the topics.

A high level dependently typed dataflow language for hardware
Rob Stewart
Edinburgh
Dataflow languages are commonly used to program embedded systems such as Field Programmable Gate Arrays (FPGAs). Static dataflow models ease reasoning and compile-time scheduling, however their lack of expressivity constrains the programmer's ability to implement complex algorithms. Various approaches have been taken to 1) identify and 2) reason about static dataflow properties of dataflow programs, including static analysis and model checking. This project will take a new approach, which is to implement a static dataflow embedded domain specific language (DSL) in the dependently typed Idris language. This approach will use Idris's type checker to prove static properties of dataflow actors, and will infer data rates and ideal scheduling policies for compilation to FPGAs. A possible implementation plan for the project is: 1) Developing the dataflow EDSL in Idris. 2) Developing scheduling policies within the types of the EDSL. 3) A Verilog backend of the EDSL to target FPGAs.

Add library-level prefetching to Haskell
Rob Stewart
Edinburgh
A recent paper shows that adding prefetching to the implementation of operations over data structures (i.e. container libraries) can yield a significant speedup by hiding the latency of memory access. Here's the paper: "Prefetching in Functional Languages", ISMM 2020. https://www.cl.cam.ac.uk/~tmj32/papers/docs/ainsworth20-ismm.pdf That work was done in the context of the language OCaml. This project will investigate whether adding prefetching to Haskell libraries like the `containers` library can yield speedups in the same way. An additional dimension in the design space for Haskell is its lazy evaluation semantics, which might mean that prefetching benefits are even more significant for Haskell than they are for OCaml. This project will find out.

Haskell memory performance programmer feedback
Rob Stewart
Edinburgh
Haskell is an almost unique language, in the sense that it has lazy-by-default evaluation semantics and it is a pure language (no side effects). Laziness poses challenges for reasoning about memory performance and memory access behaviours. Recent tooling developments have enabled more precise memory profiling of Haskell code: https://well-typed.com/blog/2024/01/ghc-eras-profiling/ This project would evaluate the usefulness of these new tools, and look to extend them by integrating their reports into IDEs for source-code annotations.

Comparing deep learning accelerator hardware
Rob Stewart
Edinburgh
"AI at the edge" allows autonomous devices and smart sensors to perform tasks such as object detection, classification, speech recognition and complex text processing tasks -- in real time and with very low power requirements. This project will compare the performance and usability of two neural network accelerator devices: a Google TPU on a Coral.AI USB stick, and an Intel Neural Compute Stick. If the student wishes to go further, a third comparator would be programming a neural network into hardware fabric with an FPGA.

Compressing neural networks for FPGA based hardware accelerators
Rob Stewart
Edinburgh
For executing deep learning algorithms based on neural networks, there is a shift from expensive (in cost and energy) centralised GPUs to Edge Computing devices such as embedded CPUs and programmable hardware (Field Programmable Gate Arrays, FPGAs). Trained neural networks require hundreds of megabytes of memory and substantial computation resources. Emerging compression techniques can reduce those resource costs significantly; however, they also affect the accuracy of the networks. This project will explore industry-driven frameworks, e.g. Xilinx's Python-based FINN framework and Intel's Distiller framework, to assess speed, accuracy and resource-use metrics across a set of neural network models. The outcomes of the project will inform users of neural networks and embedded processors on how best to construct and refine deep learning models for application domains such as smart cameras for remote image processing, autonomous robots, and driverless vehicles.

Implementing Haskell with FPGA hardware
Rob Stewart
Edinburgh
The performance of functional programming language implementations until 15 years ago relied on increasing CPU clock frequencies. The last decade has seen the rise of the multi-core era to overcome this stall. Because multiple cores on a CPU share a single connection to main memory, general software implementations of parallel programming languages are hitting the limits of CPU architectures: the Von Neumann bottleneck. In very recent times, the fabric on which we compute has changed fundamentally. Driven by the needs of AI, Big Data and energy efficiency, industry is moving away from general purpose CPUs to efficient special purpose hardware e.g. Google's Tensorflow Processing Unit (TPU) in 2016, Huawei's Neural Processing Unit (NPU) in smartphones, and Graphcore's Intelligent Processing Unit (IPU) in 2017. This reflects a wider shift to special purpose hardware to improve execution efficiency. This BSc project will explore alternative approaches to implementing functional languages, by offloading runtime system components onto dedicated FPGA-based hardware implementations. For example: special purpose memory hierarchies to minimise memory traffic, prefetching in hardware, or garbage collection in hardware to reclaim memory with minimal latency. The project will work alongside the EPSRC HAFLANG project ( https://haflang.github.io ), where the aim is to develop a processor architecture for managed functional languages to significantly outperform the runtime and energy performance of CPUs. The student would be working alongside the HAFLANG project's postdoc researcher on this project, with frequent meetings with Rob Stewart.

Monitoring protocol compliance of IP hardware blocks
Rob Stewart
Edinburgh
Programmable hardware is increasingly used in many application areas, including Internet of Things and Smart devices, as well as safety critical systems. The correctness of hardware designs is therefore essential. When two hardware components communicate, they must do so via an agreed protocol. An example is the widely used AXI protocol. Many hardware designs are not open source, meaning you can use the generated IP hardware block from a vendor, but cannot verify its implementation by inspecting the hardware description code. How are you to trust a piece of hardware? So instead, one must inspect its communication behaviours to have confidence in its correctness. There is a tool called 'mbac', which generates hardware blocks from temporal properties that describe protocol rules. You can synthesise the generated hardware, meaning you can deploy it alongside closed source IP blocks, and it flags any time these properties are violated. This project will evaluate the mbac tool in terms of its usefulness for widely used protocol standards, its scalability to complex protocols, its effectiveness in catching hardware bugs, and potential practical deployment scenarios to make users aware of hardware bugs.
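As a software analogue of what an mbac-generated checker does in hardware, here is a minimal runtime monitor for a hypothetical request/grant property ("every request is granted within N cycles"); the signal names and latency bound are invented for illustration.

```python
def make_monitor(max_latency):
    """Runtime monitor for a hypothetical request/grant handshake:
    after `req` is asserted, `gnt` must follow within max_latency cycles.
    This mirrors, in software, a synthesised property checker that runs
    alongside a closed-source IP block and flags violations."""
    state = {"waiting": False, "cycles": 0, "violations": 0}
    def step(req, gnt):
        # Advance one clock cycle with the observed signal values.
        if state["waiting"]:
            state["cycles"] += 1
            if gnt:
                state["waiting"] = False
            elif state["cycles"] > max_latency:
                state["violations"] += 1
                state["waiting"] = False   # reset after flagging
        if req and not state["waiting"]:
            state["waiting"], state["cycles"] = True, 0
        return state["violations"]
    return step
```

In hardware, the same state machine would be expressed as a temporal property and synthesised next to the monitored block; the cumulative violation count is what the deployed checker would expose.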

Profiling Haskell's lazy evaluation
Rob Stewart
Edinburgh
The Haskell functional language is based on lazy evaluation, where a computation is not performed until its value is required. This has many formal and pragmatic advantages over the more common strict evaluation but carries some runtime overhead. This project will involve systematically profiling lazy, strict and lazy/strict hybrid Haskell benchmark implementations to expose the strengths and weaknesses of Haskell's non-strict evaluation.

Semantic Web type providers in F#
Rob Stewart
Edinburgh
Type providers are an idea from programming languages research whereby an external data source, such as a CSV file, is used to generate programming constructs. Typically these constructs are types (i.e. Java classes, or types in functional languages), but they can also be properties, or even functions or methods. In the semantic web world, the goal is to give data as much meaning as possible. In other words: what is the semantic meaning of an entity from a particular domain, and what is the relationship between entities in that domain? This project will investigate whether there is a deep connection between how domains are captured using ontologies from the semantic web, and type systems for functional languages. The student on this project may wish to explore building type providers in their favourite functional language. As a starting point, support for RDF data could convert an ontology schema into types of that programming language. For a richer investigation, the project will explore how ontological rules can be mapped to types and constraints in the target programming language. This project may raise some interesting questions: if type constraints derived from ontology rules are sufficiently strong, can we perform ontology reasoning to check that RDF semantic web data is sound and consistent? Can we do this at compile time, with type providers over OWL/schema data sources and RDF data sources? On the practical side of this work, take a look at the RDF type providers for the F# language available in the Iride library: https://github.com/giacomociti/iride That may be a starting point for any implementation work.
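To make the idea concrete, here is a toy runtime sketch in Python (a real F# type provider does this at compile time, during type checking): an ontology-like schema, invented for illustration, is turned into classes with checked properties.

```python
def provide_types(schema):
    """Toy 'type provider': turn an ontology-like schema (class name ->
    list of property names) into Python classes whose constructors
    reject unknown properties."""
    def make_init(props):
        def __init__(self, **kwargs):
            unknown = set(kwargs) - set(props)
            if unknown:
                raise TypeError(f"unknown properties: {unknown}")
            for p in props:
                setattr(self, p, kwargs.get(p))
        return __init__
    return {name: type(name, (), {"__init__": make_init(props),
                                  "properties": tuple(props)})
            for name, props in schema.items()}

# Hypothetical mini-ontology in the spirit of an RDF schema.
types = provide_types({
    "Person": ["name", "knows"],
    "Document": ["title", "author"],
})
alice = types["Person"](name="Alice")
```

The richer question the project asks is whether ontology rules (domain/range restrictions, cardinalities) can be mapped onto such generated types so that the host language's type checker performs part of the ontology reasoning.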

Special purpose hardware for RDF stream processing
Rob Stewart
Edinburgh
Stream processing is about doing one-pass execution of continuous queries over a potentially infinite stream of values. Streams come with different characteristics, e.g. data rates from low (one or two values an hour) to high (thousands of values a second), published either in a steady stream or in highly irregular bursts. Many use cases require joining values across streams and stored sources, as well as computing aggregation functions. To support these operations over potentially infinite streams, windowing operators are used to provide a scope for the operation. Additionally with RDF streams, there is also the possibility of performing inferencing over the stream of values, i.e. generating new data based on the content of the stream. All these needs pose interesting challenges for stream processing: how much data needs to be windowed to fire inference rules? And crucially, what processing hardware should be used to support a throughput of thousands of RDF data items a second? Field Programmable Gate Arrays (FPGAs) are programmable hardware chips, which when configured have a circuit that is precisely designed to meet an algorithmic need. For streaming domains they offer extremely high throughput, and use very little energy. This project will investigate the use of FPGAs for developing a special-purpose RDF stream processor. A key hardware design decision will be to decide how "programmable" the hardware is at runtime. In other words, should the hardware design allow runtime FPGA programming to (1) switch window operators, (2) change the window size, (3) upload new inference rules? A desirable artefact from this project will be an open source hardware design for an RDF stream processing hardware accelerator.
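A minimal sketch of the windowing-plus-inference pipeline, in Python rather than hardware; the triple format and the "high temperature" rule are invented for illustration.

```python
def tumbling_windows(stream, size):
    """Group a (potentially endless) stream of triples into fixed-size
    tumbling windows, yielding each full window for aggregation or
    inference. This is the windowing operator the proposed FPGA design
    would implement as a circuit."""
    window = []
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield list(window)
            window.clear()

def infer_high_temp(window, threshold):
    """Toy inference rule: derive a new triple whenever a reading in the
    window exceeds the threshold."""
    return [(s, "alert", "high-temp")
            for (s, p, o) in window
            if p == "temp" and o > threshold]
```

The project's "how programmable at runtime" question maps directly onto this sketch: in hardware, changing `size` or swapping `infer_high_temp` for another rule may require reconfiguring the FPGA.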

Static analysis of dataflow program performance
Rob Stewart
Edinburgh
Dataflow languages are good models for programming embedded and configurable hardware architectures, because the distributed memory model maps well to these architectures. There's a wide variety of dataflow models, from static dataflow all the way to dynamic dataflow. Static models are easier to reason about, e.g. how much data can be processed, but these programming models are restrictive. Dynamic models can express complex algorithms, but reasoning about runtime behaviour is much harder. This suggests a sweet spot: can we offer moderately expressive programming models without losing all ability to reason about performance, or to generate efficient hardware? This project will explore the idea of abstracting properties of dataflow programs, to compute what the throughput performance capabilities of such programs are, without needing to run them. The technology to integrate could be (1) HoCL for programming, (2) Kiter to determine throughput performance and (3) PREESM to turn programs into executables. Possibly also using DIF as an interchangeable dataflow model format. https://github.com/jserot/hocl/ https://github.com/bbodin/kiter https://preesm.github.io https://www.researchgate.net/publication/220714226_DIF_An_Interchange_Format_for_Dataflow-Based_Design_Tools

Targeting FPGAs with parallel functional languages
Rob Stewart
Edinburgh
FPGAs are reconfigurable chips and offer the promise of very high performance, low powered targets for accelerated computation. They have potential in many domains including High Performance Computing, Cloud Computing, embedded processing and autonomous robotics. FPGAs are usually programmed at very low levels with hardware description languages, and sometimes at the higher level C language. High level parallel array processing languages like APL, Accelerate and Chapel abstract above hardware, usually targeting multicore CPUs and GPUs. This project will involve writing an FPGA backend for the Accelerate DSL, an embedded language in Haskell. Backend options are OpenCL or C++, from which High Level Synthesis tools will generate FPGA hardware designs. This is a compiler research project that will establish how to produce high performance (outperforming parallel CPUs and GPUs) and efficient hardware from very high level array processing codes.

Verification aware quantisation of neural networks for FPGAs
Rob Stewart
Edinburgh
Neural networks have until recently predominantly been trained and executed on GPUs in data centres. The recent trend is to push trained neural networks to Edge Computing devices, such as smart phones, autonomous vehicles and smart cameras. One hardware architecture in this space is the FPGA: programmable hardware chips that can be deployed with a sensor for low-powered operation, reaching extremely high throughput. This makes FPGAs ideally suited for data-driven AI problems that involve high volumes of input data and where a high throughput of inferences is required. One downside of FPGAs and other Edge Computing AI accelerators is that they are limited in how much memory they have to store trained neural network parameters. The solution is to compress neural networks, e.g. by quantising the precision of their weights. Doing so has an inevitable effect on inference accuracy, but the loss is surprisingly small. Another, more serious, effect of quantisation is the loss of robustness a neural network might have against adversarial attack. Currently, quantisation algorithms focus on the trade-off between accuracy and memory requirements. This project will instead design a new quantisation algorithm, one guided by formal verification approaches such that reducing the precision of neural network parameters does not affect the robustness of the overall network to adversarial attack. This will be done in the context of the open source Brevitas component in the FINN compiler framework developed by Xilinx Research. This project can be done in close collaboration with Xilinx Research if the student wishes.
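The basic operation such a quantisation algorithm manipulates can be sketched as uniform symmetric quantisation, a simplification of what frameworks like Brevitas apply per layer:

```python
def quantize(weights, bits):
    """Uniform symmetric quantisation of a weight list to `bits` bits.
    Returns the integer codes and the scale needed to dequantise.
    Assumes at least one non-zero weight."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate real-valued weights from integer codes."""
    return [c * scale for c in codes]
```

Each weight is recovered only to within one quantisation step (the scale), and it is exactly this perturbation, applied network-wide, whose effect on adversarial robustness the project's verification-guided algorithm would bound.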

Profiling low-level memory access of Haskell
Rob Stewart
Edinburgh
Haskell is an almost unique language, in the sense that it has lazy-by-default evaluation semantics and it is a pure language (no side effects). Laziness poses challenges for reasoning about memory performance and memory access behaviours. Existing profiling tools for Haskell do not measure the latency of memory access, or the contention on the memory bus for parallel Haskell programs. This project will investigate how to use low-level tooling to evaluate the cost of Haskell's properties of (1) laziness and (2) immutability (purity), when it comes to memory access costs. E.g. how long do CPU cores need to stall waiting for code and data from memory, and how much contention is there on the serial memory bus when running parallel Haskell programs on up to 64 CPU cores? This project will involve lifting very low level profiling information into the context of Haskell user code.

Mechanising English writing checks
Rob Stewart
Edinburgh
William Strunk wrote the book "The Elements of Style" in 1918, and it remains influential and invaluable for English writers. It provides "principal requirements of plain English style". It aims to "lighten the task of instructor and student by concentrating attention on a few essentials, the rules of usage and principles of composition most commonly violated." This project will explore to what extent one can mechanise the English writing rules set out in this book. Similar attempts have been made to mechanise "linting" checks as software, e.g. textlint and its vast set of plugins. https://textlint.github.io/ Can William Strunk's book be implemented as one or several textlint plugins? Will users, e.g. academics writing papers and students writing dissertations, value the suggestions made? How accurate is the software to the book's given examples? Which rules in the book cannot be mechanised in this way?
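A few of Strunk's rules can be mechanised as regex lint checks in the style of a textlint plugin; the rule names and advice strings below are hypothetical simplifications.

```python
import re

# Hypothetical lint rules loosely based on Strunk: (name, pattern, advice).
RULES = [
    ("omit-needless-words", re.compile(r"\bthe fact that\b", re.I),
     "rewrite without 'the fact that'"),
    ("use-positive-form", re.compile(r"\bnot un\w+\b", re.I),
     "prefer the positive form"),
    ("avoid-qualifiers", re.compile(r"\b(rather|very|pretty much)\b", re.I),
     "omit weak qualifiers"),
]

def lint(text):
    """Return (rule name, matched text, advice) for each violation found."""
    return [(name, m.group(), advice)
            for name, rx, advice in RULES
            for m in rx.finditer(text)]
```

The interesting project questions start exactly where regexes stop: rules about parallelism, emphasis and paragraph structure need syntactic or semantic analysis, and identifying which rules fall on which side of that line is part of the work.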

BabyLM: Pretraining Language Models with a developmentally plausible corpus
deep learning neural networks language models
Alessandro Suglia
Edinburgh
A huge effort has been put into optimizing LM pretraining at massive scales in the last several years. While growing parameter counts often get the most attention, datasets have also grown by orders of magnitude. For example, Chinchilla sees 1.4 trillion words during training---well over 10000 words for every one word a 13-year-old child has heard in their entire life. The goal of this shared task is to incentivize researchers with an interest in pretraining or cognitive modelling to focus their efforts on optimizing pretraining given data limitations inspired by human development. Additionally, we hope to democratize research on pretraining—which is typically thought to be practical only for large industry groups—by drawing attention to open problems that can be addressed on a university budget.

Understanding and Scaling Pixel-based LLMs
deep learning neural networks language models computer vision
Alessandro Suglia
Edinburgh
Current Deep Learning models of language processing assume access to a tokenizer, a tool used to divide the input text into a sequence of tokens that can be more easily processed by Machine Learning algorithms. A tokenizer is built from a textual corpus, from which it derives the most frequent tokens used in the language. However, these representations have several shortcomings: 1) they are specific to each language; 2) they are sensitive to noise (e.g., spelling mistakes); and 3) they are hand-crafted and do not represent language input in a multimodal way using the visual or auditory input streams, as humans do. To overcome these bottlenecks, this project will explore "textless" NLP models that use visual and audio signals to derive latent conceptual representations.

Multimodal Playground: an interactive environment for language learning
deep learning neural networks language models computer vision game design
Alessandro Suglia
Edinburgh
Current large language models learn from static datasets made of text, images or videos. However, a learning agent should be able to learn from interaction with the world and with other agents. In this project, we will create a simulated environment based on Unity to facilitate learning about objects and tasks in a multimodal way.

Developing Vision+Language models for the Healthcare domain
deep learning neural networks language models computer vision
Alessandro Suglia
Edinburgh
The student will investigate recent advances in Vision+Language models and apply them to the challenging domain of medical imaging. The student will work with the supervisor to identify a domain of interest and scope it for the purpose of the master project.

ActionLLM: Developing Large Language Models that Learn To Use Tools
deep learning neural networks language models
Alessandro Suglia
Edinburgh
Current large language models like ChatGPT and Bard have remarkable abilities to generate very fluent natural language and they have achieved impressive performance on several benchmarks. However, for many relevant tasks in the real world, we have to make sure that these models can also interact with external sources (e.g., Wikipedia or search engines) as well as other external tools such as calculators, databases and so on. In this project, we will explore the field of large language models that can use external tools to achieve very specific goals and tasks.
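The core agent loop (parse a tool call the model emits, run the tool, splice the result back into the text) can be sketched as follows; the CALL syntax and the tool registry are invented for illustration, not the interface of any real system.

```python
import re

# Hypothetical tool registry: a (mock) model is prompted to emit lines
# like "CALL calculator: 6*7"; this loop executes them.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "lookup": lambda key: {"capital of France": "Paris"}.get(key, "unknown"),
}

CALL = re.compile(r"CALL (\w+): (.+)")

def run_with_tools(model_output):
    """Replace each tool-call the model emitted with the tool's result,
    the basic mechanism behind tool-using LLM agents."""
    def dispatch(m):
        name, arg = m.group(1), m.group(2)
        return TOOLS[name](arg) if name in TOOLS else m.group()
    return CALL.sub(dispatch, model_output)
```

Real systems add a feedback loop (the tool result is fed back to the model for further generation) and use structured function-calling APIs rather than regex parsing, but the dispatch step is the same.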

Learning to reason over long trajectories for Embodied AI tasks
deep learning neural networks embodied ai robotics
Alessandro Suglia
Edinburgh
In this project, we will explore efficient deep learning architectures for training artificial agents that can learn to execute actions in the environment. For instance, we will investigate whether it's possible to train models that can learn to play ATARI games.

Generative AI/Large Language Models for Robotics
generative ai robotics deep learning
Alessandro Suglia
Edinburgh
Explore state-of-the-art Machine Learning and Generative AI for real-world robotics tasks. One of the main objectives is to create a solution that can be deployed in the Home-Stretch 3 robot that we have at the National Robotarium: https://hello-robot.com/stretch-3-product

Open-Endedness is Essential for Artificial Superhuman Intelligence
deep learning neural networks language models ai
Alessandro Suglia
Edinburgh
In recent years there has been a tremendous surge in the general capabilities of AI systems, mainly fuelled by training foundation models on internet-scale data. Nevertheless, the creation of open-ended, ever self-improving AI remains elusive. In this position paper, we argue that the ingredients are now in place to achieve open-endedness in AI systems with respect to a human observer. Furthermore, we claim that such open-endedness is an essential property of any artificial superhuman intelligence (ASI). We begin by providing a concrete formal definition of open-endedness through the lens of novelty and learnability. We then illustrate a path towards ASI via open-ended systems built on top of foundation models, capable of making novel, human-relevant discoveries. We conclude by examining the safety implications of generally-capable open-ended AI. We expect that open-ended foundation models will prove to be an increasingly fertile and safety-critical area of research in the near future.

Vision and Language Models for Plant Disease Detection
deep learning neural networks language models computer vision
Alessandro Suglia
Edinburgh
Crop diseases are a major threat to food security, but their rapid identification remains difficult in many parts of the world due to the lack of the necessary infrastructure. The combination of increasing global smartphone penetration and recent advances in computer vision made possible by deep learning has paved the way for smartphone-assisted disease diagnosis. Using a public dataset of 54,306 images of diseased and healthy plant leaves collected under controlled conditions, we train a deep convolutional neural network to identify 14 crop species and 26 diseases (or absence thereof). The trained model achieves an accuracy of 99.35% on a held-out test set, demonstrating the feasibility of this approach. Overall, the approach of training deep learning models on increasingly large and publicly available image datasets presents a clear path toward smartphone-assisted crop disease diagnosis on a massive global scale.

LLM-based Copilot for Canvas
deep learning neural networks language models generative ai
Alessandro Suglia
Edinburgh
Generative AI solutions have powered many breakthroughs in several fields of AI. In this project, you will leverage Generative AI models and tools, to define a private and safe Generative AI copilot for teachers that is integrated as part of the Canvas system. One use case that we would be interested in is the automatic generation of assessment material based on the material available on Canvas (e.g., presentations, documents, etc.). More technically, we will explore agentic LLM solutions to approach this problem. We will use dedicated libraries such as Microsoft AutoGen or LlamaIndex to build an autonomous agent that can satisfy the queries of the user (i.e., teacher) and support them in generating material.

Multi-agent AI Systems using LLM
multi-agent systems llms generative ai deep learning
Alessandro Suglia
Edinburgh
Traditional Natural Language Processing systems have been designed with a single task in mind such as sentiment analysis, text classification, etc. Thanks to Large Language Models like ChatGPT, these AI systems can now solve multiple tasks depending on the text prompt that they receive. However, a single AI agent wouldn't be sufficient to solve very complex and multi-step tasks required by intricate real-world applications. In this project, you will work with the most recent LLMs and combine them in novel ways to create complex multi-agent applications.

Interpreting pixel-based large language models
deep learning; large language models; generative ai
Alessandro Suglia
Edinburgh

Improving usability of the π-base project (a database of topological spaces and theorems)
Dash Swaraj
Edinburgh
π-base is an extensive database of topological spaces and theorems. It is of use to mathematicians looking to find: 1. examples of topological spaces satisfying certain properties, 2. theorems that can help them deduce other properties of their spaces, and 3. counter-examples to conjectures. Currently it lists theorems such as "if your space has property A then it has property B", "if your space has property ¬C & D then it has property B", etc. It would be nice to add new functionality to this website so that a user can list a property they want to prove and it can give them a tree of all the sequences of theorems they may need in order to prove it. This would be a useful project for someone interested in typescript, web programming, and someone wanting to contribute to an open source project.
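The requested functionality is essentially forward chaining over the implication graph; a minimal Python sketch follows (the theorem encoding as premises/conclusion pairs is invented for illustration, not π-base's actual data format).

```python
def derive(theorems, known, goal):
    """Forward chaining over implication theorems, each given as
    (premises, conclusion): repeatedly apply any theorem whose premises
    all hold until `goal` is derived. Returns the list of theorems
    applied, in order, or None if the goal is unreachable."""
    known = set(known)
    applied = []
    changed = True
    while changed and goal not in known:
        changed = False
        for premises, conclusion in theorems:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                applied.append((premises, conclusion))
                changed = True
    return applied if goal in known else None
```

Turning the flat list of applied theorems into the tree the proposal asks for is then a matter of recording, for each derived property, which theorem produced it and recursing on its premises.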

Talking Hairy Coo
rag llm speech-to-text text-to-speech openai
Ian Tan
Malaysia
(Can be 1 or 2 students, and will work out the scope accordingly) This project is to develop an audible talking chatbot that you interact with using your voice instead of text. The chatbot is to be powered by a text based Retrieval Augmented Generation (RAG) application but with the interface being speech recognition (using OpenAI speech-to-text) and responding using text-to-speech (also OpenAI text-to-speech). In summary, it is a project that is: i) Voice-powered: It uses speech recognition technology to understand what you say. ii) Conversational: It can respond to your questions and requests in a natural, spoken way, like having a chat with a real person. iii) Interactive: You can have a back-and-forth conversation with the chatbot, asking follow-up questions or providing additional information.
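The retrieval step at the heart of the RAG pipeline can be sketched with simple word overlap standing in for embedding similarity; the speech-to-text and text-to-speech calls, which would wrap this in a real system, are omitted here.

```python
def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the (transcribed) query and
    return the top k, a stand-in for embedding-based retrieval in the
    full RAG pipeline."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]
```

In the full system, the user's speech is transcribed to produce `query`, the retrieved passages are inserted into the LLM prompt, and the generated answer is spoken back with text-to-speech.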

Automated Essay Grading using Fine-Grained Linguistic Features
essay grading machine learning dimensionality reduction feature extraction
Ian Tan
Malaysia
This project reproduces recent work on the intricate relationship between fine-grained linguistic features and writing quality, based on the paper "Incorporating Fine-Grained Linguistic Features and Explainable AI into Multi-Dimensional Automated Writing Assessment" by Tang et al. (2024). The paper harnessed computational analytic tools and Principal Component Analysis (PCA) to distill and refine linguistic indicators for model building and construction. The reproduction is to be scoped accordingly, likely with a subset of the feature extraction tools and without the explainable AI component.
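A few fine-grained indicators of the kind fed into PCA can be extracted with the standard library alone; these are hypothetical simplifications of the computational tools Tang et al. used, not their actual feature set.

```python
import re
from statistics import mean

def extract_features(essay):
    """Toy lexical/syntactic indicators: mean sentence length (a
    syntactic-complexity proxy), mean word length and type-token ratio
    (lexical-sophistication proxies)."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay)
    return {
        "mean_sentence_length": mean(len(re.findall(r"[A-Za-z']+", s))
                                     for s in sentences),
        "mean_word_length": mean(len(w) for w in words),
        "type_token_ratio": len({w.lower() for w in words}) / len(words),
    }
```

In the reproduced pipeline, many such indicators per essay form a feature matrix which PCA then reduces before the grading model is fitted.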

Report Writer for Authors' Publications
scopus information extraction api report writing publication
Ian Tan
Malaysia
This project is to use the Elsevier Developer Product API to extract a list of authors' publications based on a range of dates and either develop or use a reporting tool to allow for flexible output format, including MS-Excel readable formats. In summary: 1) An administrator interface, which will include the management of authors 2) Extract using the Elsevier API (and from Google Scholar or other indexing) using a selected list of authors 3) Store the raw format 4) Allow a configurable report output format based on the raw format
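The report-output step (item 4) can be sketched with the csv module; the record field names below are hypothetical placeholders, not the actual schema returned by the Elsevier API.

```python
import csv
import io

def publications_report(records, fields=("author", "title", "year", "doi")):
    """Format raw publication records (dicts with hypothetical field
    names) into an MS-Excel-readable CSV string, keeping only the
    configured columns and ignoring any extra fields in the raw data."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields),
                            extrasaction="ignore")
    writer.writeheader()
    for rec in records:
        writer.writerow(rec)
    return buf.getvalue()
```

Making `fields` configurable per report is one way to satisfy the "configurable report output format based on the raw format" requirement; the stored raw records stay untouched.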

Real and Fake Face Images Detection using Machine Learning
Nurul Ain Toha
Malaysia
This project focuses on developing a machine learning model to accurately detect real and fake face images using the "Real and Fake Face Detection" dataset from Yonsei University on Kaggle. With the rise of deepfake technology, which creates hyper-realistic synthetic faces through methods like GANs (Generative Adversarial Networks), this project aims to build a model that can distinguish between authentic and AI-generated faces. The model will utilize deep learning, specifically Convolutional Neural Networks (CNNs), and transfer learning techniques to enhance performance. By addressing the growing need for digital media authenticity, the project contributes to combating the misuse of AI in generating fake images.

Data Science Project
Adrian Turcanu
Dubai
To be updated.

Event-B models of Spiking Neural P systems
Adrian Turcanu
Dubai
Event-B is a modelling language that can be used to specify mathematical models of transitional systems. Spiking neural P systems are the result of introducing the idea of neurons into membrane computing. The main goal of this project is to give a methodology for obtaining the Event-B model of an SN P system, to apply it to some examples, and to verify the properties of the model using ProB, a model checker integrated into a platform called Rodin.
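One synchronous transition of a toy SN P system, the kind of step an Event-B model would formalise as events, can be simulated as follows; for illustration the spiking rules are restricted to the simple form a^c -> a (consume c spikes, emit one spike to each neighbour).

```python
def step(spikes, rules, synapses):
    """One synchronous step of a toy spiking neural P system.
    spikes:   neuron -> current spike count
    rules:    neuron -> c, meaning rule a^c -> a is enabled when the
              neuron holds at least c spikes
    synapses: neuron -> list of target neurons"""
    fired = [n for n, c in rules.items() if spikes.get(n, 0) >= c]
    new = dict(spikes)
    for n in fired:
        new[n] -= rules[n]                     # consume c spikes
        for target in synapses.get(n, []):
            new[target] = new.get(target, 0) + 1   # emit one spike
    return new
```

An Event-B model would capture the same transition as guarded events (the guard being the enabling condition spikes[n] >= c), with ProB used to check invariants such as spike counts never going negative.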

Using model checking on Event-B models of non-deterministic algorithms
Adrian Turcanu
Dubai
Event-B is a mathematical modelling language that can be used to model various transitional systems. The aim of the project is to develop a methodology for constructing Event-B models of non-deterministic algorithms, to apply it to several case studies, and to investigate these using the model checker ProB.

Cryptography Project
Adrian Turcanu
Dubai
TBU

Employing Robotic Process Automation (RPA) for Streamlining a Business Process
robotic process automation rpa
Cristina Turcanu
Dubai
RPA bots could be implemented in areas as diverse as finance, compliance, legal, customer service, operations, and IT (student’s choice). Students can choose to implement attended automation, i.e., software robots that work alongside humans to share the workload in real time. In the case of unattended robots, they should be scheduled to handle long processes or automations without the need for human interaction. Prove the usability of RPA in several use cases (https://docs.uipath.com/robot/docs/attended-vs-unattended).

Process Mining and RPA: benefits and challenges in the Industry 4.0
robotic process automation rpa process mining
Cristina Turcanu
Dubai
A comparative study of Process Mining and RPA, highlighting the benefits of combining them as well as the key differences between the two. Provide an overview of the context and an end-to-end viewpoint to enhance processes and ensure the successful delivery of automation-driven outcomes. Research how process mining helps identify the most suitable processes for RPA. The project should contain some use cases.

Expand automation by using Robotic Process Automation combined with Machine Learning models.
rpa robotic process automation machine learning ml
Cristina Turcanu
Dubai
Highlight the benefits of using Automation (RPA) combined with ML models. The research should involve some use cases. Explain Intelligent Process Automation.

Process Mining powered by Machine Learning
process mining machine learning
Cristina Turcanu
Dubai
Highlight the benefits of using Process Mining combined with ML models. The research should involve some use cases.

Voice conversion application
Adrian Turcanu
Dubai
TBC

Machine Learning Applications
Cristina Turcanu
Dubai

Blockchain Applications
Cristina Turcanu
Dubai
Investigate the use of blockchains to different sectors of activity, e.g - Lending - Real estate (maybe with cryptocurrency payments) - Voting - Data storage - other

Assessing the Impact of Climate Change on Insurance Claims Using Advanced Statistical Approaches
George Tzougas
Edinburgh
Climate change-induced weather hazards are significantly straining property insurance, resulting in extensive damage and substantial claims, particularly in vulnerable regions. A recent report by the European Insurance and Occupational Pensions Authority (EIOPA) reveals that many non-life insurance businesses adjust their pricing annually based on recent events to implicitly consider climate change, given their short-term contract duration. However, as the report also highlights, this approach may have detrimental consequences. Specifically, the sector requires foresight to understand the impact of climate change, anticipate higher premiums, prioritize adaptation and mitigation measures, and continuously monitor trends for informed decision-making. Government interventions to share the premium burden are particularly crucial, especially for economically disadvantaged policyholders, to ensure equitable access to insurance and enhance overall societal resilience against climate risks. This project will enhance conventional regression models by incorporating advanced statistical techniques to develop more accurate climate-related claim frequency and severity models for the property insurance market. Unlike traditional models, which often overlook detailed risk factors, our approach will include specific property and content risks while exploring their complex interactions with weather-related hazards. This integration is expected to improve the prediction accuracy of claim numbers and costs, providing a more robust framework for managing the increasing risks associated with climate change and offering greater adaptability to evolving conditions. The proposed approach is expected to significantly enhance predictive performance, enabling actuaries to ensure that premiums accurately reflect the evolving weather-related risks.
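As a minimal illustration of the frequency-severity decomposition that such claim models build on (the coefficients and the weather-hazard covariate below are invented for illustration, not estimates from any real portfolio):

```python
import math

# Sketch of the frequency-severity decomposition underlying claim models:
# under independence, the expected aggregate loss per policy is
# E[S] = E[N] * E[X], with claim count N modelled by a Poisson regression
# whose log-rate is linear in a hypothetical weather-hazard covariate.

def expected_aggregate_loss(beta0, beta1, hazard, mean_severity):
    """E[S] = E[N] * E[X] with N ~ Poisson(exp(beta0 + beta1 * hazard))."""
    frequency = math.exp(beta0 + beta1 * hazard)   # expected claim count
    return frequency * mean_severity               # expected total cost

# Raising the hazard index by 1 multiplies expected frequency by exp(beta1).
low = expected_aggregate_loss(-2.0, 0.5, 1.0, 1000.0)
high = expected_aggregate_loss(-2.0, 0.5, 2.0, 1000.0)
```

The project's advanced models would replace the single covariate with detailed property, content, and weather-hazard risk factors and their interactions.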

Computer Vision related Projects where you process Image or Video Data and use ML for classification or prediction
Md Azher Uddin
Dubai
Computer Vision

Depression Analysis from audio or video data
Md Azher Uddin
Dubai
Computer Vision

Dynamic Facial Emotion Recognition from video data
Md Azher Uddin
Dubai
Computer Vision

Dynamic Scene Understanding from video data
Md Azher Uddin
Dubai
Computer Vision

Indian Hand Sign Language Recognition from video data
Md Azher Uddin
Dubai
Computer Vision

Healthcare Research Project
Md Azher Uddin
Dubai
Healthcare analysis

Video Annotation
Md Azher Uddin
Dubai
Computer Vision

Automated Paddy Disease Classification Using deep and handcrafted Feature Fusion
Md Azher Uddin
Dubai

Handwriting-Based Gender Classification Using deep and handcrafted feature fusion
Md Azher Uddin
Dubai

Flood prediction system using machine learning approach
Abrar Ullah
Dubai
Designing a flood prediction system using a machine learning approach.

Hybrid Authentication for eCommerce Applications
Abrar Ullah
Dubai
Research and develop a hybrid authentication approach for the security of eCommerce applications.

Service Oriented Architecture for Smart homes
Abrar Ullah
Dubai
Service Oriented Architecture for Smart homes

Developing and Fine Tuning LLM
Abrar Ullah
Dubai
Developing a Q&A system on biomedical data using a Large Language Model (LLM). An existing pre-trained model can be used and fine-tuned on new data.

Emotion-Aware Interfaces for Student Wellbeing Monitoring in Online Classes
Abrar Ullah
Dubai
Emotion-Aware Interfaces for Student Wellbeing Monitoring in Online Classes

Automating Software Testing Using AI and Machine Learning Techniques
Abrar Ullah
Dubai
Automating Software Testing Using AI and Machine Learning Techniques

Optimizing Cloud-Native Applications for Scalability and Performance
Abrar Ullah
Dubai
Optimizing Cloud-Native Applications for Scalability and Performance

Improving Legacy Systems through transformation from Monolithic to Microservice Architecture
Abrar Ullah
Dubai
Improving Legacy Systems through transformation from Monolithic to Microservice Architecture

Meta Model for Enterprise Security
Abrar Ullah
Dubai
Meta Model for Enterprise Security

Deep Learning-based Characterisation of Protein Aggregation in amyotrophic lateral sclerosis (ALS) - Collaboration with University of Aberdeen
machine learning deep learning medical imaging
Marta Vallejo
Edinburgh
Amyotrophic lateral sclerosis (ALS) is a rapidly debilitating neurodegenerative disease that affects motor neurons. Patients develop progressive muscle weakness, leading to death from respiratory failure, typically 3–5 years after symptom onset. ALS affects 1.75–3 out of 100,000 individuals per year. The existence of protein aggregates (TDP-43) in affected motor neurons remains a poorly understood hallmark. This project aims to increase the understanding of these structures by applying different machine learning techniques to a real clinical immunohistochemistry dataset collected by the University of Edinburgh, extending the understanding of TDP-43 aggregates at an individual level and characterising in more detail how distinct species of aggregates and their distributions present in different cells and different patients.

Multimodal Deep Learning diagnosis applied to clinical assessments in Parkinson’s disease - Collaboration with York University
machine learning deep learning sensor data
Marta Vallejo
Edinburgh
Parkinson’s Disease (PD) is a neurodegenerative disease of high incidence in the ageing population. This project aims at the application of deep learning technologies to a clinical dataset that contains information on patients with prodromal or early-stage PD. By analysing and processing digitalised movement data captured by three standard clinical assessments, the classifier will be expected to characterise bradykinesia, a slowing of movement, which is the fundamental motor feature of PD. The complex nature of bradykinesia makes it difficult to reliably identify it, particularly at the early stages of the disease (Ahlrichs and Lawo, 2013). The types of clinical assessments used in this study are the following: * Finger tapping * Hand pronation-supination * Hand opening-closing * Hand movements measured by accelerometers

Approaching a multi-label classification problem for the diagnosis of ear diseases using machine learning techniques
machine learning deep learning medical imaging
Marta Vallejo
Edinburgh
Traditional methods of diagnosing ear diseases often involve manual interpretation of clinical symptoms, which can lead to subjective results and delays in accurate treatment. This project seeks to develop a robust and accurate diagnostic tool that can analyse a range of ear-related symptoms to identify and classify multiple co-occurring ear diseases, a multi-label classification problem. The project uses a medical image dataset collected in Chile to train advanced machine learning models to recognise patterns indicative of different ear conditions, including otitis media, otitis externa, and so on. The success of this project has the potential to lead to a journal publication by providing results that improve patient outcomes, reduce misdiagnoses, and provide a reliable and accessible tool for healthcare professionals. This project is in collaboration with Dr Fernando Auat (Harper University).

Health Data Visualisation and Monitoring with Extended Reality
extended reality data visualisation wearables hololens 2 healthcare
Marta Vallejo
Edinburgh
Description: Develop an application that visualises health data from wearables (to be decided), such as heart rate and blood pressure, in real-time through Microsoft Hololens 2. This system could help users keep track of their health status and make informed decisions or be used for carers to monitor home patients (elderly or other frailty groups). This project will also be supported by Dr Alistair McConnell (alistair.mconnell@hw.ac.uk).

Detecting COVID-19 Through Tongue Image Analysis Using Advanced Neural Networks
machine learning deep learning medical imaging segmentation data augmentation
Marta Vallejo
Edinburgh
COVID-19 is an infectious disease that typically presents with mild to moderate respiratory symptoms that, in severe cases, can lead to pneumonia or even death. This underscores the critical importance of non-invasive, cheap early detection methods. In a previous project, the YOLOv8 neural model was trained with real tongue images captured by clinicians using smartphones. The aim was to register the area of interest and standardise the dataset using semi-supervised learning techniques. A very basic convolutional neural network was implemented, yielding promising initial results. Project Objectives: Based on last year's outcomes, the extension of this project includes the following key objectives: 1.- Final Model Implementation: Develop and implement the final classification model(s) to evaluate the suitability and performance of the dataset. 2.- Data Augmentation Techniques: Create and apply relevant data augmentation techniques to enhance the robustness of the model and ensure a balanced dataset. Optional Enhancements: 3.- Registration Model Improvement: Refine the existing registration model to increase accuracy. 4.- Front-End Application Development: Design and implement a user-friendly front-end application to facilitate the use of the model in real-world scenarios. If the output of the project is satisfactory, it is encouraged to publish the results in a journal or conference paper. This project is in collaboration with Dr Fernando Auat (ISSS/EPS).

The Pro-Act Dataset: Exploring Machine Learning Opportunities in ALS Research
machine learning deep learning healthcare
Marta Vallejo
Edinburgh
The Pro-Act (Pooled Resource Open-Access ALS Clinical Trials Database) stores a wealth of data crucial for understanding Amyotrophic Lateral Sclerosis (ALS). This project aims to identify promising research questions and design proof-of-concept machine learning models utilising the Pro-Act dataset. The study of these research questions could uncover key insights into ALS progression, prognosis, and treatment response, culminating in the development of models and showcasing the potential of machine learning and the Pro-Act dataset in advancing ALS research. Objectives: 1.- Conduct exploratory data analysis to understand the structure and characteristics of the Pro-Act dataset. 2.- Identify research questions relevant to ALS prognosis, disease progression, and treatment response. 3.- Design and propose machine learning models to address the identified research questions. 4.- Develop a proof-of-concept machine learning model using a subset of the Pro-Act dataset.

Exploring Attention and Cognitive Responses During Walking Using Pupil Labs "Pupil Invisible" Glasses
machine learning data analysis computer vision eye-tracking technology pupil labs glasses visual attention
Marta Vallejo
Edinburgh
Attention and cognitive responses are crucial aspects of human behaviour, especially during tasks requiring simultaneous motor and cognitive functions, such as walking. The Pupil Labs "Pupil Invisible" glasses offer a unique opportunity to capture first-person video and eye-tracking data, providing detailed insights into these processes. The primary objective is to gather insights into how individuals of various age groups navigate and respond to their environment. Secondary objectives include comparing attention levels across different age ranges, identifying environmental factors influencing cognitive responses, and developing a comprehensive dataset for future research on attention and mobility. The expected outcomes of this project include a detailed analysis of how different environments affect attention and cognitive responses during walking, insights into age-related differences in attentional focus and distraction, and a valuable dataset for future research on mobility and cognitive health.

Digital Twins for Optimal Indoors Multi-Sensor Placement
Marta Vallejo
Edinburgh
Social care throughout the UK is under unprecedented pressure due to our ageing population. It is essential to facilitate elderly people's longer-term living independently at home. Internet of Things (IoT) and artificial intelligence (AI) technology can be used to automatically ensure their safety by using sensors that could monitor and alert them about different risks. For instance, it has become possible to determine whether a person is present, detect falls and abnormalities in behaviours, and the misuse of appliances like gas stoves or fires. However, there are still some issues that need to be overcome in order to implement such systems more practically. Careful sensor selection and placement are critical issues in the design of an effective health monitoring system. An effective deployment requires sensor selection and placement that ensures adequate coverage. Ideally, an optimal sensor placement is reachable such that the deployment cost is minimised but the level of protection afforded is maximised. However, in some cases, the location of sensors is restricted and integrated into hubs. Meanwhile, individual sensors need to be dispersed throughout the environment to maximise performance. This project proposes a solution using digital twins of a set of objective homes and a corresponding set of sensors and/or robots. Their optimal location should be determined using multi-objective optimisation techniques. The developed digital solutions will be finally validated and deployed in the Lara lab (National Robotarium). The developed tool should allow for uploading/creating environment layouts through a (simple) user interface. It should also automatically select appropriate sensors based on desired targets (e.g., user location or specific activities) and subsequently optimise the placement of those sensors.

Advanced deep learning methods for virtual H&E staining with fluorescence lifetime imaging microscopy - In collaboration with University of Edinburgh
deep learning lung cancer
Marta Vallejo
Edinburgh
H&E staining, the “gold standard” for cancer detection and diagnosis, is a routine test in clinical cancer pathology. Previous work has shown that fluorescence lifetime microscopic images can be directly translated into virtual H&E staining with superior quality (Wang, 2024), allowing rapid and precise cancer diagnosis without requiring the conventional tissue staining procedure. This project aims to explore advanced deep-learning methods for optimal virtual H&E staining, including, but not limited to, vision transformers and diffusion models. Wang, Q., Akram, A.R., Dorward, D.A., Talas, S., Monks, B., Thum, C., Hopgood, J.R., Javidi, M. and Vallejo, M., 2024. Deep learning-based virtual H&E staining from label-free autofluorescence lifetime images. npj Imaging, 2(1), p.17.

Exploring Machine Learning Models to Uncover Pathways in ALS Pathogenesis Using Immunohistochemical Features
machine learning medical data
Marta Vallejo
Edinburgh
This project invites students with a machine learning background to investigate the complexities of ALS pathogenesis by applying diverse machine learning models to an existing dataset. The study involves patients with a specific ALS-related mutation (C9orf72 HRE), whose data includes thousands of post-mortem tissue images with quantified immunohistochemical markers for microglial activation and protein misfolding. Using features extracted from a previous study, you will assess model performance and predictive accuracy using methods beyond the random forest approach originally applied. You will experiment with advanced algorithms such as support vector machines, gradient boosting, and neural networks to identify relationships within the dataset and to investigate which features or feature combinations best classify disease status and predict clinical outcomes. By implementing and comparing different machine learning models, students will gain insight into feature importance and model interpretability in biomedical data, with a focus on neurodegenerative disease applications. This project offers a hands-on opportunity to contribute to the understanding of ALS clinical heterogeneity and to test innovative model approaches, with the potential to inform future trial designs and therapeutic strategies for ALS.

Learning to play games from experts
Patricia Vargas
Edinburgh
In this project, we will assess the ability of classification models to create an intelligent agent for video-game playing by learning from expert data. The idea consists of first capturing a sample of play sessions from expert players to create a training data set. Next, we will apply different Machine Learning models and a Symbolic Classification model to create an intelligent agent that mimics the actions of the expert player and evaluate the extrapolation abilities for later stages. We will also evaluate different approaches that help to improve the extrapolation abilities of the model and assess the performance of the agent by how far they can play the game. Additionally, we will debug the symbolic models to understand the agent behaviour and improve their performance.
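The learning-from-demonstrations idea described above can be sketched with a toy 1-nearest-neighbour policy (the game states, feature vectors, and action names below are invented for illustration; a real project would use richer features and trained classifiers):

```python
# Hypothetical sketch of behavioural cloning: the agent replays the action
# the expert took in the most similar recorded state (1-nearest-neighbour
# over state feature vectors, squared Euclidean distance).

def nearest_action(state, expert_states, expert_actions):
    """Return the expert action whose recorded state is closest to `state`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(range(len(expert_states)),
               key=lambda i: dist(state, expert_states[i]))
    return expert_actions[best]

# Invented expert demonstrations: (player_x, obstacle_x) -> action taken.
demos = [((0.0, 1.0), "right"), ((1.0, 0.0), "left"), ((0.5, 0.5), "jump")]
states = [s for s, _ in demos]
actions = [a for _, a in demos]
```

Extrapolation to later game stages is exactly where such a memorising policy breaks down, which is what the project's symbolic and ML models are meant to improve on.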

Federated Learning for Social Robots performing Activity Recognition in Ambient Assisted Living environments.
python machine learning basics at least one machine learning framework
Patricia Vargas
Edinburgh
In the context of assistive technologies for the elderly or people with disabilities, intelligent environments have been designed to empower this public with more autonomy in daily activities. To this aim, sensors and actuators embedded in the environment or in social robots might be orchestrated to produce helpful behaviours. In any case, deploying this type of technology requires that the system identifies the state of the environment and the users. This understanding can be achieved through activity recognition methods, many of which are presented in the literature with good results in several applications. However, these methods usually require data from several users to be concentrated in a centralised computational base to train machine learning models. This requirement, especially considering modalities such as videos or audio, raises ethical and legal concerns regarding data privacy. In this work, we propose to train the models locally for each participant using a Federated Learning approach to induce models based on public datasets such as the HWU-USP dataset. This approach preserves privacy because it does not require that the data is transferred out of the user’s environment, only partially trained models. Metrics such as time elapsed and accuracy will be evaluated.
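The privacy-preserving aggregation step at the heart of this approach can be sketched as FedAvg-style weight averaging (a simplified illustration with flat weight vectors; real frameworks average full parameter tensors per layer):

```python
# Minimal sketch of federated averaging (FedAvg): each participant trains
# locally and only model weights (never raw video/audio data) are sent to
# the aggregator, which computes a data-size-weighted average.

def fed_avg(client_weights, client_sizes):
    """client_weights: list of weight vectors; client_sizes: samples per client."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    avg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for j in range(dim):
            avg[j] += w[j] * n / total   # weight clients by local data size
    return avg

# Two clients with unequal data; the average leans toward the larger client.
global_w = fed_avg([[1.0, 0.0], [0.0, 1.0]], [3, 1])
```

Only these aggregated weights leave each user's environment, which is how the approach sidesteps the data-centralisation concerns mentioned above.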

Neurorobotics Models to Understand the Neural Mechanisms Underlying Parkinson’s Disease.
Patricia Vargas
Edinburgh
This project is part of a wider project and aims to contribute to the development and understanding of a Neurorobotics model of Parkinson’s Disease (PD), the Neuro4PD project (http://www.macs.hw.ac.uk/neuro4pd/). You will work with a neurorobotics model of PD currently being developed by our team with the goal of further understanding its computational properties. In particular, your work will investigate the computational capabilities of the BG-T-C loop in generating diverse control signals that can be exploited in robotics tasks. Then, you will compare neural computations in the BG-T-C loop with and without a robotic body. Overall, your work will be a further step in understanding the neural mechanisms underlying PD.

Neurorobotics Models Uncovering Neural Sensory Processing and Muscle Control.
Patricia Vargas
Edinburgh
This project is part of a wider project and aims to contribute to the development and understanding of a Neurorobotics model of Parkinson’s Disease (PD), the Neuro4PD project (http://www.macs.hw.ac.uk/neuro4pd/). In this project, you will work with a neurorobotics model currently being developed by our team with the goal of further understanding how complex motor control can be accomplished from realistic computational neural models embedded in simulated and real robots engaged in challenging tasks. In particular, you will explore current neuroscience findings with the goal of translating core neural mechanisms to practical robotic controllers. Overall, your work may also lead to novel AI architectures for embedded systems.

Trustworthy serverless machine learning on heterogeneous and distributed data and devices
machine learning deep learning federated learning ai computer vision nlp multi-modality iot
Chengjia Wang
Edinburgh
Deep convolutional networks have been widely deployed in modern cyber-physical systems performing different visual classification tasks. As fog and edge devices have different computing capacities and perform different subtasks, models trained for one device may not be deployable on another. Knowledge distillation can effectively compress well-trained convolutional neural networks into light-weight models suited to different devices. However, due to privacy issues and transmission costs, the manually annotated data for training deep learning models are usually collected gradually and archived at different sites. Simply training a model on powerful cloud servers and compressing it for particular edge devices fails to use the distributed data stored at different sites. This offline training approach is also inefficient at dealing with new data collected from the edge devices. To overcome these obstacles, in this project a heterogeneous brain storming (HBS) method is implemented and developed for object recognition tasks in real-world Internet of Things (IoT) scenarios. This method enables flexible bidirectional federated learning of heterogeneous models trained on distributed datasets with a new “brain storming” mechanism and optimizable temperature parameters.
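The temperature-scaled distillation signal mentioned above can be sketched as follows (a generic soft-target cross-entropy, not the project's specific HBS formulation; logit values are illustrative):

```python
import math

# Sketch of temperature-scaled knowledge distillation: a teacher's logits
# are softened with temperature T so a light-weight student can learn from
# inter-class similarities ("dark knowledge"), not just the top label.

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy of student soft predictions against teacher soft targets."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_t, p_s))

# Higher temperature flattens the teacher distribution.
sharp = softmax([5.0, 1.0, 0.0], temperature=1.0)
soft = softmax([5.0, 1.0, 0.0], temperature=4.0)
```

Making the temperature itself optimizable, as the proposal suggests, turns the amount of softening into a learned quantity per model pair.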

AI-aided drug discovery using graphical neural network: Retrosynthesis with simulated restriction
ai machine learning graphical neural network deep learning drug discovery nlp
Chengjia Wang
Edinburgh
As a fundamental problem in chemistry, retrosynthesis is the process of decomposing a target molecule into readily available starting materials. It aims at designing reaction pathways and intermediates for a target compound. The goal of artificial intelligence (AI)-aided retrosynthesis is to automate this process by learning from previous chemical reactions to make new predictions. Although several models have demonstrated their potential for automated retrosynthesis, there is still a significant need to further enhance prediction accuracy to a more practical level. This project aims to review, implement and evaluate existing retrosynthesis methods and their potential applications.

XAI in the Prediction of COVID-19 Clinical Outcome
machine learning neural network deep learning ai computer vision explainability
Chengjia Wang
Edinburgh
The SARS-CoV-2 pandemic had caused more than 1.6 million deaths worldwide by the end of 2020 and has overwhelmed health care resources in most countries. Medical imaging, especially chest CT and X-ray techniques, has played a critical role in the diagnosis and treatment planning of COVID-19. In the past two years, a large number of deep learning methods have been proposed to: 1. assist the analysis and post-processing of chest imaging data; 2. predict possible clinical outcomes and the development of the disease; 3. predict the spreading speed and pandemic status in human society; etc. This project will develop deep learning methods that can directly benefit the efficiency and accuracy of clinical analysis for COVID-19 using the available public challenge dataset (the STOIC2021 competition: https://stoic2021.grand-challenge.org/stoic2021/), then focus on the analysis and assessment of the explainability of current mainstream models (ResNet variations, ConvNeXt, Transformer, gMLP, GNN, etc.). Purposes and milestones: Specifically, this project will develop DL models to: 1. predict COVID-19 positivity; 2. predict the occurrence of severe COVID-19 cases, defined as intubation or death within one month from the acquisition of the CT scan (metric: AUC). Milestones of this project: 1. apply a simple ConvNeXt model to the STOIC2021 data and produce a result (successfully submit to the competition); 2. review, collect, implement and compare SOTA deep learning models on STOIC2021 (some extra data may be used); 3. review and implement different ways to assess the explainability of different models; 4. assess the explainability of the different models.
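Since AUC is the metric named for the severity-prediction task, a small self-contained reference implementation (the rank-based Mann-Whitney formulation) is useful for sanity-checking model outputs before submission; in practice one would use a library routine such as scikit-learn's `roc_auc_score`:

```python
# AUC as the probability that a randomly chosen positive case is scored
# above a randomly chosen negative case (ties count as half a win).

def auc(labels, scores):
    """labels: 0/1 ground truth; scores: predicted probabilities of severity."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A perfect ranking scores 1.0, a reversed ranking 0.0, and an uninformative constant score 0.5.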

AIGC fashion design based on state-of-the-arts generative models and/or image-to-3D algorithm with GUI design
ai aigc stable diffusion machine learning generative model creative fashion design
Chengjia Wang
Edinburgh
Tasks distributed to 4 students: 1. Dataset collection, diffusion model finetuning, model evaluation. 2. Inject guidance information into a 2D diffusion model and compare the guided model to unguided ones; simple 2D GUI design by modifying the SD WebUI. 3. Implement and explore algorithms for image-to-3D rendering; model evaluation; simple 3D GUI design by modifying a 3D WebUI. 4. AIGC model comparison, workflow optimization, evaluation of diffusion models.

Finetune large generative models using domain-specific data
ai aigc stable diffusion machine learning generative model creative fashion design
Chengjia Wang
Edinburgh
Modifying the stable diffusion model with its webui for 2D creative design: injecting guidance information, collecting simple datasets, and evaluating the models.

Zero-shot or few-shot learning for single image to 3D object generation
ai aigc stable diffusion machine learning generative model creative design
Chengjia Wang
Edinburgh
Review existing 2D-to-3D image generation and rendering methods and integrate them into the existing stable-diffusion-webui.

Finetuning of large vision generative model using domain specific data
ai aigc stable diffusion machine learning generative model creative design
Chengjia Wang
Edinburgh
1. data collection (simple) 2. diffusion model finetuning 3. design the optimal workflow for specific applications 4. evaluate the model

Image generation using stable diffusion and natural language prompts using CLIP
ai aigc stable diffusion machine learning generative model
Chengjia Wang
Edinburgh

PCN: predictive coding network
ai machine learning network architecture vision nlp deep learning
Chengjia Wang
Edinburgh
Predictive coding networks (PCNs) are not as well known as CNN, transformer, diffusion and Mamba models, yet they offer distinctive advantages in modern computing systems. The aim of this work is to implement a PCN to solve one simple vision or NLP task (or to process other types of serial data), and to further discover possible approaches to improve its performance and robustness. You need prior knowledge of deep learning models, such as CNNs, MLPs, transformers, and attention, to conduct this research.
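The core predictive-coding idea can be sketched in its simplest scalar form (a single latent, identity-like generative weight, and a fixed step size, all chosen for illustration; real PCNs stack such layers and also learn the weights):

```python
# Sketch of predictive-coding inference: a latent estimate is refined
# iteratively so that its top-down prediction matches the observation;
# the prediction *error* is the signal that drives every update.

def pcn_infer(observation, weight, steps=50, lr=0.2):
    """Infer a scalar latent x so that weight * x predicts the observation."""
    x = 0.0
    for _ in range(steps):
        error = observation - weight * x   # prediction error at the sensor
        x += lr * weight * error           # gradient step on squared error
    return x

x_hat = pcn_infer(observation=2.0, weight=1.0)
```

Unlike backpropagation, all updates here are local to the error unit, which is the property usually cited as the PCN's advantage.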

Mamba in Vision
ai machine learning network architecture vision nlp deep learning mamba
Chengjia Wang
Edinburgh
Mamba is a new deep learning model originally designed for processing sequence data, but it soon gained popularity in vision tasks. In this work the student will review the newest published works and implement extended Mamba models to solve image recognition problems.

Vision GNN
ai machine learning network architecture vision nlp deep learning graphical neural network
Chengjia Wang
Edinburgh

Deep learning model: KAN
Chengjia Wang
Edinburgh

Blockchain and smart contract with chainlink
Chengjia Wang
Edinburgh

Comfy UI vs Webui Stable diffusion development
Chengjia Wang
Edinburgh

Best Fashion Design workflow with open source AIGC
Chengjia Wang
Edinburgh
Collect data, design workflow, finetuning stable diffusion (with Lora), and make it work with a clean GUI

Incremental and scalable machine learning systems
Chengjia Wang
Edinburgh
A swarm of small AI models is combined into a LEGO-like model that can beat a large AI model on complicated computer vision or NLP tasks, while being more robust and flexible than conventional deep networks.

Reinforcement learning or inverse reinforcement learning for legged robots in Isaac Gym
Chengjia Wang
Edinburgh

Toward Reliable Drug-Target Interaction Predictions in Out-of-Distribution Data Scenarios
Chengjia Wang
Edinburgh
Given the increasing complexity of drug-target interaction (DTI) predictions and the challenges posed by out-of-distribution (OOD) data, this project will address this issue.

On going Grandchallenge or Kaggle competition on AI
Chengjia Wang
Edinburgh
We organise student teams every year to attend new AI and data science competitions on Grandchallenge, Kaggle, Codalab, and Synapse.

An Eclipse front end for the Skalpel type error explainer
Joe Wells
Edinburgh
Skalpel helps explain type errors in computer programs. The project would build an additional front-end (user interface) for Skalpel by making it usable from within Eclipse. Because support for SML in Eclipse is probably not the greatest, this project could reasonably include general work improving this support.

assemble and test the Isabelle/Isar proof language definition
Joe Wells
Edinburgh
Isabelle/Isar is a widely used proof assistant and proving environment for formal verification. There is no single place where a complete and up-to-date definition of the Isabelle/Isar input language can be found. Some of the pieces are in research publications, some pieces are in PhD dissertations, some pieces are in software documentation, and some pieces are in the Isabelle/Isar source code. And only the source code is certain to be up-to-date. This project is about gathering the pieces, assembling them, and writing some tests to confirm that the definition that the project synthesizes is correct.

automatically gather samples of certain mathematical notations
Joe Wells
Edinburgh
The supervisor of this project is trying to develop general theories of mathematical texts. As part of this, it is necessary to see what computer scientists, logicians, mathematicians, etc., actually write. Search engines are great for finding documents by the words or phrases they use. However, they are not much use for searching for instances of BNF-like notation (M, N ::= x | lam x. M | M N), or set comprehensions ({ x | exists y. (x,y) in S }), or ellipses (x = (y1, ..., yn)), or other mathematical notations. This project is about developing programs to process documents to gather samples of the various forms of these and similar notations.

Generic project contributing to the Skalpel type error explainer
Joe Wells
Edinburgh
Skalpel helps explain type errors in computer programs. Although there are a number of specific projects listed for Skalpel, there are lots and lots of other possible projects, far too many to write a project proposal for each one. Just ask.

implement constraint solving for type inference for System Fs
Joe Wells
Edinburgh
System F is a type system that is embedded as part of the essential core in type systems used by many programming languages and proof systems. The key idea of System F is the forall-quantified type, e.g., the type (forall x. x to x) stands for the collection of all types of the shape (Z to Z) for any Z. System Fs extends System F with _expansion variables_ to enable a particular way of using constraint solving for finding types for programs and proof skeletons with incomplete type information. This project is about implementing the key features of System Fs and exploring possible constraint solving algorithms.
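System Fs's expansion variables are beyond a short example, but the constraint-solving core they build on, first-order unification over simple types, can be sketched as follows. This is a minimal illustration only: the type representation is an arbitrary choice, and the occurs check is omitted for brevity.

```python
# Types: type variables are strings, base types are 1-tuples like ("int",),
# and arrow types are ("->", domain, codomain) tuples.

def resolve(t, subst):
    """Follow substitution links for a type variable."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    """Return a substitution unifying t1 and t2, or None on clash.
    (No occurs check, so cyclic constraints are not detected.)"""
    subst = dict(subst or {})
    stack = [(t1, t2)]
    while stack:
        a, b = stack.pop()
        a, b = resolve(a, subst), resolve(b, subst)
        if a == b:
            continue
        if isinstance(a, str):          # unbound type variable
            subst[a] = b
        elif isinstance(b, str):
            subst[b] = a
        elif a[0] == "->" and b[0] == "->":
            stack.append((a[1], b[1]))  # unify domains
            stack.append((a[2], b[2]))  # unify codomains
        else:
            return None                 # e.g. ("int",) vs ("bool",)
    return subst

# Applying (f : a -> a) to an int forces a = int:
s = unify(("->", "a", "a"), ("->", ("int",), "b"))
print(resolve("a", s), resolve("b", s))
```

An implementation of System Fs would layer forall types and expansion variables on top of machinery of this shape.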

investigate a lambda-calculus-like machine/assembly language
Joe Wells
Edinburgh
The lambda-calculus is the standard theory for reasoning about computer programs. Machine language is what available CPUs actually run. This project involves investigating a machine-language-like formalism with the equational reasoning power of the lambda calculus. Useful tasks that might be part of the project include implementing the language and testing or verifying its properties.
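As a baseline for such an investigation, the equational heart of the lambda calculus, beta reduction, can be sketched in a few lines. This is a minimal call-by-name evaluator using de Bruijn indices; the term representation is an illustrative choice, not part of the project specification.

```python
# Terms: an int n is a variable (de Bruijn index), ("lam", body) is an
# abstraction, and ("app", f, a) is an application.

def shift(t, d, cutoff=0):
    """Shift free variable indices in t by d."""
    if isinstance(t, int):
        return t + d if t >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Substitute s for variable j in t."""
    if isinstance(t, int):
        return s if t == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def eval_cbn(t):
    """Reduce to weak head normal form, call-by-name."""
    while isinstance(t, tuple) and t[0] == "app":
        f = eval_cbn(t[1])
        if isinstance(f, tuple) and f[0] == "lam":
            t = shift(subst(f[1], 0, shift(t[2], 1)), -1)  # beta step
        else:
            return ("app", f, t[2])
    return t

# (lam x. x) applied to itself reduces to (lam x. x):
ident = ("lam", 0)
print(eval_cbn(("app", ident, ident)))
```

A machine-language-like formalism with the same equational power would need to justify each of its transition rules against reductions like these.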

Making the Skalpel type error explainer more robust
Joe Wells
Edinburgh
Skalpel helps explain type errors in computer programs. The project would improve testing, find bugs, and improve robustness. Much earlier work has been on theoretical challenges, with less time spent on niceties like, for example, good error messages and test coverage. The project might also finish moving the web site to GitHub.

parsing/semantics for mixed English/symbolic mathematics
Joe Wells
Edinburgh
Parse and/or give formal semantics to mathematical uses of English combined with symbolic formulas. The starting point is to read The Language of Mathematics by Mohan Ganesalingam and then survey the research published in the decade since the book appeared. A specific part of the problem must then be chosen as the goal, and implemented, tested, and evaluated.

Reimplementing the user interface of a type error explainer
Joe Wells
Edinburgh
Skalpel helps explain type errors in computer programs. Programmers of the front-end (user interface) have included me, 2 PhD students, and 5 project students, with the result being code that is fragile and hard to modify. A good project would be to rewrite it with proper care for data structure sanity, error checking, error messages, testing, etc.

Toward type error explanations for the Hume language
Joe Wells
Edinburgh
Hume (http://hume-lang.org/) is a language using ideas from both functional programming and finite automata together with strong types to obtain guarantees on time and space usage for safety-critical systems. The project would begin the process of extending the type error explainer Skalpel so it can find the portion of a Hume program responsible for a type error. Most likely only part of Hume will be handled. This would also begin the process of extending Skalpel to analyze multiple languages.

Visualizing type errors with graphical type/data-flow diagrams
Joe Wells
Edinburgh
Skalpel helps explain type errors in computer programs. The project would extend Skalpel's back-end to generate graphical type/data-flow diagrams that will show how the program parts causing a type error are connected. Tom Methven (RA for Mike Chantler) can help a bit. Tom and Mike recommend the D3JS library (http://d3js.org/).

open source implementation of PDF reader extensions for data capture
programming open source graphical user interfaces document standards forms data capture
Joe Wells
Edinburgh
PDF is now much more than a system for arranging ink marks on paper. It includes many new features, ranging from 3D visualizations that can be manipulated to dynamic adaptation to changes in media size and shape. One particularly important feature is fancy form filling with programmable checking of entered data. It is particularly important for there to be an open source implementation of these features, because they are often used for mandatory government reporting, and it is not good for this kind of functionality to be controlled by a private company. This project aims to assess which parts of the standards Adobe has put forward in this area are most important to implement as open source, and then to carry out, test, and deliver specific improvements to specific pieces of open source software.

Distributed Ledger-based Identity Management System
distributed ledger blockchain identity management
Timothy Yap
Malaysia
Distributed ledgers promote decentralisation and immutability of stored data. Taking this further, the ledger can also be used to empower individuals to have control over their identities and to provide credentials in a minimised-trust environment. This project requires a distributed ledger or blockchain-based identity management system to be set up, and its capability to manage and provision credentials to be demonstrated.

Customer Feedback Analysis using Aspect-Based Sentiment Analysis
machine learning sentiment analysis deep learning transformer
Timothy Yap
Malaysia
Aspect-based sentiment analysis (ABSA) is a more detailed approach to sentiment analysis that goes beyond determining the overall sentiment of a piece of text. It focuses on identifying specific aspects or features of a product, service, or entity mentioned in the text and determining the sentiment expressed towards each of these aspects. This project aims to develop and evaluate a refined ABSA system to analyze customer feedback for a specific product or service domain. The goal is to provide detailed insights into customer opinions on various aspects of the product/service, enabling businesses to understand and address specific customer concerns more effectively.
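The project itself would use transformer models, but the structure of the ABSA task can be illustrated with a toy lexicon-based sketch. The aspect terms and sentiment words below are illustrative assumptions, not a real lexicon, and the window heuristic is deliberately naive.

```python
# Toy illustration of the ABSA task: map each mentioned aspect term to a
# polarity, based on sentiment words near it. Real systems would use
# transformer models fine-tuned for aspect extraction and classification.

ASPECTS = {"battery", "screen", "price", "delivery"}
POSITIVE = {"great", "excellent", "fast", "good"}
NEGATIVE = {"poor", "terrible", "slow", "bad"}

def absa(review):
    """Return {aspect: polarity} for aspects mentioned in the review."""
    tokens = review.lower().replace(",", " ").replace(".", " ").split()
    results = {}
    for i, tok in enumerate(tokens):
        if tok in ASPECTS:
            # Inspect a small window of words around the aspect term.
            window = tokens[max(0, i - 2): i + 3]
            if any(w in POSITIVE for w in window):
                results[tok] = "positive"
            elif any(w in NEGATIVE for w in window):
                results[tok] = "negative"
            else:
                results[tok] = "neutral"
    return results

print(absa("Great battery, but the screen is terrible."))
```

The example output, one polarity per aspect rather than one per review, is exactly the finer granularity that distinguishes ABSA from document-level sentiment analysis.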

Investigating the Network Characteristics of Blockchain Networks
data analytics network analysis blockchain bitcoin
Timothy Yap
Malaysia
This project aims to study the network characteristics of blockchain networks, with a specific focus on the Bitcoin network. The research will cover aspects such as data propagation, peer-to-peer (P2P) communication, and node churn. The project will involve both analytical and statistical methods, along with practical experiments such as setting up nodes or simulations.

Empowering Scientific Research with Graph Neural Networks and Real-World Applications
machine learning deep learning graph neural networks
Yingfang Yuan
Edinburgh
This project centers on Graph Neural Networks (GNNs) and their applications across a range of disciplines. While the primary focus is on developing and implementing GNNs, specific research problems are flexible and can be tailored to each student’s interests and academic background. Students have the opportunity to explore diverse applications of GNNs, from uncovering complex patterns in financial data to enhancing understanding in biological networks, social sciences, and beyond. If you have any questions or ideas, please feel free to reach out via email.

Visualising Data of Lift Usage at the HW Dubai campus
Hind Zantout
Dubai
Working with the relevant persons at HW facilities management, create a (possibly real-time) application visualising lift usage.

Coursework Submission Deadline Visualiser System
Hind Zantout
Dubai
This is a software engineering project where requirements need to be collected from both staff and students. It may be split into two separate projects. From the student perspective, it should help students meet submission deadlines. From the staff perspective, it should visualise the impact of changing one deadline. Usability is also important.

Finding schools in Dubai
Hind Zantout
Dubai
KHDA has open data with details of schools in Dubai. This project can help parents decide which school to send their children to. The deliverable is a 'website' that parents can use. Computer Systems students can focus on the development aspect of such a project; Computer Science students can include the topic of visualisation. Both cohorts can explore the inclusion of analytics.

Generic Topic
Hind Zantout
Dubai
This is a project for building a website or an app with functionality to be determined based on the context.

Isolation beating App
Hind Zantout
Dubai
There is a wide offering of social media. This project will investigate the strengths and weaknesses of each platform, look into which features are best suited to overcoming isolation, and then develop an app that incorporates the important features. A colleague from Psychology can be consulted.

Library Champion
Hind Zantout
Dubai
The Library is full of valuable references which can help in the various courses. Starting with the course descriptor of a course, identify the available references; whenever such a reference is consulted, it gets rated. Think: "Tripadvisor for books".

Monitoring Online reading
Hind Zantout
Dubai
In this project you will, at random, replace very short words such as in, the, of, at... with a number of blanks equal to the number of letters in the word. As the user enters these correctly, proving that the user is reading the text, the number of blanked words can be reduced. The text could be the honours project student handbook. There is scope to add additional features such as gamification or visualisation.
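The blanking mechanism described above can be sketched as follows. The word list and function names are illustrative assumptions; a real app would also handle the checking of user answers and the adaptive reduction of the blanking rate.

```python
import random

# Short function words eligible for blanking (illustrative list).
SHORT_WORDS = {"in", "the", "of", "at", "on", "to", "a", "an"}

def blank_text(text, rate, rng=None):
    """Replace each short word with one underscore per letter,
    with probability `rate`. Returns (blanked text, hidden words)."""
    rng = rng or random.Random()
    out = []
    answers = []  # the hidden words, in order, for checking user input
    for word in text.split():
        core = word.strip(".,;:!?").lower()
        if core in SHORT_WORDS and rng.random() < rate:
            out.append("_" * len(core) + word[len(core):])
            answers.append(core)
        else:
            out.append(word)
    return " ".join(out), answers

blanked, answers = blank_text("The cat sat on the mat.", rate=1.0)
print(blanked)   # ___ cat sat __ ___ mat.
print(answers)   # ['the', 'on', 'the']
```

Lowering `rate` as the user answers correctly gives the adaptive behaviour described in the proposal.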

Student Proposed Open Project
Hind Zantout
Dubai
Student-proposed project

Smart Anything
Hind Zantout
Dubai
A range of applications, such as: leveraging the availability of devices such as smart meters to reduce consumption of electricity in the home; leveraging the availability of open data to create smart communities; and many more.

Visualising/Analysing Health Data
Hind Zantout
Dubai
This project can look into analysing health-related data. For visualisation it could include the development of an animation to track the progress of e.g. Covid in one country, or it could visualise a region and map one country within that region, or compare countries with similar size populations or similar climates or geographic locations.

Visualising Speech
Hind Zantout
Dubai
Research what keeps an audience captivated and feed back to a public speaker via an app.

Tracking student progress
Hind Zantout
Dubai

A project around digital art
Hind Zantout
Dubai
Here are two links to explore: https://journals.ub.uni-heidelberg.de/index.php/dah/article/view/21631 and https://towardsdatascience.com/tagged/digital-art. The Psychology department has recently acquired a mobile eye tracker with the capability to be incorporated with a VR headset. They are looking to use it for some eye-movement research related to digital/immersive art (Dr. Pik Ki Ho, https://www.hw.ac.uk/dubai/profiles/teaching/dr-pik-ki-ho.htm, will be co-supervisor).

Graph Database application
Hind Zantout
Dubai

Student Attendance
Hind Zantout
Dubai
Students in a class are given 'green' numbers in the range 2-163. At the end of the session, the lecturer notes the numbers of those present on a sheet of paper, which is then scanned and the numbers added to an Excel sheet (or any other suitable file), thus forming an attendance record. The students' names and H numbers should also be considered.

Image Classification Distinguishing Real vs. Fake Images
machine learning ai
Claudio Zito
Dubai

Reinforcement Learning for Autonomous Agents
Claudio Zito
Dubai
We will use OpenAI Gym as the platform for training and evaluation.
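Gym supplies the environments; as a minimal stand-in, the agent/environment loop and the Q-learning update can be sketched on a hand-rolled toy environment. Everything below is illustrative: the corridor task, the hyperparameters, and the function names are assumptions, not project requirements.

```python
import random

# Tabular Q-learning on a five-state corridor. Gym's reset()/step() API
# has the same shape as the step() function here. Reaching the rightmost
# state pays reward 1 and ends the episode.

N_STATES = 5
ACTIONS = (-1, +1)  # move left / move right

def step(state, action):
    """One environment transition, mirroring Gym's step() signature."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def greedy(q, s, rng):
    """Pick a best action for state s, breaking ties at random."""
    best = max(q[(s, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(s, a)] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            a = rng.choice(ACTIONS) if rng.random() < eps else greedy(q, s, rng)
            s2, r, done = step(s, a)
            # Q-learning update toward r + gamma * max_a' Q(s', a').
            target = r + gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = train()
print([greedy(q, s, random.Random(1)) for s in range(N_STATES - 1)])
```

Swapping the toy `step()` for a Gym environment such as FrozenLake, and the dictionary for a learned function approximator, gives the shape of the project's experiments.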

ML approach to biomarkers detection and discovery
Claudio Zito
Dubai
Exploratory data analysis on published datasets containing gene expression data for blood cancer patients, and development of ML models to identify possible biomarkers for earlier detection of the disease.