Proposer
Ioannis Konstas
Title
NLP - Question Answering using Large Language Models
Goal
To develop a faithful question answering system with an LLM
Description
You have most likely already used ChatGPT a few times (or a lot!). Have you ever wondered what it actually takes to build a system based on a Large Language Model (LLM) and evaluate it on a real-world task? In this series of projects (check the rest as well!) we will explore the task of question answering (QA). QA involves automatically answering a user query, usually "grounded" in one or more passages (what we refer to as open-book QA), or relying only on the acquired knowledge of the model (closed-book QA). The tricky part (especially with the recent advances in LLMs) is to ensure that the output answer is faithful to the user query (and the input passage, if one exists) and does not hallucinate extra pieces of irrelevant or wrong information.

There are many interesting aspects of Question Answering that we could potentially explore:
- Choose between open-book and closed-book QA. The former involves a component for retrieving the relevant information, either via a search engine (e.g., a Google search query) or a retrieval engine. The choice usually depends on the breadth of the domain we focus on (generic knowledge QA vs. closed-domain questions based, for example, on a scrape of a single website).
- Experiment with different output styles: provide answers that are more or less verbose, or that exhibit particular linguistic or rhetorical phenomena (such as repeating part of the question, providing a definition first, giving an explanation, etc.).
- Explore more challenging queries that require some form of decomposition or multi-step reasoning, e.g., complex questions such as "How old was Linus Torvalds when Linux 1.0 was released?" or "What is the difference between jam and marmalade?".
- Experiment with multi-turn rather than single-shot questions, resembling a natural conversation.

Once we decide on the particular flavour of QA task we are interested in, we can explore several popular techniques for adapting an open-source LLM (e.g., Llama 2), starting from the simpler ones (prompt engineering) all the way up to Parameter-Efficient Fine-Tuning (PEFT); see the sketches below. We will use standard benchmark datasets and state-of-the-art frameworks to evaluate (and potentially train) our models. We can co-develop the project to place more emphasis on the features (faithfulness, output style, complex questions, conversational QA), training, data annotation, or human evaluation.
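To give a flavour of the prompt-engineering end of the spectrum, here is a minimal open-book QA sketch using Hugging Face transformers. The model name, the hard-coded passage, and the generation settings are assumptions for illustration only; in the project the passage would come from a search or retrieval engine, and the prompt would be tuned to the chosen style of output.

```python
# Minimal open-book QA sketch: ground the answer in a retrieved passage
# and instruct the model to stay faithful to it.
# Assumption: access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint;
# any instruction-tuned causal LM would work the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")  # needs `accelerate`

# In a real system this passage would come from a search or retrieval engine.
passage = (
    "Linux 1.0 was released on 14 March 1994. "
    "Linus Torvalds was born on 28 December 1969."
)
question = "How old was Linus Torvalds when Linux 1.0 was released?"

# Prompt engineering: constrain the model to the passage to reduce hallucination.
prompt = (
    "Answer the question using only the passage below. "
    "If the passage does not contain the answer, say so.\n\n"
    f"Passage: {passage}\n\nQuestion: {question}\nAnswer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```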
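At the other end of the spectrum, PEFT replaces full fine-tuning with small trainable adapters. Below is a minimal LoRA sketch using the Hugging Face peft library; the rank, scaling factor, and target modules are illustrative assumptions, not tuned values.

```python
# Sketch of Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters.
# Only the low-rank adapter weights are trained; the base model stays frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                  # rank of the low-rank update matrices (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections in Llama-style models
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

The wrapped model can then be trained on a QA dataset with the standard transformers Trainer, and the adapters saved separately with model.save_pretrained(...), which keeps checkpoints small enough for a single GPU.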
Resources
Large Language Models, GPU
Background
Machine Learning, NLP (desired: F29AI), software development, F21NL (MSc only), F21CA (MSc only)
Url
Difficulty Level
Variable
Ethical Approval
None
Number Of Students
4
Supervisor
Ioannis Konstas
Keywords
machine learning, neural networks, large language models, natural language processing (nlp)
Degrees
Bachelor of Science in Computer Science
Master of Science in Artificial Intelligence
Master of Science in Artificial Intelligence with SMI