Proposer
-
Daniel Hernandez Garcia
-
Title
-
Multi-party Social Robot Interactions with LLMs
-
Goal
-
Verify the extent to which multi-party behaviours that we would normally adopt in human-human interactions can be performed effectively with an LLM integrated into a Social Robot.
-
Description
-
The ability of Socially Assistive Robots (SARs) to handle dialogue with multiple people at the same time is critical to their adoption in public spaces. Tasks that are typically trivial in one-to-one interactions become considerably more complex when multiple users are involved [1, 2].
Taking part in social interactions with multiple participants, i.e. more than two, constitutes a challenging task for an autonomous system to manage. In these situations, the system must interpret social cues from multiple people at the same time while also employing appropriate social strategies for addressing different users and regulating the interaction.
Building multi-party conversation systems presents challenges that do not exist in dyadic conversations, since the structure of the dialogue context is more complicated and the generated responses rely heavily on both interlocutors (i.e., speaker and addressee) and the history of the conversation [3]. For multi-party human-robot interactions, turn-taking and the recognition of speakers and addressees remain open challenges [4]. The work of Skantze [5] provides an overview of research on modelling turn-taking, including end-of-turn detection, handling of user interruptions, and generation of turn-taking cues with voice assistants and social robots.
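As a rough illustration of this structure, the Python sketch below represents a multi-party history as a list of turns labelled with speaker and addressee and serialises it into an LLM prompt; the Turn class, field names and prompt format are illustrative assumptions rather than a design commitment.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Turn:
    speaker: str               # who produced the utterance
    addressee: Optional[str]   # who it was directed at; None = the whole group
    text: str

def build_prompt(history: List[Turn], robot_name: str = "Robot") -> str:
    # Serialise the who-said-what-to-whom history so the LLM can condition its
    # next response on speakers and addressees, not just the utterance sequence.
    lines = ["You are a social robot in a multi-party conversation.",
             "Decide who to address and reply accordingly.", ""]
    for turn in history:
        target = turn.addressee or "everyone"
        lines.append(f"{turn.speaker} (to {target}): {turn.text}")
    lines.append(f"{robot_name} (to ?):")
    return "\n".join(lines)

history = [
    Turn("Alice", "Robot", "Can you tell us where the cafe is?"),
    Turn("Bob", "Alice", "I think it's on the second floor."),
]
print(build_prompt(history))

Conditioning on the addressee labels is what would let the model distinguish, for example, a question directed at the robot from a side exchange between two users.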
The use of LLMs also holds significant promise for improving HRI [6]. The main goal of our work is the development of a multi-party conversational system that allows situated social interactions involving a robot and multiple users. To do so, we will connect the language understanding capabilities of an LLM with a robot's multi-modal perception (audio and visual) and action generation capabilities. The project will evaluate the performance of the multi-party system in a user study following the experimental methodology designed in [7].
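The self-contained sketch below illustrates only the intended data flow from perception, through the LLM, to addressee-aware action; every function in it (speech and addressee perception, the LLM call, and the robot's gaze and speech actions) is a hypothetical stub standing in for whatever ASR, vision, LLM and robot APIs the project eventually adopts.

def perceive_speech():
    # stub for ASR + speaker diarisation (assumed components, not a real API)
    return "Where is the coffee machine?", "Alice"

def perceive_addressee(speaker_id):
    # stub for gaze / head-pose based addressee detection (assumed)
    return "robot"

def call_llm(history):
    # stub for an LLM call returning a reply and the user it should address
    last = history[-1]
    return f"Let me show you, {last['speaker']}.", last["speaker"]

def robot_gaze_at(user):
    print(f"[gaze] turning towards {user}")    # stub for a gaze action

def robot_say(text, to):
    print(f"[speech to {to}] {text}")          # stub for a speech action

def interaction_step(history):
    utterance, speaker = perceive_speech()
    addressee = perceive_addressee(speaker)
    history.append({"speaker": speaker, "addressee": addressee, "text": utterance})

    reply, target = call_llm(history)  # LLM decides both what to say and to whom
    robot_gaze_at(target)              # non-verbal cue signalling the addressee
    robot_say(reply, to=target)        # spoken response
    history.append({"speaker": "robot", "addressee": target, "text": reply})
    return history

interaction_step([])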
[1] D. Traum, “Issues in multiparty dialogues,” in Advances in Agent Communication: International Workshop on Agent Communication Languages, ACL 2003, Melbourne, Australia, July 14, 2003.
[2] “WHO Says WHAT to WHOM: A Survey of Multi-Party Conversations,” in Proceedings of IJCAI 2022. https://www.ijcai.org/proceedings/2022/768
[3] J.-C. Gu et al., “HeterMPC: A Heterogeneous Graph Neural Network for Response Generation in Multi-Party Conversations,” https://arxiv.org/abs/2203.08500
[4] K. Mahajan and S. Shaikh, “On the need for thoughtful data collection for multi-party dialogue: A survey of available corpora and collection methods,” in Proceedings of SIGDIAL 2021. https://aclanthology.org/2021.sigdial-1.36
[5] G. Skantze, “Turn-taking in Conversational Systems and Human-Robot Interaction: A Review.” https://www.sciencedirect.com/science/article/pii/S088523082030111X
[6] “Understanding Large-Language Model (LLM)-Powered Human-Robot Interaction,” in ACM/IEEE International Conference on Human-Robot Interaction (HRI ’24). https://doi.org/10.1145/3610977.3634966
[7] N. Gunson, A. Addlesee, D. Hernandez Garcia, M. Romeo, C. Dondrup, and O. Lemon, “A holistic evaluation methodology for multi-party spoken conversational agents,” in ACM International Conference on Intelligent Virtual Agents (IVA ’24). https://researchportal.hw.ac.uk/en/publications/a-holistic-evaluation-methodology-for-multi-party-spoken-conversa
-
Resources
-
-
Background
-
-
Url
-
-
Difficulty Level
-
Challenging
-
Ethical Approval
-
Full
-
Number Of Students
-
2
-
Supervisor
-
Daniel Hernandez Garcia
-
Keywords
-
-
Degrees
-