View Proposal
-
Proposer
-
Simona Frenda
-
Title
-
Catching cultural biases in LLMs
-
Goal
-
Design a framework for calibrating the biases of LLMs using statistical techniques such as conformal prediction.
-
Description
-
Conformal prediction is a statistical framework for quantifying the reliability of model predictions by assessing how well each prediction conforms to a set of reference labels (e.g., labels assigned by a group of annotators who belong to the same culture) [1].
Prior work has shown that data collected and annotated by specific segments of the population carries substantial biases, which leads to non-neutral models [2] and reinforces social stereotypes [3].
To investigate the societal biases of LLMs, we can rely on conformal prediction to evaluate the uncertainty of models against the decisions of specific groups of annotators with similar backgrounds, cultures, or beliefs. We will use multiple annotated datasets addressing highly subjective phenomena such as the detection of misinformation, irony, and hate speech.
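As an illustration of the idea above, the following is a minimal sketch of split conformal prediction calibrated separately against each annotator group. It is not the project's prescribed method. It assumes that the model outputs class probabilities and that each group's annotations have been aggregated into one label per item (for example by majority vote). All function names, group names, and the synthetic data are hypothetical.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold for one annotator group.

    Nonconformity score: 1 - probability the model gives to the group's label.
    Returns the k-th smallest score, with k = ceil((n + 1) * (1 - alpha)).
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration items for this alpha: keep every label.
        return 1.0
    return float(np.sort(scores)[k - 1])

def prediction_sets(test_probs, qhat):
    """Boolean matrix: entry [i, c] is True if label c is in the set for item i."""
    return (1.0 - test_probs) <= qhat

def group_report(cal_probs, labels_by_group, test_probs, alpha=0.1):
    """Calibrate against each group's labels and summarise the resulting sets."""
    report = {}
    for group, labels in labels_by_group.items():
        qhat = conformal_threshold(cal_probs, labels, alpha)
        sets = prediction_sets(test_probs, qhat)
        report[group] = {
            "qhat": qhat,
            "avg_set_size": float(sets.sum(axis=1).mean()),
        }
    return report

# Synthetic stand-in data (assumption): 3 classes, 200 calibration items.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(3), size=200)
test_probs = rng.dirichlet(np.ones(3), size=50)
labels_by_group = {
    # Hypothetical group whose labels always match the model's top choice.
    "group_a": cal_probs.argmax(axis=1),
    # Hypothetical group whose labels are unrelated to the model's output.
    "group_b": rng.integers(0, 3, size=200),
}
report = group_report(cal_probs, labels_by_group, test_probs, alpha=0.1)
for group, stats in report.items():
    print(group, stats)
```

Under these assumptions, a group whose labels the model tracks less closely receives a larger threshold and therefore larger prediction sets. That difference in uncertainty across groups is one possible signal of the cultural bias the project aims to measure.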
-
Resources
-
[1] Angelopoulos, A. N., & Bates, S. (2023). Conformal prediction: A gentle introduction. Foundations and Trends® in Machine Learning, 16(4), 494-591.
[2] Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023, July). Whose opinions do language models reflect? In International Conference on Machine Learning (pp. 29971-30004). PMLR.
[3] Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
-
Background
-
https://www.nowpublishers.com/article/Details/MAL-101
-
Url
-
-
Difficulty Level
-
Challenging
-
Ethical Approval
-
None
-
Number Of Students
-
0
-
Supervisor
-
Simona Frenda
-
Keywords
-
bias detection, calibration of models, multiple annotated corpora
-
Degrees
-
Master of Science in Artificial Intelligence
Master of Science in Artificial Intelligence with SMI
Master of Science in Data Science
Bachelor of Science in Statistical Data Science