Daily Arxiv

This page organizes artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright in each paper belongs to its authors and their institutions; when sharing, please cite the source.

Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective

Created by
  • Haebom

Authors

Weijie Xu, Yiwen Wang, Chi Xue, Xiangkun Hu, Xi Fang, Guimin Dong, Chandan K. Reddy

Outline

Large language models (LLMs) often produce responses with inherent biases, which undermines their reliability in real-world applications. Existing evaluation methods tend to overlook both the biases in long-form responses and the inherent variability of LLM outputs. To address these challenges, this paper proposes FiSCo (Fine-grained Semantic Computation), a novel statistical framework that assesses group-level fairness in LLMs by detecting subtle semantic differences in long-form responses across demographic groups. Unlike previous studies that focus on sentiment or token-level comparisons, FiSCo operates on semantic units, leveraging entailment checks to assess the consistency of meaning across responses. It decomposes model outputs into semantically distinct claims and applies statistical hypothesis testing to compare between-group and within-group similarities, enabling robust detection of subtle biases. The authors formalize a novel definition of group counterfactual fairness and validate FiSCo on synthetic and human-annotated datasets spanning gender, race, and age. Experimental results show that FiSCo identifies subtle biases more reliably than various existing evaluation metrics while reducing the impact of stochastic LLM variability.
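The outline describes a pipeline: decompose each response into claims, score cross-response consistency with entailment checks, and run a hypothesis test comparing within-group and between-group similarities. Below is a minimal Python sketch of that idea; the function names (`claim_similarity`, `fisco_group_test`), the symmetric coverage score, and the use of Welch's t-test are our assumptions for illustration, not the paper's exact method.

```python
# Minimal sketch of a FiSCo-style group fairness test (assumptions noted above).
# `entails(premise, hypothesis)` stands in for a real NLI entailment model.
from itertools import combinations, product

from scipy import stats


def claim_similarity(resp_a: list[str], resp_b: list[str], entails) -> float:
    """Symmetric claim coverage between two responses, each given as a list
    of decomposed claims: the average fraction of one response's claims
    entailed by some claim of the other response."""
    cov_a = sum(any(entails(d, c) for d in resp_b) for c in resp_a) / len(resp_a)
    cov_b = sum(any(entails(c, d) for c in resp_a) for d in resp_b) / len(resp_b)
    return (cov_a + cov_b) / 2


def fisco_group_test(group_a: list[list[str]], group_b: list[list[str]], entails):
    """Compare within-group vs. between-group semantic similarity.

    Significantly lower between-group similarity (small p-value) is evidence
    of a group-level semantic difference, i.e. potential bias. Welch's t-test
    is used here as a stand-in test statistic.
    """
    within = [claim_similarity(x, y, entails)
              for group in (group_a, group_b)
              for x, y in combinations(group, 2)]
    between = [claim_similarity(x, y, entails)
               for x, y in product(group_a, group_b)]
    return stats.ttest_ind(within, between, equal_var=False)


if __name__ == "__main__":
    # Toy demo with a crude lexical stand-in for entailment; in practice an
    # off-the-shelf NLI model would score each (premise, hypothesis) pair.
    toy_entails = lambda premise, hypothesis: hypothesis in premise
    group_a = [["the applicant is qualified and experienced"],
               ["the applicant is qualified"]]
    group_b = [["the applicant may need close supervision"],
               ["the applicant is qualified"]]
    result = fisco_group_test(group_a, group_b, toy_entails)
    print(f"t = {result.statistic:.3f}, p = {result.pvalue:.3f}")
```

In practice the responses would come from counterfactual prompts that differ only in a demographic attribute, and the within/between comparison is what lets the test absorb the ordinary run-to-run variability of LLM outputs.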

Takeaways, Limitations

Takeaways:
FiSCo is a novel statistical framework for detecting subtle semantic biases in long-form LLM responses.
It evaluates the group-level fairness of LLMs more accurately and reliably than existing methods.
It uses entailment checking to perform semantic-level analysis that goes beyond surface-level comparison.
It introduces a new formal definition of group counterfactual fairness (a hedged sketch of one plausible reading follows this list).
Its effectiveness is validated on synthetic and human-annotated datasets spanning gender, race, and age.
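The summary does not state the formal definition of group counterfactual fairness; a hedged sketch of one plausible reading, in our own notation rather than the paper's:

$$
\mathbb{E}_{x \sim R(G_i),\, y \sim R(G_j)}\big[\mathrm{sim}(x, y)\big] \;\approx\; \mathbb{E}_{x,\, y \sim R(G_k)}\big[\mathrm{sim}(x, y)\big] \quad \text{for all groups } G_i, G_j, G_k,
$$

where R(G) denotes responses to counterfactual prompts that differ only in the demographic attribute of group G, and sim is the claim-level, entailment-based similarity described in the outline. Intuitively, responses across groups should be, in expectation, as semantically consistent as responses within a single group.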
Limitations:
FiSCo's performance may depend on the dataset used and the quality of its annotations.
Further research is needed to determine whether it can capture all types of bias.
The pairwise, claim-level entailment checks may be computationally expensive.
Further research is needed on generalizability across different LLM architectures and application domains.