Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini and maintained on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective

Created by
  • Haebom

Authors

Weijie Xu, Yiwen Wang, Chi Xue, Xiangkun Hu, Xi Fang, Guimin Dong, Chandan K. Reddy

Outline

Large language models (LLMs) often generate responses with inherent biases, compromising their reliability in real-world applications. Existing evaluation methods tend to overlook both the biases present in long-form responses and the intrinsic variability of LLM outputs. To address these challenges, this paper proposes Fine-Grained Semantic Comparison (FiSCo), a novel statistical framework for assessing group-level fairness in LLMs by detecting subtle semantic differences in long-form responses across demographic groups. Unlike prior work focused on sentiment or token-level comparisons, FiSCo analyzes responses at the semantic level, leveraging entailment checks to assess semantic consistency. It decomposes model outputs into semantically distinct claims and applies statistical hypothesis testing to compare between-group and within-group similarities, enabling robust detection of subtle biases. The authors formalize a novel definition of group counterfactual fairness and validate FiSCo on synthetic and human-annotated datasets spanning gender, race, and age. Experimental results show that FiSCo identifies subtle biases more reliably than existing evaluation metrics while mitigating the impact of stochastic LLM variability.
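
To make the evaluation procedure concrete, below is a minimal sketch of a FiSCo-style group comparison. It is an illustration under stated assumptions, not the authors' implementation: the `entails` helper is a hypothetical stand-in for an off-the-shelf NLI entailment model, responses are assumed to be pre-decomposed into claim lists, and a simple permutation test stands in for the paper's hypothesis-testing procedure.

```python
# Minimal sketch of a FiSCo-style group fairness test (illustrative, not the
# authors' implementation). Assumes each LLM response has already been
# decomposed into a list of atomic claims, and that `entails` wraps an
# off-the-shelf NLI entailment model (hypothetical placeholder below).
import itertools
import random


def entails(premise: str, claim: str) -> bool:
    """Hypothetical helper: return True if `premise` entails `claim`.
    Replace with a real NLI entailment classifier."""
    raise NotImplementedError


def similarity(claims_a: list[str], claims_b: list[str]) -> float:
    """Symmetric semantic similarity between two responses: the fraction of
    each response's claims entailed by the other response, averaged."""
    def coverage(src: list[str], tgt: list[str]) -> float:
        premise = " ".join(tgt)
        return sum(entails(premise, c) for c in src) / max(len(src), 1)
    return 0.5 * (coverage(claims_a, claims_b) + coverage(claims_b, claims_a))


def group_gap(group_x: list[list[str]], group_y: list[list[str]]) -> float:
    """Mean within-group similarity minus mean between-group similarity.
    A large positive gap suggests responses differ systematically by group."""
    within = [similarity(a, b)
              for group in (group_x, group_y)
              for a, b in itertools.combinations(group, 2)]
    between = [similarity(a, b) for a in group_x for b in group_y]
    return sum(within) / len(within) - sum(between) / len(between)


def permutation_test(group_x, group_y, n_perm: int = 1000, seed: int = 0) -> float:
    """p-value for the null hypothesis that group labels are exchangeable,
    i.e., that there is no group-level semantic difference."""
    rng = random.Random(seed)
    observed = group_gap(group_x, group_y)
    pooled = list(group_x) + list(group_y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if group_gap(pooled[:len(group_x)], pooled[len(group_x):]) >= observed:
            hits += 1
    return hits / n_perm
```

In practice the pairwise similarity matrix would be computed once and the permutations applied to group labels only, since each entailment check is an expensive model call; the version above recomputes similarities per permutation purely for brevity.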

Takeaways, Limitations

Takeaways:
  • Proposes FiSCo, a new statistical framework for assessing group-level fairness in LLMs.
  • Addresses bias in long-form responses and variability in LLM output, both limitations of existing methods.
  • Detects subtle biases and enables robust evaluation through semantic-level analysis.
  • Formalizes a new definition of group counterfactual fairness.
  • Validates the framework experimentally across demographic groups including gender, race, and age.
  • Demonstrates superior performance compared to existing evaluation metrics.
Limitations:
  • FiSCo's performance may depend on the dataset used and the quality of the annotations.
  • Not all types of bias may be detectable.
  • Computational cost may be high.
  • Further research is needed on the generalizability of the new group counterfactual fairness definition.