Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

On Optimal Steering to Achieve Exact Fairness

Created by
  • Haebom

Authors

Mohit Sharma, Amit Jayant Deshpande, Chiranjib Bhattacharyya, Rajiv Ratn Shah

Outline

This paper addresses the "biased input, biased output" problem in fair machine learning by steering the feature distribution of the data, or the internal representations of large language models (LLMs), toward an ideal distribution that guarantees fair outcomes across groups. A distribution is defined as ideal if the minimizer of a cost-sensitive risk on it achieves exact group-fair outcomes (e.g., demographic parity, equal opportunity); in other words, it has no fairness-utility trade-off. The authors formulate an optimization program for optimal steering that finds the nearest ideal distribution in KL-divergence, and provide efficient algorithms for it when the underlying distributions come from well-known parametric families (e.g., normal, log-normal). Experiments on synthetic and real-world datasets show that optimal steering improves fairness without diminishing utility, and sometimes even improves utility. Affine steering of LLM representations reduces bias in multi-class classification (e.g., occupation prediction from short biographies in the Bios dataset). The authors also steer the internal representations of LLMs toward desired outputs so that the model performs equally well across different groups.
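As a rough illustration of the Gaussian case mentioned above, the sketch below computes the closed-form KL-divergence between two univariate normal distributions and the affine map that carries one normal onto another. It is not the paper's algorithm: the target distribution here is simply picked by hand (a hypothetical choice), whereas the paper solves an optimization program to find the nearest ideal distribution. The function names `kl_gaussian` and `affine_steer_params` and all numeric values are assumptions made for this example.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for univariate normals P = N(mu_p, sigma_p^2), Q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def affine_steer_params(mu_src, sigma_src, mu_tgt, sigma_tgt):
    """Return (a, b) such that x -> a * x + b maps N(mu_src, sigma_src^2)
    onto N(mu_tgt, sigma_tgt^2)."""
    a = sigma_tgt / sigma_src
    b = mu_tgt - a * mu_src
    return a, b


# Hypothetical group-conditional feature distributions (illustrative values only).
group_a = (0.0, 1.0)   # (mean, standard deviation)
group_b = (2.0, 1.5)

# Hand-picked common target; the paper instead searches for the nearest ideal distribution.
target = (1.0, 1.2)

for name, (mu, sigma) in (("A", group_a), ("B", group_b)):
    a, b = affine_steer_params(mu, sigma, *target)
    # KL is measured from the target to the original, one of two possible directions.
    print(f"group {name}: x -> {a:.3f} * x + {b:.3f}, "
          f"KL(target || original) = {kl_gaussian(*target, mu, sigma):.4f}")
```

After the affine map, both groups share the same feature distribution in this toy setting, and the KL terms indicate how far each group had to be moved.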

Takeaways, Limitations

Takeaways:
Presents an optimal steering technique based on KL-divergence and shows that fairness can be improved without a fairness-utility trade-off.
Demonstrates practicality by validating the technique on synthetic and real-world datasets.
Suggests a way to improve fairness across diverse groups by steering the internal representations of LLMs.
Limitations:
Efficient algorithms are provided only when the underlying distribution belongs to a well-known parametric family. Further research is needed to determine how well the approach generalizes to other distributions.
The experimental results may be specific to the datasets and tasks studied, so further research is needed to determine whether they generalize to other settings.
Because the definition of "ideal distribution" depends on a specific notion of fairness, its applicability to other fairness definitions needs to be examined.
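
To make the dependence on the chosen fairness criterion concrete, here is a minimal sketch of two common group-fairness measures for binary predictions: the demographic parity gap (difference in positive-prediction rates between groups) and the equal opportunity gap (difference in true-positive rates). The function names and the toy data are assumptions for illustration and do not come from the paper; the code also assumes every group contains at least one positive example.

```python
def demographic_parity_gap(y_pred, group):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = []
    for g in sorted(set(group)):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        rates.append(sum(preds) / len(preds))
    return max(rates) - min(rates)


def equal_opportunity_gap(y_true, y_pred, group):
    """Largest difference in true-positive rate between any two groups.

    Assumes each group has at least one example with y_true == 1.
    """
    tprs = []
    for g in sorted(set(group)):
        preds_on_positives = [
            p for t, p, gg in zip(y_true, y_pred, group) if gg == g and t == 1
        ]
        tprs.append(sum(preds_on_positives) / len(preds_on_positives))
    return max(tprs) - min(tprs)


# Toy data (hypothetical): two groups of four examples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

print("demographic parity gap:", demographic_parity_gap(y_pred, group))
print("equal opportunity gap:", equal_opportunity_gap(y_true, y_pred, group))
```

On this toy data both groups receive positive predictions at the same rate, yet their true-positive rates differ, so a classifier can satisfy one criterion while violating the other.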