Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

FairSHAP: Preprocessing for Fairness Through Attribution-Based Data Augmentation

Created by
  • Haebom

Authors

Lin Zhu, Yijun Bian, Lei You

FairSHAP: A Fairness Improvement Preprocessing Framework Based on Shapley Values

Outline

This paper introduces FairSHAP, a preprocessing framework that uses Shapley values to improve fairness in machine learning models. FairSHAP relies on Shapley-value feature attributions, an interpretable measure of feature importance, to identify the training instances that contribute to unfairness, and then systematically corrects them through instance-level matching across sensitive groups. This process improves individual fairness metrics, such as discriminatory risk, while preserving data integrity and model accuracy. Across diverse tabular datasets, FairSHAP significantly improves demographic parity and equality of opportunity, achieves these gains while modifying the data only minimally, and in some cases also improves predictive performance. FairSHAP is model-independent and transparent, integrates easily into existing machine learning pipelines, and provides actionable insights into the causes of bias.
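The two group fairness metrics named above can be illustrated with a minimal sketch. It assumes binary labels, binary predictions, and a binary sensitive attribute, and uses the common textbook definitions (absolute gap in positive prediction rate, and absolute gap in true positive rate). The function names and the toy data are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def positive_rate(y_pred: Sequence[int], mask: Sequence[bool]) -> float:
    """Share of positive predictions among the instances selected by mask."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    if not selected:
        raise ValueError("mask selects no instances")
    return sum(selected) / len(selected)


def demographic_parity_difference(y_pred: Sequence[int],
                                  sensitive: Sequence[int]) -> float:
    """Absolute gap in positive prediction rate between the two groups."""
    group_1 = [s == 1 for s in sensitive]
    group_0 = [s == 0 for s in sensitive]
    return abs(positive_rate(y_pred, group_1) - positive_rate(y_pred, group_0))


def equal_opportunity_difference(y_true: Sequence[int],
                                 y_pred: Sequence[int],
                                 sensitive: Sequence[int]) -> float:
    """Absolute gap in true positive rate between the two groups."""
    group_1 = [s == 1 and t == 1 for s, t in zip(sensitive, y_true)]
    group_0 = [s == 0 and t == 1 for s, t in zip(sensitive, y_true)]
    return abs(positive_rate(y_pred, group_1) - positive_rate(y_pred, group_0))


# Toy data (hypothetical): eight instances, four per sensitive group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
sensitive = [1, 1, 1, 1, 0, 0, 0, 0]

dp = demographic_parity_difference(y_pred, sensitive)
eo = equal_opportunity_difference(y_true, y_pred, sensitive)
print(f"demographic parity difference: {dp:.2f}")
print(f"equal opportunity difference:  {eo:.2f}")
```

A value of 0 on either metric means the two groups are treated identically under that criterion. A preprocessing method such as FairSHAP would aim to reduce these gaps for a model trained on the corrected data.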

Takeaways, Limitations

Takeaways:
Presents an interpretable preprocessing framework that uses Shapley values to improve fairness.
Reduces discriminatory risk by identifying and correcting the data instances that cause unfairness.
Demonstrates improvements in demographic parity and equality of opportunity.
Preserves data integrity and model accuracy.
Is model-independent and easy to integrate into existing pipelines.
Provides insight into the causes of bias.
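As a rough illustration of the individual fairness notion mentioned in the takeaways, the sketch below estimates discriminatory risk as the fraction of instances whose prediction changes when only the sensitive attribute is flipped. This definition is an assumption made for illustration, since the summary does not define the metric; the classifier `toy_model`, the column layout, and the data are likewise hypothetical.

```python
from typing import Callable, List, Sequence

Row = List[float]


def discriminatory_risk(predict: Callable[[Row], int],
                        rows: Sequence[Row],
                        sensitive_idx: int) -> float:
    """Fraction of rows whose prediction changes when the binary
    sensitive attribute (assumed to be 0 or 1) is flipped."""
    if not rows:
        raise ValueError("rows must be non-empty")
    changed = 0
    for row in rows:
        flipped = list(row)  # copy so the original row is untouched
        flipped[sensitive_idx] = 1 - flipped[sensitive_idx]
        if predict(row) != predict(flipped):
            changed += 1
    return changed / len(rows)


def toy_model(row: Row) -> int:
    """Hypothetical classifier that leans on the sensitive attribute
    stored in column 0 in addition to the feature in column 1."""
    return int(0.5 * row[0] + row[1] >= 1.0)


# Toy data (hypothetical): [sensitive attribute, feature].
rows = [[1, 0.6], [0, 0.6], [1, 0.2], [0, 1.2]]

risk = discriminatory_risk(toy_model, rows, sensitive_idx=0)
print(f"discriminatory risk: {risk:.2f}")
```

Under this assumed definition, a model that ignores the sensitive attribute entirely scores 0, while instances that sit close to the decision boundary are the ones most likely to flip.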
Limitations:
The paper alone does not give detailed information about the specific datasets, the size of the performance improvements, or the algorithm's complexity.
Only GitHub links are provided, so the information needed for actual implementation and application may be hard to access.
The effectiveness of FairSHAP may be limited to certain datasets or problem types.