Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

PakBBQ: A Culturally Adapted Bias Benchmark for QA

Created by
  • Haebom

Authors

Abdullah Hashmat, Muhammad Arham Mirza, Agha Ali Raza

PakBBQ: a bias benchmark adapted to Pakistani culture and regional context

Outline

This paper introduces PakBBQ, a culturally and regionally adapted extension of the original BBQ (Bias Benchmark for Question Answering) dataset, designed to evaluate the fairness of large language models (LLMs) in a low-resource linguistic and regional context. PakBBQ contains over 214 templates and 17,180 question-answer (QA) pairs in English and Urdu, covering eight bias dimensions relevant to Pakistan: age, disability, appearance, gender, socioeconomic status, religion, regional affiliation, and linguistic formality. We evaluate a variety of multilingual LLMs under ambiguous and explicitly disambiguated contexts, and under negative versus non-negative question framings. Experimental results demonstrate an average accuracy improvement of 12% with disambiguation, consistently stronger counter-bias behavior in Urdu than in English, and a reduction in stereotypical responses when questions are framed negatively.
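The evaluation setup described above (ambiguous vs. disambiguated contexts crossed with negative vs. non-negative question framings) can be illustrated with a minimal sketch. The `BBQItem` structure, its field names, and the `gold` mapping below are hypothetical illustrations for this summary, not the actual PakBBQ schema.

```python
# Minimal sketch of expanding one BBQ-style item into the four evaluation
# conditions: {ambiguous, disambiguated} x {negative, non-negative} framing.
# Field names are hypothetical, not the actual PakBBQ schema.
from dataclasses import dataclass
from itertools import product

@dataclass
class BBQItem:
    ambiguous_context: str        # context that does not reveal which person is meant
    disambiguating_context: str   # extra sentence that resolves the ambiguity
    negative_question: str        # question probing a negative stereotype
    nonnegative_question: str     # the same question with non-negative framing
    answers: tuple                # e.g. (group_a, group_b, "Unknown")
    gold: dict                    # gold answer index per (context, framing) condition;
                                  # under an ambiguous context this is typically "Unknown"

def expand_conditions(item: BBQItem):
    """Yield (context_type, framing, context, question, gold_index) for all four settings."""
    contexts = {
        "ambiguous": item.ambiguous_context,
        "disambiguated": item.ambiguous_context + " " + item.disambiguating_context,
    }
    questions = {
        "negative": item.negative_question,
        "non-negative": item.nonnegative_question,
    }
    for (ctx_name, ctx), (q_name, q) in product(contexts.items(), questions.items()):
        yield ctx_name, q_name, ctx, q, item.gold[(ctx_name, q_name)]
```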

Takeaways, Limitations

Takeaways:
Highlights the importance of contextualized benchmarks and demonstrates the effectiveness of simple prompt engineering strategies for bias mitigation in low-resource settings.
Demonstrates the need for culturally and regionally specific datasets when assessing bias in multilingual LLMs.
Suggests that counter-bias behavior can be stronger in a low-resource language such as Urdu than in English.
Shows that question framing (negative vs. non-negative) can influence an LLM's response bias (see the scoring sketch after this list).
Limitations:
No specific limitations are explicitly mentioned in the paper (this summary is based solely on the abstract).
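As a rough illustration of how the reported disambiguation gains and framing effects could be measured, the sketch below scores a model separately per condition, reusing `expand_conditions` from the earlier sketch. `query_model` is a placeholder for whichever LLM API is being evaluated and is not part of PakBBQ.

```python
# Hedged sketch of per-condition scoring: for each of the four conditions, ask
# the model to pick one of the answer options and compare against the gold index.
# `query_model` is a hypothetical stand-in for the evaluated LLM's API.
def accuracy_by_condition(items, query_model):
    totals, correct = {}, {}
    for item in items:
        for ctx_name, q_name, ctx, question, gold in expand_conditions(item):
            prompt = (
                f"{ctx}\n\n{question}\n"
                + "\n".join(f"{i}. {a}" for i, a in enumerate(item.answers))
                + "\nAnswer with the option number only."
            )
            pred = int(query_model(prompt))  # assumed to return "0", "1", or "2"
            key = (ctx_name, q_name)
            totals[key] = totals.get(key, 0) + 1
            correct[key] = correct.get(key, 0) + int(pred == gold)
    return {k: correct[k] / totals[k] for k in totals}
```

Comparing the resulting accuracies across the "ambiguous" and "disambiguated" keys, and across the two framings, is one way the kinds of disambiguation gains and framing effects summarized above could be surfaced.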