Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized by Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Vision-Language Models display a strong gender bias

Created by
  • Haebom

Authors

Aiswarya Konavoor, Raj Abhijit Dandekar, Rajat Dandekar, Sreedath Panat

Outline

This paper investigates how Vision-Language Models (VLMs) reflect gender stereotypes in the similarity between embeddings of facial images and of phrases describing occupations and activities. Using a dataset of 220 facial images differentiated by gender and 150 phrases spanning six categories (emotional labor, cognitive labor, domestic labor, technical labor, professional labor, and manual labor), the authors compute a gender relevance score for each phrase as the difference in cosine similarity between the phrase embedding and the male versus female image embeddings. Confidence intervals are obtained by bootstrapping and compared against the value expected in the absence of any gender structure, yielding a robust framework for assessing gender bias. The result is a set of gender relevance maps over phrases and categories.
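The scoring procedure can be sketched in a few lines of NumPy. The snippet below is an illustrative sketch, not the authors' implementation: it assumes face and phrase embeddings have already been extracted with a CLIP-style VLM, and the 110/110 image split, function names, and bootstrap settings are all placeholder assumptions.

```python
import numpy as np

def cosine_sim(images, phrase):
    # Row-wise cosine similarity between image embeddings (n, d) and one phrase embedding (d,)
    images = images / np.linalg.norm(images, axis=1, keepdims=True)
    phrase = phrase / np.linalg.norm(phrase)
    return images @ phrase

def gender_relevance(phrase_emb, male_embs, female_embs):
    # Difference in mean cosine similarity between the male and female image sets.
    # Positive values indicate the phrase sits closer to the male face embeddings.
    return cosine_sim(male_embs, phrase_emb).mean() - cosine_sim(female_embs, phrase_emb).mean()

def bootstrap_ci(phrase_emb, male_embs, female_embs, n_boot=10_000, alpha=0.05, seed=0):
    # Resample images with replacement to get a confidence interval for the score.
    rng = np.random.default_rng(seed)
    scores = np.empty(n_boot)
    for i in range(n_boot):
        m = male_embs[rng.integers(0, len(male_embs), len(male_embs))]
        f = female_embs[rng.integers(0, len(female_embs), len(female_embs))]
        scores[i] = gender_relevance(phrase_emb, m, f)
    lo, hi = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Toy usage with random stand-ins for VLM embeddings (hypothetical 110/110 split of 220 images).
d = 512
male_embs = np.random.randn(110, d)    # male face image embeddings
female_embs = np.random.randn(110, d)  # female face image embeddings
phrase_emb = np.random.randn(d)        # embedding of one occupation/activity phrase
score = gender_relevance(phrase_emb, male_embs, female_embs)
lo, hi = bootstrap_ci(phrase_emb, male_embs, female_embs, n_boot=1000)
print(f"gender relevance score: {score:.4f}, 95% CI: ({lo:.4f}, {hi:.4f})")
```

Repeating this for all 150 phrases and comparing each interval to the zero-association baseline would reproduce the kind of per-phrase and per-category relevance map described above.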

Takeaways, Limitations

Takeaways:
A robust framework for quantitatively assessing whether VLMs reflect gender stereotypes is presented.
Category-level analysis of gender associations with occupations and activities reveals specific biases.
By exposing latent gender bias in VLMs, the work highlights the importance of developing fair and ethical models.
Limitations:
The facial image and phrase datasets (220 images, 150 phrases) are relatively small.
Classification based on perceived gender is subjective and may not match individuals' actual gender.
The results may be specific to the VLM architecture analyzed and may not generalize to other models.
Occupations and activities beyond the six categories used in the analysis remain to be examined.