Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Analyzing Character Representation in Media Content using Multimodal Foundation Model: Effectiveness and Trust

Created by
  • Haebom

Authors

Evdoxia Taka, Debadyuti Bhattacharya, Joanne Garde-Hansen, Sanjay Sharma, Tanaya Guha

Outline

This paper presents an AI-based tool for analyzing character representation in media content and evaluates its usability and trustworthiness through a user study. Using an analysis pipeline built on Contrastive Language-Image Pretraining (CLIP), the tool quantifies character representation by gender and age from image data and includes visualization components that present the results to general users. The user study found that participants understood the visualized results and considered the tool usable overall, but they also asked for visualizations with more detailed demographic categories and more contextual information. Participants' trust in the AI-based gender and age models was moderate to low, yet they did not oppose the use of these models. The tool code, benchmarking, and user study data are available on GitHub.
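
As a rough illustration of how a CLIP-style model can turn images into per-category shares, the sketch below labels each face embedding with the closest text prompt (cosine similarity followed by a softmax) and then aggregates the labels into proportions. Everything in it is an assumption made for illustration: the prompt wording, the three-dimensional toy vectors standing in for real CLIP embeddings, the temperature value, and the function names `zero_shot_label` and `representation_shares` are hypothetical, and the paper's actual pipeline may differ.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_label(image_vec, prompt_vecs, temperature=0.01):
    """Pick the prompt closest to the image embedding.

    Returns (best_prompt, probabilities), where the probabilities come from
    a softmax over temperature-scaled cosine similarities. The temperature
    here is an illustrative value, not one taken from the paper.
    """
    labels = list(prompt_vecs)
    logits = [cosine(image_vec, prompt_vecs[label]) / temperature for label in labels]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


def representation_shares(image_vecs, prompt_vecs):
    """Fraction of images assigned to each prompt."""
    counts = Counter(zero_shot_label(vec, prompt_vecs)[0] for vec in image_vecs)
    n = len(image_vecs)
    return {label: counts.get(label, 0) / n for label in prompt_vecs}


# Toy stand-ins for text and image embeddings (NOT real CLIP outputs).
GENDER_PROMPTS = {
    "a photo of a woman": [1.0, 0.1, 0.0],
    "a photo of a man": [0.1, 1.0, 0.0],
}
FACE_VECS = [
    [0.9, 0.2, 0.1],
    [0.2, 0.8, 0.0],
    [0.7, 0.3, 0.2],
    [0.1, 0.9, 0.1],
]

shares = representation_shares(FACE_VECS, GENDER_PROMPTS)
print(shares)
```

Age groups could be handled the same way by swapping in a second set of prompts. The softmax probabilities returned by `zero_shot_label` could also feed a visualization that shows how certain each label is, which relates to the trust concerns raised by participants.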

Takeaways, Limitations

Takeaways:
Empirically verifies the usefulness of an AI-based character representation analysis tool.
Emphasizes the importance of visualizing AI-based analysis results for general users.
Suggests directions for further research to improve the reliability of the AI models and meet user needs.
Limitations:
The user study was limited in the number and diversity of its participants.
The demographic categories included in the analysis are limited.
Participants' trust in the AI gender and age models was moderate to low.
The visualizations need finer-grained detail and more contextual information.