Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data

Created by
  • Haebom

Authors

Juntao Tan, Liangwei Yang, Zuxin Liu, Zhiwei Liu, Rithesh Murthy, Tulika Manoj Awalgaonkar, Jianguo Zhang, Weiran Yao, Ming Zhu, Shirley Kokane, Silvio Savarese, Huan Wang, Caiming Xiong, Shelby Heinecke

Outline

This paper highlights the importance of personalization for AI assistants, particularly private AI models that leverage a user's private data. We focus on evaluating the ability of AI models to access and interpret users' private data (e.g., conversation history, user-AI interactions, app usage history) in order to understand users' personal information (e.g., biographical information, preferences, social relationships). Because such data are sensitive, publicly available datasets are scarce, so we present a synthetic data generation pipeline that produces private documents simulating diverse and realistic user profiles and personal activities. Building on this, we propose a benchmark, PersonaBench, to evaluate how well AI models understand personal information extracted from simulated private user data, and we use it to evaluate a Retrieval-Augmented Generation (RAG) pipeline. Our results reveal that current RAG-based AI models struggle to extract personal information from user documents and answer private questions, highlighting the need for improved methodologies to enhance AI's personalization capabilities.
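
To make the evaluation setup concrete, here is a minimal sketch of how a retrieval step over synthetic private documents could be scored. It is not the PersonaBench implementation: the documents, questions, stopword list, bag-of-words retriever, and the names `retrieve` and `recall_at_k` are all hypothetical, chosen only to illustrate the idea of checking whether the document that holds a personal fact is retrieved for a question about that fact.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "is", "of", "a", "what", "who", "s", "at", "on", "with", "my", "i"}

def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question (toy lexical retriever)."""
    q = Counter(tokenize(question))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d["text"]))), reverse=True)
    return ranked[:k]

def recall_at_k(examples, documents, k=1):
    """Fraction of questions whose gold document appears among the top-k retrieved."""
    hits = 0
    for ex in examples:
        retrieved_ids = {d["id"] for d in retrieve(ex["question"], documents, k)}
        hits += ex["gold_doc_id"] in retrieved_ids
    return hits / len(examples)

# Made-up synthetic "private" documents for a made-up user (assumption, not paper data).
DOCUMENTS = [
    {"id": "chat-1", "text": "Alex: I just adopted a beagle puppy named Biscuit."},
    {"id": "chat-2", "text": "Alex: Dinner with my sister Maya on Friday at the Thai place."},
    {"id": "log-1", "text": "Purchase history: hiking boots, trail map, water bottle."},
]

# Made-up questions, each paired with the document that contains the answer.
EXAMPLES = [
    {"question": "What is the name of the user's puppy?", "gold_doc_id": "chat-1"},
    {"question": "Who is the user's sister?", "gold_doc_id": "chat-2"},
    {"question": "Which outdoor hobby does the user enjoy?", "gold_doc_id": "log-1"},
]

if __name__ == "__main__":
    for k in (1, 3):
        print(f"recall@{k} = {recall_at_k(EXAMPLES, DOCUMENTS, k):.2f}")
```

The third question shares no words with the purchase log that implies the answer, so a purely lexical retriever cannot rank that document by relevance. This is only a toy illustration of why implicit personal information can be hard to retrieve; it says nothing about the systems or scores reported in the paper.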

Takeaways, Limitations

Takeaways: We present a new benchmark, PersonaBench, for evaluating the personalization capabilities of AI models that use private personal data. By revealing how current RAG-based AI models fall short in understanding personal information, we point to directions for future research. We also present a synthetic data generation pipeline that produces data resembling real user data while avoiding the privacy concerns of using real data.
Limitations: PersonaBench is based on synthetic data, so its results may differ from evaluations using real user data. Because the evaluation is limited to a RAG pipeline, further research is needed to evaluate other types of AI models. The synthetic data may not fully reflect the diversity and complexity of real personal information.