Advances in large language models (LLMs) have enabled human-like social simulations at unprecedented scale and fidelity. However, constructing persona sets that authentically represent the diversity and distribution of real-world populations remains a critical challenge. In this paper, we propose a systematic framework for synthesizing high-quality, population-aligned persona sets for LLM-based social simulations. The framework begins by leveraging LLMs to generate narrative personas from long-term social media data and filtering out low-fidelity profiles through quality assessment. Importance sampling is then applied to achieve global alignment with reference psychometric distributions, such as the Big Five personality traits. To address the needs of specific simulation contexts, we add task-specific modules that adapt the globally aligned persona sets to target subpopulations. Extensive experiments demonstrate that our methodology significantly reduces population-level bias and enables accurate and flexible social simulations with broad research and policy applications.
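To illustrate the alignment step, the sketch below resamples a persona pool so that its Big Five trait marginals approach a reference distribution using importance weights. This is a minimal sketch, not the paper's implementation: the per-trait Gaussian modeling, the score representation, and names such as `Persona`, `reference_stats`, and `align_personas` are assumptions introduced here for illustration.

```python
# Minimal sketch of importance-sampling alignment (assumptions noted above).
from dataclasses import dataclass

import numpy as np
from scipy.stats import norm

TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]


@dataclass
class Persona:
    profile: str   # narrative profile generated by the LLM
    big5: dict     # trait name -> score (assumed continuous, e.g. a 1-5 scale)


def importance_weights(personas, reference_stats):
    """Weight each persona by p_ref(x) / p_pool(x), modeling each trait as a Gaussian."""
    scores = np.array([[p.big5[t] for t in TRAITS] for p in personas])
    # Proposal density: per-trait Gaussians fitted to the raw persona pool itself.
    pool_mu, pool_sd = scores.mean(axis=0), scores.std(axis=0) + 1e-8
    log_w = np.zeros(len(personas))
    for j, trait in enumerate(TRAITS):
        ref_mu, ref_sd = reference_stats[trait]      # reference population norms
        log_w += norm.logpdf(scores[:, j], ref_mu, ref_sd)
        log_w -= norm.logpdf(scores[:, j], pool_mu[j], pool_sd[j])
    w = np.exp(log_w - log_w.max())                  # stabilize before normalizing
    return w / w.sum()


def align_personas(personas, reference_stats, n, seed=0):
    """Sampling-importance-resampling: draw an aligned persona set of size n."""
    rng = np.random.default_rng(seed)
    w = importance_weights(personas, reference_stats)
    idx = rng.choice(len(personas), size=n, replace=True, p=w)
    return [personas[i] for i in idx]
```

Given a quality-filtered pool and published norm statistics for the target population, `align_personas(pool, reference_stats, n=1000)` would return a resampled set whose trait marginals approximate the reference; the paper's actual estimator and reference data may differ.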
Takeaways, Limitations
• Takeaways:
◦ Presents a systematic framework for synthesizing high-quality, population-aligned persona sets for LLM-based social simulation.
◦ Generates realistic narrative personas from long-term social media data, filtering out low-fidelity profiles.
◦ Aligns the persona set to reference psychometric distributions (e.g., Big Five personality traits) through importance sampling.
◦ Adapts the aligned persona sets to specific simulation contexts via task-specific modules.
◦ Reduces population-level bias and enables accurate, flexible social simulations.
• Limitations:
◦ Specific limitations are not stated in the paper (this appears to be an abstract-based summary).