Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

TalkPlayData 2: An Agentic Synthetic Data Pipeline for Multimodal Conversational Music Recommendation

Created by
  • Haebom

Authors

Keunwoo Choi, Seungheon Doh, Juhan Nam

TalkPlayData 2: Multimodal Conversational Music Recommendation

Outline

This paper presents TalkPlayData 2, a synthetic dataset for multimodal conversational music recommendation, generated by an agentic data pipeline. In the proposed pipeline, multiple large language model (LLM) agents are created under different roles, each with a specialized prompt and access to a different part of the information. The conversation data is obtained by logging the dialogue between the Listener LLM and the Recsys LLM. To cover diverse conversational scenarios, the Listener LLM is conditioned on a fine-tuned conversation goal for each conversation. Finally, all of the LLMs are multimodal and can process both audio and images, which makes it possible to simulate multimodal recommendation and conversation. In LLM-as-a-judge and subjective evaluation experiments, TalkPlayData 2 achieves the proposed goals across various aspects related to training a generative recommendation model for music. TalkPlayData 2 and its generation code are open-sourced at https://talkpl.ai/talkplaydata2.html.
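The two-agent logging loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Agent`, `simulate_conversation`, and `echo_model`, the prompts, and the example goal are all hypothetical, and a stub function stands in for a real multimodal LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A "model" is any callable mapping (system_prompt, history) -> reply text.
# It is a placeholder for a real multimodal LLM call, which this sketch omits.
Model = Callable[[str, List[Dict[str, str]]], str]


@dataclass
class Agent:
    role: str           # "listener" or "recsys" (role names assumed here)
    system_prompt: str  # role-specific prompt, including any conversation goal
    model: Model

    def reply(self, history: List[Dict[str, str]]) -> str:
        return self.model(self.system_prompt, history)


def simulate_conversation(listener: Agent, recsys: Agent, num_turns: int) -> List[Dict[str, str]]:
    """Alternate listener and recsys messages and log every message."""
    log: List[Dict[str, str]] = []
    for _ in range(num_turns):
        for agent in (listener, recsys):
            log.append({"role": agent.role, "content": agent.reply(log)})
    return log


def echo_model(name: str) -> Model:
    """Deterministic stub that numbers each agent's messages."""
    def model(system_prompt: str, history: List[Dict[str, str]]) -> str:
        return f"{name} message {len(history) // 2 + 1}"
    return model


# Hypothetical per-conversation goal used to condition the listener.
goal = "find upbeat tracks for a morning run"
listener = Agent("listener", f"You are a music listener. Goal: {goal}", echo_model("listener"))
recsys = Agent("recsys", "You recommend tracks from a catalog.", echo_model("recsys"))

conversation = simulate_conversation(listener, recsys, num_turns=2)
for message in conversation:
    print(message["role"], "->", message["content"])
```

Keeping the model behind a plain callable means the stub could be swapped for a real LLM client without changing the logging loop. Giving each agent its own system prompt mirrors the idea that each role sees only its own instructions and information.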

Takeaways, Limitations

Generates a multimodal conversational music recommendation dataset using an agent-based pipeline.
Uses fine-tuned conversation goals to cover a variety of conversation scenarios.
Verifies that the dataset meets its goals through LLM-as-a-judge and subjective evaluations.
Provides the dataset and its generation code as open source.
Specific limitations are not stated in the summary.