Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Enhancing Diversity in Large Language Models via Determinantal Point Processes

Created by
  • Haebom

Author

Yilei Chen, Souradip Chakraborty, Lorenz Wolf, Ioannis Ch. Paschalidis, Aldo Pacchiano

Outline

This paper highlights that post-training methods for large language models (LLMs), such as supervised fine-tuning and reinforcement learning, improve model performance but reduce output diversity, leading to narrow and stereotypical responses. Existing diversity-enhancing methods are limited: they either operate only at inference time or focus solely on lexical differences. To address this, the paper proposes DQO, a novel training method based on determinantal point processes (DPPs). For each prompt, DQO samples multiple responses, embeds them, and measures diversity as the volume spanned by the response embeddings. Experiments on a range of tasks (instruction following, summarization, story generation, and reasoning) show that DQO substantially improves semantic diversity without compromising model quality.
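The paper summary does not include code; as a rough illustration of the volume-based idea, the sketch below scores a batch of sampled responses by the log-determinant of the cosine-similarity Gram matrix of their embeddings. The kernel choice, the jitter term, and the toy data are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def dpp_diversity(embeddings: np.ndarray, eps: float = 1e-6) -> float:
    """Log-volume (log-determinant) of the Gram matrix of response embeddings.

    Larger values mean the responses span a larger volume in embedding space,
    i.e. they are more semantically diverse. (Illustrative sketch, not the
    paper's implementation.)
    """
    # Normalize so the kernel is a cosine-similarity Gram matrix.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    K = X @ X.T                      # n x n similarity kernel
    K += eps * np.eye(len(K))        # small jitter for numerical stability
    _, logdet = np.linalg.slogdet(K)
    return logdet

# Toy usage: 4 responses embedded in an 8-dimensional space.
rng = np.random.default_rng(0)
diverse = rng.normal(size=(4, 8))                                  # unrelated responses
similar = np.tile(rng.normal(size=(1, 8)), (4, 1)) + 0.01 * rng.normal(size=(4, 8))
print(dpp_diversity(diverse) > dpp_diversity(similar))             # True: diverse set spans more volume
```

In a DQO-style training objective, a score like this would act as a diversity signal alongside the usual quality reward, encouraging the model to spread its sampled responses apart in embedding space rather than collapsing onto a single typical answer.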

Takeaways, Limitations

Takeaways:
Presents a novel training method (DQO) that uses determinantal point processes (DPPs) to optimize the quality and semantic diversity of LLMs simultaneously.
Overcomes the limitations of existing methods, which operate only at inference time or capture only lexical differences.
Demonstrates improved semantic diversity across various tasks while maintaining model quality.
Limitations:
The computational cost of DPP-based diversity measurement can be high.
Results may depend on the choice of kernel.
Further research is needed on the generalization performance of the proposed method.