Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
The summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse

Created by
  • Haebom

Author

Alkis Kalavasis, Anay Mehrotra, Grigoris Velegkas

Outline

We study whether a language model trained on samples from an unknown language K can generate valid strings it has never seen, capturing the full richness of the language. We ask whether generation with both consistency (emitting only valid strings) and breadth (the model's output converging to all unseen strings in K as the training data grows) is possible, and show that this is impossible for a wide range of language models, including next-token prediction models. We also show that consistent generation with breadth becomes possible when negative examples (strings outside K) are provided.
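A minimal toy sketch of the intuition behind the positive result (my own illustration, not the paper's construction): over a small hypothetical hypothesis class of candidate languages, negative examples let a generator prune every wrong candidate, so its output is both consistent (no invalid strings) and has breadth (it covers the whole true language). All names and the hypothesis class `K_n` are assumptions for illustration.

```python
# Toy illustration: identification-style generation with negative examples.
# Hypothesis class (an assumption for this sketch): K_n = {"a"*k : 1 <= k <= n}.
# The true language is K_3 = {"a", "aa", "aaa"}.
candidates = {n: {"a" * k for k in range(1, n + 1)} for n in range(1, 6)}

positives = ["a", "aaa"]       # strings observed as members of K
negatives = ["aaaa", "aaaaa"]  # feedback: strings known to lie outside K

def surviving(candidates, positives, negatives):
    """Keep only candidate languages consistent with both kinds of examples."""
    return {
        n: lang for n, lang in candidates.items()
        if all(p in lang for p in positives)
        and all(neg not in lang for neg in negatives)
    }

def generate(candidates, positives, negatives):
    """Generate from the smallest surviving candidate (an Occam-style choice)."""
    alive = surviving(candidates, positives, negatives)
    # With only positive examples, K_3, K_4, and K_5 all survive, so a generator
    # must gamble between consistency and breadth. The negative examples
    # eliminate K_4 and K_5, leaving exactly the true language.
    n = min(alive)
    return sorted(alive[n])

print(generate(candidates, positives, negatives))  # ['a', 'aa', 'aaa']
```

The point of the sketch is the comment inside `generate`: without negatives, no choice among the surviving candidates is safe, which mirrors the trade-off the paper formalizes.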

Takeaways, Limitations

Takeaways:
Generating language with both consistency and breadth is impossible for a wide class of language models, including next-token prediction models.
Generation with breadth is fundamentally different from generation without it.
Feedback in the form of negative examples can be important for reducing hallucination while limiting mode collapse.
Near-tight bounds on the number of samples needed for consistency and breadth are established.
Limitations:
The impossibility result applies only to the specified range of models; behavior outside that class is not characterized.
No concrete details are given on how to implement negative-example feedback or how effective it would be in practice.
Further research is needed to determine how these results can be applied to improve the performance of real-world language models.