Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation

Created by
  • Haebom

Author

Hongxiang Zhang, Hao Chen, Muhao Chen, Tianyi Zhang

Outline

This paper proposes Active Layer-Contrastive Decoding (ActLCD), a novel decoding strategy for reducing hallucination in large language models (LLMs). Unlike existing token-level methods, ActLCD uses a reinforcement-learning policy to dynamically decide when to apply contrasting layers during generation. This approach casts decoding as a sequential decision-making problem and optimizes factuality through a reward-aware classifier. Experimental results show that ActLCD outperforms state-of-the-art methods across five benchmarks.
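The per-step mechanism described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the simple final-minus-premature logit subtraction stands in for DoLa-style layer contrast, and where ActLCD trains an RL policy to choose the action, this sketch simply takes the action as an argument.

```python
def contrastive_logits(final_logits, premature_logits, alpha=1.0):
    # Layer contrast: emphasize what the final layer predicts
    # relative to an earlier "premature" layer of the same model.
    return [f - alpha * p for f, p in zip(final_logits, premature_logits)]

def decode_step(final_logits, premature_logits, apply_contrast):
    # The policy's binary action at this decoding step:
    # use contrasted logits, or fall back to plain greedy decoding.
    logits = (contrastive_logits(final_logits, premature_logits)
              if apply_contrast else final_logits)
    return max(range(len(logits)), key=logits.__getitem__)

# Toy logits for a 3-token vocabulary: the final layer narrowly prefers
# token 0, but the premature layer is also confident about token 0,
# so contrasting shifts the choice to token 1.
final = [3.0, 2.9, 1.0]
premature = [3.5, 0.5, 1.0]

print(decode_step(final, premature, apply_contrast=False))  # 0
print(decode_step(final, premature, apply_contrast=True))   # 1
```

The point of making the contrast decision per step is that contrasting helps on fact-bearing tokens but can hurt fluency elsewhere, which is why the paper frames it as sequential decision-making rather than applying contrast uniformly.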

Takeaways, Limitations

Takeaways:
  • Presents a new approach to reducing factual errors in LLMs.
  • Moves beyond the limitations of token-level approaches, enabling factuality control that accounts for the entire context.
  • Improves efficiency through reinforcement-learning-based dynamic layer selection.
  • Shows strong performance across diverse generation scenarios.
Limitations:
  • The computational cost of the proposed method may be higher than that of existing methods.
  • ActLCD's performance may depend on the quality of the reward-aware classifier.
  • Generalization may suffer if the policy is trained on a dataset biased toward a specific domain.
  • Long-form generation may increase computational complexity and degrade performance.