Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All summaries here are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models

Created by
  • Haebom

Authors

Jianyi Zhang, Da-Cheng Juan, Cyrus Rashtchian, Chun-Sung Ferng, Heinrich Jiang, Yiran Chen

Outline

This paper proposes Self Logits Evolution Decoding (SLED), a novel decoding framework for improving the reliability and factual accuracy of large language model (LLM) outputs. SLED leverages knowledge latent within the LLM itself, without requiring an external knowledge base or additional fine-tuning: it contrasts the output logits of the final layer with those of earlier layers and uses an approximate gradient approach to let this latent knowledge drive self-refinement of the output distribution. Extensive experiments across model families and sizes (1B to 45B), including Gemma, Qwen, Mixtral, and gpt-oss, as well as advanced architectures such as mixture-of-experts (MoE), show that SLED consistently improves factual accuracy over existing decoding methods while maintaining natural language fluency and incurring negligible latency overhead. It can also be flexibly combined with other decoding methods to further improve performance.
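To make the mechanism concrete, below is a minimal conceptual sketch of the contrast-and-update idea in PyTorch. It assumes per-token logits are available both from the final layer and from applying the output head to earlier layers; the function name `evolve_logits`, the step size `alpha`, and the way the latent distribution is estimated are illustrative assumptions, not the authors' exact algorithm.

```python
import torch
import torch.nn.functional as F

def evolve_logits(final_logits, early_logits_list, alpha=0.1, temperature=1.0):
    """Conceptual sketch of a SLED-style logits update for one decoding step.

    final_logits:      [vocab_size] logits from the model's last layer
    early_logits_list: list of [vocab_size] logits obtained by applying the
                       output head to earlier layers' hidden states
    alpha:             step size for the approximate gradient update (assumed)
    """
    p_final = F.softmax(final_logits / temperature, dim=-1)

    # Contrast the final distribution with each early-layer distribution to
    # estimate a "latent" distribution that emphasizes tokens whose probability
    # grows across layers -- a rough proxy for the model's latent knowledge.
    contrast = torch.zeros_like(p_final)
    for early_logits in early_logits_list:
        p_early = F.softmax(early_logits / temperature, dim=-1)
        contrast += F.relu(p_final - p_early)
    p_latent = contrast / contrast.sum().clamp_min(1e-9)

    # The gradient of KL(p_latent || p_final) w.r.t. the final logits is
    # (p_final - p_latent); take one descent step so the output distribution
    # moves toward the estimated latent distribution.
    return final_logits - alpha * (p_final - p_latent)
```

In an actual decoder, this update would run at each generation step before sampling, which is consistent with the negligible latency overhead reported in the outline.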

Takeaways, Limitations

Takeaways:
  • Presents a novel decoding method that improves the factual accuracy of LLMs without an external knowledge base or additional fine-tuning.
  • Applies across diverse model architectures and sizes and outperforms existing decoding methods.
  • Can be flexibly combined with other decoding methods for further performance gains.
  • Maintains natural language fluency with negligible latency overhead.
Limitations:
  • The generalizability of the reported experimental results requires further verification.
  • Further research is needed to confirm whether SLED's gains hold across all LLM types and tasks.
  • Potential performance degradation stemming from the approximate gradient approach requires further analysis.