Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Personalized LLM Decoding via Contrasting Personal Preferences

Created by
  • Haebom

Author

Hyungjun Bu, Chanjoo Jung, Minjae Kang, Jaehyung Kim

Outline

As large language models (LLMs) are increasingly used in diverse real-world applications, personalization of LLMs has become increasingly important. While various personalization approaches, including prompt-based and training-based methods, have been actively studied, effective decoding-time algorithms remain underexplored despite their potential. This paper proposes Contrasting Personal Preferences (CoPe), a novel decoding-time approach applied after parameter-efficient fine-tuning (PEFT) on user-specific data. The core idea is to maximize each user's implicit reward signal, i.e., reward-guided decoding specialized for personalization. CoPe is evaluated on five open-ended personalized text generation tasks, where it achieves robust personalization, with an average improvement of 10.57% in ROUGE-L, without relying on an external reward model or additional training procedures.
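A minimal sketch of what such decoding-time contrast could look like, assuming a PEFT-tuned personalized model and its base model loaded with Hugging Face `transformers`/`peft`. The model name, adapter path, weight `alpha`, and the greedy decoding loop are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: contrast a user-specific PEFT model against the base model at decoding time.
# The log-probability gap acts as an implicit per-token reward that guides generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "meta-llama/Llama-2-7b-hf"            # assumed base model
adapter_path = "path/to/user_specific_adapter"    # assumed user-specific PEFT adapter

tokenizer = AutoTokenizer.from_pretrained(base_name)
base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.float16)
personal = PeftModel.from_pretrained(
    AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.float16),
    adapter_path,
)
base.eval(); personal.eval()

@torch.no_grad()
def contrastive_personal_decode(prompt: str, max_new_tokens: int = 64, alpha: float = 1.0) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        logp_personal = personal(ids).logits[:, -1].log_softmax(-1)
        logp_base = base(ids).logits[:, -1].log_softmax(-1)
        # Implicit reward: how much more the personalized model prefers each token
        # than the base model does; add it to the personalized distribution as guidance.
        scores = logp_personal + alpha * (logp_personal - logp_base)
        next_id = scores.argmax(dim=-1, keepdim=True)  # greedy pick for simplicity
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

No external reward model is involved here: the only extra cost is a second forward pass through the base model at each decoding step.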

Takeaways, Limitations

Takeaways:
Proposes CoPe, a novel decoding-time personalization method applied after PEFT.
Achieves strong personalization performance with an average 10.57% improvement in ROUGE-L.
Requires no external reward model or additional training procedures.
Limitations:
The paper does not explicitly discuss its limitations.