Daily Arxiv

This page curates AI-related papers published worldwide.
Summaries are generated with Google Gemini, and the site is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in Product QA Agents

Created by
  • Haebom

Authors

Ashley Lewis, Michael White, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang

Outline

In this paper, we propose a retrieval-based question answering (QA) pipeline for customer support that tackles two obstacles limiting the use of large language models (LLMs) in this domain: hallucination (fabricated information) and the high cost of proprietary models, while exploring the balance between human intervention and automation. Using a dataset of questions about Samsung Smart TV user manuals, we show that synthetic data generated by LLMs is more effective than crowdsourced data at reducing hallucination in fine-tuned models. We also compare self-training (fine-tuning a model on its own outputs) with knowledge distillation (fine-tuning on the outputs of a stronger model, e.g., GPT-4), and find that self-training achieves a comparable reduction in hallucination; we speculate this is because knowledge distillation suffers from increased exposure bias, a hypothesis we support with additional analysis. In addition, we improve robustness to unanswerable questions and retrieval failures by producing context-sensitive “I don’t know” responses. These results demonstrate that synthetic data and self-training with open-source models can be used to build scalable, cost-effective QA systems, reducing reliance on proprietary tools and expensive human annotation.
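To make the pipeline concrete, here is a minimal sketch of a retrieval-gated QA loop with a context-sensitive “I don’t know” fallback. This is an illustration under assumptions, not the authors’ code: `manual_index`, `generator`, and the 0.35 score threshold are hypothetical stand-ins.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    score: float  # retriever relevance score, assumed normalized to [0, 1]

def answer(question: str, manual_index, generator, threshold: float = 0.35) -> str:
    # Retrieve candidate passages from the product manual (hypothetical index API).
    passages: list[Passage] = manual_index.search(question, top_k=3)
    # Unanswerable question or retrieval failure: decline instead of guessing,
    # and mention the topic so the refusal stays context-sensitive.
    if not passages or passages[0].score < threshold:
        return ("I don't know; the manual does not seem to cover "
                f"this question: {question!r}")
    context = "\n".join(p.text for p in passages)
    # The fine-tuned open-source model answers strictly from retrieved context.
    return generator(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```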

Takeaways, Limitations

Takeaways:
Self-training on LLM-generated synthetic data is a cost-effective way to reduce hallucination when building an LLM-based customer support system (see the sketch after this list).
Open-source models combined with self-training can reduce reliance on proprietary models.
Context-sensitive “I don’t know” responses improve the system’s robustness to unanswerable questions and retrieval failures.
Limitations:
The dataset covers only Samsung Smart TV user manuals, so generalizability to other domains requires further study.
Why knowledge distillation was less effective than expected calls for deeper analysis.
Evaluation in a live customer support environment remains to be done.
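
As a companion to the takeaway on self-training, the sketch below shows the general recipe of fine-tuning a model on its own filtered outputs rather than a stronger teacher’s. It is schematic and hedged: `generate`, `is_grounded`, and `fine_tune` are hypothetical helpers, not the paper’s implementation.

```python
def self_training_round(model, manual_passages, generate, is_grounded, fine_tune):
    """One round of self-training: the model learns from its own answers."""
    train_pairs = []
    for passage in manual_passages:
        # 1. The model drafts a user question answerable from the passage.
        question = generate(model, f"Write one user question answered by:\n{passage}")
        # 2. The same model answers it, with the passage as grounding context.
        answer = generate(model, f"Context:\n{passage}\n\nQuestion: {question}\nAnswer:")
        # 3. Keep only answers supported by the passage; drop hallucinations.
        if is_grounded(answer, passage):
            train_pairs.append((question, answer))
    # 4. Fine-tune on the model's own outputs. Because targets come from the
    #    model's own distribution, exposure bias is lower than when imitating
    #    a stylistically different teacher such as GPT-4 (knowledge distillation).
    return fine_tune(model, train_pairs)
```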