Daily Arxiv

This page collects and organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Bridging Information Gaps with Comprehensive Answers: Improving the Diversity and Informativeness of Follow-Up Questions

Created by
  • Haebom

Author

Zhe Liu, Taekyu Kang, Haoyu Wang, Seyed Hossein Alavi, Vered Shwartz

Outline

This paper tackles the problem of generating diverse follow-up questions that fill information gaps, in the setting of conversational agents built on small, locally run models. To this end, the authors develop an information-gap-based knowledge distillation pipeline: a teacher LLM generates a comprehensive answer, compares it with the initial answer, identifies the information gaps between them, and formulates follow-up questions that fill those gaps. Using this pipeline, they expand the existing FollowupQG dataset by a factor of 10 and fine-tune a small student model on the expanded data to distill the teacher's knowledge.

Experiments on selected teacher-student model pairs show that the fine-tuned student model significantly improves the informativeness and diversity of generated questions compared to a variant trained on the original dataset. This suggests that the pipeline, which mirrors the human cognitive process of information seeking, offers an efficient distillation channel from state-of-the-art LLMs to small models, enabling resource-constrained conversational systems to produce more diverse and information-rich follow-up questions.
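To make the teacher-side pipeline concrete, here is a minimal sketch of the gap-identification and question-generation loop. The function `call_teacher`, the prompt wording, and the output parsing are assumptions for illustration only; the paper does not specify its exact prompts or API.

```python
# Minimal sketch of the information-gap-based follow-up question pipeline.
# `call_teacher` is a hypothetical wrapper around any chat-completion API.

def call_teacher(prompt: str) -> str:
    """Placeholder for a call to the teacher LLM."""
    raise NotImplementedError

def generate_followups(question: str, initial_answer: str) -> list[str]:
    # 1) The teacher produces a comprehensive answer to the original question.
    comprehensive = call_teacher(
        f"Answer the following question as comprehensively as possible:\n{question}"
    )

    # 2) The teacher compares the comprehensive answer with the initial answer
    #    and lists the information gaps (points present only in the former).
    gaps = call_teacher(
        "List the pieces of information that appear in the comprehensive answer "
        f"but are missing from the initial answer.\n\nInitial answer:\n{initial_answer}"
        f"\n\nComprehensive answer:\n{comprehensive}"
    )

    # 3) The teacher turns each identified gap into a follow-up question.
    followups = call_teacher(
        "For each information gap below, write one follow-up question that a "
        f"curious reader might ask to fill it:\n{gaps}"
    )
    return [q.strip() for q in followups.splitlines() if q.strip()]
```

Triples of (question, initial answer, generated follow-up) produced this way would then make up the expanded training set used to fine-tune the student model.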

Takeaways, Limitations

Takeaways:
  • An efficient knowledge distillation pipeline that enables even small, local models to generate diverse and information-rich follow-up questions.
  • High performance achieved through an approach that mimics the human cognitive process of information seeking.
  • Improved training data via a 10x expansion of the existing FollowupQG dataset.
  • A method for effectively transferring knowledge from state-of-the-art LLMs into small models (see the fine-tuning sketch below).
Limitations:
  • Results are reported only for specific teacher-student model pairs, so further research is needed to establish generalizability.
  • Performance may vary depending on the LLMs and datasets used.
  • Further evaluation is needed before application to real-world conversational systems.
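The distillation step itself amounts to standard supervised fine-tuning of a small student on the teacher-generated data. Below is a minimal sketch assuming a Hugging Face seq2seq student; the model name (`google/flan-t5-base`), field names, and hyperparameters are stand-ins, not the paper's actual configuration.

```python
# Minimal sketch: fine-tune a small student on teacher-generated follow-ups.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

MODEL_NAME = "google/flan-t5-base"  # hypothetical student model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Each example pairs a (question, initial answer) context with a
# teacher-generated follow-up question from the expanded dataset.
examples = [{"context": "Q: ... A: ...", "followup": "..."}]
dataset = Dataset.from_list(examples)

def preprocess(batch):
    inputs = tokenizer(batch["context"], truncation=True, max_length=512)
    inputs["labels"] = tokenizer(batch["followup"], truncation=True,
                                 max_length=64)["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="student-followupqg",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

In practice, the training examples would come from the 10x-expanded FollowupQG data produced by the pipeline sketched earlier.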