Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Self-Questioning Language Models

Created by
  • Haebom

Authors

Lili Chen, Mihir Prabhudesai, Katerina Fragkiadaki, Hao Liu, Deepak Pathak

Outline

This paper investigates whether pre-trained language models can improve their reasoning capabilities by generating their own questions and answers, without any external data. The proposed setup provides only a single prompt specifying a topic (e.g., algebra problems) and lets the model generate questions on its own. The authors present Self-Questioning Language Models (SQLM), an asymmetric self-play framework consisting of a proposer (which generates questions) and a solver (which answers them), both trained with reinforcement learning. The proposer is rewarded for generating problems of appropriate difficulty, while the solver is rewarded via majority voting, used as an approximation of correctness when no ground-truth answer is available. For coding problems, the proposer instead generates unit tests, which are used for verification. The framework is evaluated on three benchmarks: three-digit multiplication, algebra problems from the OMEGA benchmark, and programming problems from Codeforces, showing that it can improve language model performance without an external training dataset.
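The two reward signals described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function names, the agreement thresholds, and the use of raw answer strings are all assumptions made for clarity.

```python
from collections import Counter

def solver_rewards(answers):
    """Majority-vote reward for the solver: each sampled answer gets
    reward 1.0 if it matches the most common answer, else 0.0.
    This approximates correctness when no ground truth is available."""
    majority, _ = Counter(answers).most_common(1)[0]
    return [1.0 if a == majority else 0.0 for a in answers]

def proposer_reward(answers, low=0.3, high=0.7):
    """Reward the proposer when a generated question lands at an
    intermediate difficulty: the solver's samples should be neither
    unanimous (too easy) nor fully scattered (too hard).
    The thresholds `low` and `high` are illustrative values."""
    _, count = Counter(answers).most_common(1)[0]
    agreement = count / len(answers)
    return 1.0 if low <= agreement <= high else 0.0

# Example: four solver samples for one self-generated question.
samples = ["12", "12", "15", "12"]
print(solver_rewards(samples))   # majority answer "12" is rewarded
print(proposer_reward(samples))  # 75% agreement: too easy under these thresholds
```

In practice both models would be updated with a policy-gradient method using these scalar rewards; for coding problems, the majority-vote reward would be replaced by pass/fail results on the proposer-generated unit tests.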

Takeaways, Limitations

Takeaways:
  • Shows that the reasoning ability of language models can be improved without external data.
  • Proposes an efficient training method based on a self-play framework.
  • Applicable to multiple problem types (math, coding).
  • Suggests a new paradigm beyond training on massive external datasets.

Limitations:
  • Majority voting is only an approximation of correctness, which limits accuracy.
  • No objective criteria for evaluating the quality of self-generated problems.
  • Generalization to more complex and diverse problem types remains to be verified.
  • Further research is needed on larger-scale experiments and a wider range of models.