Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

VerifiAgent: a Unified Verification Agent in Language Model Reasoning

Created by
  • Haebom

Authors

Jiuzhou Han, Wray Buntine, Ehsan Shareghi

Overview

Large language models exhibit remarkable reasoning capabilities but often generate unreliable or incorrect responses. Existing verification methods are typically model-specific or domain-restricted, require significant computational resources, and do not scale across diverse reasoning tasks. To address these limitations, this paper proposes VerifiAgent, a unified verification agent that combines two levels of verification. Meta-verification assesses the completeness and consistency of model responses, while tool-based adaptive verification lets VerifiAgent autonomously select appropriate verification tools based on the type of reasoning: mathematical, logical, or commonsense. This adaptive approach ensures both efficiency and robustness across diverse verification scenarios. Experimental results show that VerifiAgent outperforms baseline verification methods (e.g., deductive verifiers and backward verifiers) across all reasoning tasks. Feedback from the verification results can also be used to further improve reasoning accuracy. VerifiAgent can additionally be applied to inference scaling, achieving better results with fewer generated samples and at lower cost than existing process reward models in the mathematical reasoning domain. The code is available at https://github.com/Jiuzhouh/VerifiAgent .
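To make the two levels of verification described above more concrete, here is a minimal Python sketch. It is not the authors' implementation: every name in it (`Verdict`, `meta_verify`, `math_tool`, `verify`, `first_verified`, `TOOLS`) is hypothetical, the meta-level check is a toy string rule rather than a model-based assessment, and the task type is passed in by the caller, whereas the paper describes an agent that selects its tools autonomously. The last function illustrates, under the same assumptions, how a verifier might let sampling stop early once a candidate passes.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Verdict:
    passed: bool
    stage: str      # "meta" or "tool": which level produced the verdict
    feedback: str   # short message that could be fed back to the generator


def meta_verify(answer: str) -> Verdict:
    """Toy completeness check (assumption): the response must be
    non-empty and must state a final answer after 'Answer:'."""
    if not answer.strip():
        return Verdict(False, "meta", "empty response")
    if "answer:" not in answer.lower():
        return Verdict(False, "meta", "no final answer stated")
    return Verdict(True, "meta", "response is complete")


# Hypothetical math tool: handles questions of the form "<int> <op> <int>".
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def math_tool(question: str, answer: str) -> bool:
    left, op, right = question.split()
    expected = OPS[op](int(left), int(right))
    claimed = answer.lower().split("answer:")[-1].strip()
    return claimed == str(expected)


# Registry mapping a task type to a verification tool (illustrative only).
TOOLS: Dict[str, Callable[[str, str], bool]] = {"math": math_tool}


def verify(question: str, answer: str, task_type: str,
           tools: Dict[str, Callable[[str, str], bool]]) -> Verdict:
    """Run the meta-level check first, then the tool for this task type."""
    meta = meta_verify(answer)
    if not meta.passed:
        return meta
    tool = tools.get(task_type)
    if tool is None:
        return Verdict(False, "tool", f"no tool registered for {task_type!r}")
    ok = tool(question, answer)
    return Verdict(ok, "tool", "tool check passed" if ok else "tool check failed")


def first_verified(question: str, candidates: List[str], task_type: str,
                   tools: Dict[str, Callable[[str, str], bool]]) -> Optional[str]:
    """Return the first candidate that passes both levels, or None."""
    for candidate in candidates:
        if verify(question, candidate, task_type, tools).passed:
            return candidate
    return None


if __name__ == "__main__":
    print(verify("17 * 3", "17 * 3 = 51. Answer: 51", "math", TOOLS))
    print(first_verified("17 * 3", ["Answer: 50", "", "Answer: 51"], "math", TOOLS))
```

Running the cheap meta-level check before any tool call means that incomplete responses are rejected without paying for tool execution, which is one plausible reading of how a two-level design could keep verification cost low.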

Takeaways, Limitations

Takeaways:
Proposes VerifiAgent, a unified verification agent that combines meta-verification and tool-based adaptive verification.
Provides efficient and robust verification across different types of reasoning.
Outperforms baseline verification methods across reasoning tasks, with verification feedback further improving reasoning accuracy.
Shows that, in mathematical reasoning, inference scaling can be done efficiently, with fewer samples and lower cost than process reward models.
Limitations:
VerifiAgent's performance may depend on the quality of the verification tool used.
There is a need to evaluate adaptability to new types of reasoning or domains.
Further research is needed on scalability and computational costs for large datasets.