Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems

Created by
  • Haebom

Author

Meiru Zhang, Philipp Borchert, Milan Gritta, Gerasimos Lampouras

Outline

A major challenge in automated theorem proving using LLM is identifying and leveraging prior mathematical knowledge and corresponding formal representations in languages like Lean. DRIFT proposes a novel framework that decomposes informal mathematical statements into smaller, more manageable subcomponents. This enables targeted retrieval of premises in math libraries like Mathlib. DRIFT also searches for explanatory theorems, allowing the model to utilize the premises more effectively. DRIFT has been evaluated on various benchmarks, including ProofNet, ConNF, and MiniF2F-test, consistently improving premise retrieval, nearly doubling the F1 score on ProofNet compared to DPR-based methods. In particular, DRIFT demonstrates strong performance on the out-of-distribution ConNF benchmark, achieving BEq+@10 improvements of 37.14% and 42.25% over GPT-4.1 and DeepSeek-V3.1, respectively.

Takeaways, Limitations

Takeaways:
Enhancing the premise retrieval capability of LLM through the DRIFT framework, which decomposes informal mathematical statements into subcomponents.
Demonstrates improved performance compared to existing methodologies in various benchmarks.
In particular, it achieved high performance in out-of-distribution benchmarks.
Suggests the need for an adaptive search strategy tailored to the model-specific knowledge boundaries.
Limitations:
Performance may vary depending on the model's knowledge boundary.
Failure to fully address the knowledge dependence of specific models for automatic theorem proofs.
👍