Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration

Created by
  • Haebom

Author

Victor May, Diganta Misra, Yanqi Luo, Anjali Sridhar, Justine Gehring, Silvio Soares Ribeiro Junior

Outline

AI coding assistants have become essential to software development, and migrating and modernizing codebases to keep pace with the evolving software ecosystem is crucial. This paper introduces FreshBrew, a new benchmark for project-level Java migration, to systematically evaluate the effectiveness of AI agents at this task. The authors benchmarked several state-of-the-art LLMs across 228 repositories and compared them with existing rule-based tools. The results show that the best-performing model, Gemini 2.5 Flash, successfully migrated 52.3% of projects to JDK 17. This provides new insights into the strengths and limitations of current agentic approaches and establishes a foundation for evaluating reliable code-migration systems.
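To make the task concrete, below is an illustrative sketch (not taken from the FreshBrew benchmark or the paper) of the kind of source-level rewrite a JDK 17 migration agent might apply: replacing a legacy instanceof-plus-cast idiom with pattern matching for `instanceof`, which was finalized in JDK 16. The class and method names here are hypothetical.

```java
// Hypothetical example of a behavior-preserving JDK 17 modernization.
public class MigrationExample {

    // Pre-JDK-16 style: explicit cast after the instanceof test.
    static String describeLegacy(Object obj) {
        if (obj instanceof String) {
            String s = (String) obj;
            return "string of length " + s.length();
        }
        return "not a string";
    }

    // JDK 16+ pattern matching for instanceof: the cast and the
    // temporary variable are folded into the type test itself.
    static String describeModern(Object obj) {
        if (obj instanceof String s) {
            return "string of length " + s.length();
        }
        return "not a string";
    }

    public static void main(String[] args) {
        // A safe migration must keep both versions behaviorally identical.
        System.out.println(describeLegacy("brew"));
        System.out.println(describeModern("brew"));
    }
}
```

A benchmark like FreshBrew checks migrations at the project level (the whole repository must still build and pass its tests on the target JDK), not just per-snippet rewrites like this one.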

Takeaways, Limitations

State-of-the-art AI agents achieved success rates above 50% on project-level Java migration.
Among the evaluated models, Gemini 2.5 Flash performed best, migrating 52.3% of projects to JDK 17.
The FreshBrew Benchmark provides rigorous and reproducible evaluations for AI-powered codebase modernization research.
The authors identified common failure modes of AI agents on modernization tasks.
The benchmark reveals the limitations of current AI agents in real-world Java modernization efforts.