Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

VeriCoder: Enhancing LLM-Based RTL Code Generation through Functional Correctness Validation

Created by
  • Haebom

Author

Anjiang Wei, Huanmi Tan, Tarun Suresh, Daniel Mendoza, Thiago S. F. X. Teixeira, Ke Wang, Caroline Trippel, Alex Aiken

Outline

This paper applies large language models (LLMs) to electronic design automation (EDA), specifically register-transfer level (RTL) code generation. Existing RTL datasets focus on syntactic validity and lack functional verification; VeriCoder addresses this gap as an RTL code generation model fine-tuned on a dataset verified for functional correctness. Using a methodology that combines unit test generation with feedback-driven refinement, the authors build a dataset of 125,777 functionally verified examples, each consisting of a natural-language specification, an RTL implementation, and passing tests. A GPT-4o-mini teacher model generates the unit tests, and the RTL design is iteratively refined based on simulation results. VeriCoder achieves state-of-the-art functional correctness on VerilogEval and RTLLM, with relative improvements of up to 71.7% and 27.4% over prior models. Further experiments show that models trained on the functionally verified dataset outperform those trained on an unverified one. The code, data, and models are publicly available.
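To make the verification pipeline concrete, below is a minimal Python sketch of the test-generation-plus-refinement loop described above. This is not the authors' code: the `llm` stub, the `MAX_ITERS` budget, the Icarus Verilog (`iverilog`/`vvp`) toolchain, and the "ALL TESTS PASSED" output convention are all illustrative assumptions.

```python
import os
import subprocess
import tempfile

MAX_ITERS = 3  # hypothetical refinement budget; iterate until tests pass or the cap is hit


def llm(prompt: str) -> str:
    """Placeholder for a teacher-model call (e.g., GPT-4o-mini via an API client).
    Plug in your own client; this stub raises to keep the sketch self-contained."""
    raise NotImplementedError("supply an LLM client here")


def simulate(design_src: str, test_src: str) -> tuple[bool, str]:
    """Compile and run the design against a generated testbench with Icarus Verilog.
    Returns (passed, simulator_log). Assumes iverilog/vvp are on PATH."""
    with tempfile.TemporaryDirectory() as tmp:
        design = os.path.join(tmp, "design.v")
        tb = os.path.join(tmp, "tb.v")
        out = os.path.join(tmp, "sim.out")
        with open(design, "w") as f:
            f.write(design_src)
        with open(tb, "w") as f:
            f.write(test_src)
        build = subprocess.run(["iverilog", "-o", out, design, tb],
                               capture_output=True, text=True)
        if build.returncode != 0:
            return False, build.stderr  # syntax/elaboration failure is also feedback
        run = subprocess.run(["vvp", out], capture_output=True, text=True)
        # Assumed convention: the testbench prints "ALL TESTS PASSED" on success.
        return "ALL TESTS PASSED" in run.stdout, run.stdout


def verify_example(spec: str, rtl: str) -> str | None:
    """Feedback-driven refinement: generate unit tests from the spec, then
    iteratively repair the RTL using simulation feedback. Returns verified RTL
    (spec, RTL, and passing tests form one dataset example) or None on failure."""
    tests = llm(f"Write a Verilog testbench with unit tests for this spec:\n{spec}")
    for _ in range(MAX_ITERS):
        passed, log = simulate(rtl, tests)
        if passed:
            return rtl  # functionally verified: keep (spec, rtl, tests)
        rtl = llm(f"Spec:\n{spec}\nRTL:\n{rtl}\n"
                  f"Simulation failures:\n{log}\nFix the RTL.")
    return None  # examples that never pass are discarded
```

Discarding examples that never pass (rather than keeping best-effort repairs) is what distinguishes a functionally verified dataset from a merely syntactically valid one.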

Takeaways, Limitations

Takeaways:
  • Highlights the importance of high-quality RTL datasets verified for functional correctness.
  • Presents a methodology for building such datasets through unit test generation and feedback-driven refinement.
  • Presents the VeriCoder model, which achieves state-of-the-art performance on VerilogEval and RTLLM.
  • Improves the reproducibility and scalability of research by publicly releasing the data and models.
Limitations:
  • The current dataset may be relatively small compared to larger, unverified datasets.
  • The pipeline relies heavily on a powerful teacher model (GPT-4o-mini).
  • Generalization to complex RTL designs requires further research.
  • The dataset may be biased toward certain types of RTL designs.