Daily Arxiv

This page collects papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

QSpark: Towards Reliable Qiskit Code Generation

Created by
  • Haebom

Authors

Kiana Kheiri, Aamna Aamir, Andriy Miranskyy, Chen Ding

Outline

This paper explores fine-tuning a large language model (LLM) with reinforcement learning (RL) to generate more reliable, error-free Qiskit code for quantum circuits. To address the problem that existing code LLMs such as Granite-20B-Code and StarCoder often produce erroneous Qiskit code, the authors fine-tuned the Qwen2.5-Coder-32B model on a richly annotated synthetic dataset using two RL methods: Group Relative Policy Optimization (GRPO) and Odds Ratio Preference Optimization (ORPO). Experimental results show that the ORPO-tuned model reaches 56.29% Pass@1 on the Qiskit HumanEval benchmark, approximately 10% higher than Granite-8B-QK, while the GRPO-tuned model reaches 49%. Although both models outperform general-purpose baselines, they still fall short on the most difficult tasks.
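
For context on the reported metric: benchmarks in the HumanEval family typically score models with the unbiased pass@k estimator of Chen et al. (2021), where each problem is sampled n times and c of those samples pass the unit tests. The sketch below is a minimal illustration of that computation under this assumption; the function name and the sample counts are illustrative and not taken from the paper.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: completions sampled for a problem
    c: completions that pass the problem's unit tests
    k: the k in pass@k
    """
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers only, not from the paper: (samples, passing) per task.
results = [(10, 4), (10, 9), (10, 6)]
pass_at_1 = sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
print(f"Pass@1 = {pass_at_1:.2%}")  # mean over benchmark tasks
```

Note that for k = 1 the estimator reduces to the plain pass rate c/n per problem, averaged over the benchmark.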

Takeaways, Limitations

Takeaways:
• Reinforcement learning can improve the quantum programming performance of LLMs.
• Models fine-tuned with GRPO and ORPO outperform general-purpose LLMs on quantum programming tasks.
• GRPO in particular performs well on basic and intermediate-level tasks.
Limitations:
• The models still struggle with difficult quantum programming problems.
• Because training relied on a synthetic dataset, performance in real quantum programming environments requires further validation.