Daily Arxiv

This page curates AI-related papers from around the world.
All summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents

Created by
  • Haebom

Authors

Manh Hung Nguyen, Victor-Alexandru Pădurean, Alkis Gotovos, Sebastian Tschiatschek, Adish Singla

Outline

This paper investigates how generative AI can provide high-quality programming tasks to students. Tasks produced by existing generative approaches are often low in quality: they can be hard for students to understand or can contain errors. To address these issues, the authors present PyTaskSyn, a novel synthesis technique. PyTaskSyn simulates expert and student agents using strong and weak generative models, and produces high-quality programming tasks by passing each generated task through a multi-step verification process. Experiments show that PyTaskSyn significantly improves task quality over existing techniques. A user study conducted with a public web application shows that PyTaskSyn delivers tasks of comparable quality to those designed by experts. The authors also report reduced workload and costs and increased student engagement.
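The generate-then-verify idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Task` structure, the two verification stages, and every function name here are assumptions made for illustration, and the expert and student agents are plain Python stubs standing in for calls to strong and weak generative models.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical representation: (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


@dataclass
class Task:
    """Hypothetical container for a generated programming task."""
    description: str
    reference_solution: Callable
    tests: List[TestCase]


def passes_tests(solution: Callable, tests: List[TestCase]) -> bool:
    """Return True if the solution returns the expected value on every test."""
    for args, expected in tests:
        try:
            if solution(*args) != expected:
                return False
        except Exception:
            return False
    return True


def synthesize_task(
    generate: Callable[[str], Task],
    student_solve: Callable[[str], Callable],
    concept: str,
    max_attempts: int = 3,
) -> Optional[Task]:
    """Generate a task and accept it only if it passes both checks.

    Assumed stages (illustrative, not taken from the paper):
      1. Expert-side check: the tests agree with the reference solution.
      2. Student-side check: a simulated student who sees only the
         description produces a solution that passes the tests.
    """
    for _ in range(max_attempts):
        task = generate(concept)
        if not passes_tests(task.reference_solution, task.tests):
            continue  # tests or reference solution are inconsistent
        student_solution = student_solve(task.description)
        if not passes_tests(student_solution, task.tests):
            continue  # description was not solvable by the simulated student
        return task
    return None


# Stub agents standing in for generative models (assumptions for the demo).
def stub_expert(concept: str) -> Task:
    return Task(
        description="Write sum_list(xs) that returns the sum of a list of numbers.",
        reference_solution=lambda xs: sum(xs),
        tests=[(([1, 2, 3],), 6), (([],), 0)],
    )


def stub_student(description: str) -> Callable:
    def attempt(xs):
        total = 0
        for x in xs:
            total += x
        return total
    return attempt


def confused_student(description: str) -> Callable:
    # Misreads the task and returns the list length instead of the sum.
    return lambda xs: len(xs)


accepted = synthesize_task(stub_expert, stub_student, "loops")
rejected = synthesize_task(stub_expert, confused_student, "loops")
print(accepted is not None, rejected is None)
```

In this sketch a task is released only when both checks succeed, so a task whose description misleads the simulated student is discarded rather than handed to real students. In a real system the stubs would be replaced by model calls, and the acceptance criteria would likely be richer than exact test matching.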

Takeaways, Limitations

Takeaways:
The paper presents a new method for automatically generating high-quality programming tasks with generative AI.
PyTaskSyn improves task quality over existing generative-AI techniques.
Simulating expert and student agents yields an effective verification pipeline.
The reported results show reduced workload and costs and increased student engagement.
A public web application demonstrates the method's practical applicability.
Limitations:
The performance of PyTaskSyn may depend on the performance of the generative model used.
Further research is needed to determine generalizability across different programming languages and education levels.
The scale of the user study and the characteristics of its participants may limit how well the findings generalize.
The definition of, and criteria for, the expert agents need to be stated more clearly.