
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle

Created by
  • Haebom

Authors

Mihran Miroyan, Rose Niousha, Joseph E. Gonzalez, Gireeja Ranade, Narges Norouzi

Outline

This paper presents ParaStudent, a study of whether large language models (LLMs) can generate code the way real students do: incomplete, repetitive, and stylistically varied. Using a dataset of student code submissions collected over several semesters, we design low- and high-resolution experiments to model student progress and evaluate the generated code along semantic, functional, and stylistic dimensions. We show that fine-tuning captures real students' coding trajectories more faithfully, including their error patterns, incremental improvements, and stylistic changes. In conclusion, realistic modeling of student code requires capturing learning dynamics through context-aware generation, temporal modeling, and multidimensional evaluation. The experiment and evaluation code is available at https://github.com/mmiroyan/ParaStudent.
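To make the context-aware, temporal setup concrete, here is a minimal sketch of how a student's successive submissions might be turned into fine-tuning examples, where each training target is the student's next attempt given the assignment and the previous attempt. The chat-style layout, field names, and the toy assignment are illustrative assumptions, not the actual format used in the ParaStudent repository.

```python
# Minimal sketch: turning one student's successive submissions into
# context-aware fine-tuning examples. The chat-style schema and field
# names are illustrative assumptions, not ParaStudent's actual format.
import json

def build_examples(assignment_prompt, attempts):
    """attempts: list of code strings ordered by submission time."""
    examples = []
    for i in range(1, len(attempts)):
        examples.append({
            "messages": [
                {"role": "system",
                 "content": "You are a student working on an introductory programming assignment."},
                {"role": "user",
                 "content": f"Assignment:\n{assignment_prompt}\n\n"
                            f"Your previous attempt:\n{attempts[i - 1]}\n\n"
                            "Revise your code for the next submission."},
                # Target: the student's actual next attempt, so the model
                # learns incremental (often still imperfect) progress.
                {"role": "assistant", "content": attempts[i]},
            ]
        })
    return examples

if __name__ == "__main__":
    attempts = [
        "def total(xs):\n    return sum(x)",   # early buggy attempt
        "def total(xs):\n    return sum(xs)",  # later corrected attempt
    ]
    for ex in build_examples("Write total(xs) that sums a list.", attempts):
        print(json.dumps(ex, indent=2))
```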

Takeaways, Limitations

Takeaways:
Demonstrates that LLMs can realistically mimic the code-writing process of real students.
Shows that fine-tuning captures error patterns, incremental improvements, and stylistic changes in student code more accurately.
Emphasizes the importance of context-aware generation, temporal modeling, and multidimensional evaluation for realistic student code modeling (a rough evaluation sketch follows after the Limitations list).
Limitations:
The dataset is limited to a single introductory programming course, so further research on generalizability is needed.
Further validation of the objectivity and reliability of the multidimensional evaluation criteria is needed.
It may still be difficult for an LLM to perfectly mimic a student's own thought process.
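As a rough illustration of what a multidimensional check could look like, the sketch below scores a submission on a functional axis (fraction of test cases passed) and a crude stylistic axis (surface statistics). The entry-point name, test cases, and style features are illustrative assumptions; the paper's actual semantic and stylistic metrics are more involved.

```python
# Minimal sketch of a two-axis evaluation: functional correctness via
# test cases and a crude style profile via surface statistics. The
# entry-point name "total" and the features below are assumptions.

def functional_score(code_str, test_cases):
    """Fraction of (args, expected) test cases the submission passes."""
    namespace = {}
    try:
        exec(code_str, namespace)   # run the student code
        fn = namespace["total"]     # assumed entry-point name
    except Exception:
        return 0.0
    passed = 0
    for args, expected in test_cases:
        try:
            if fn(*args) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(test_cases)

def style_profile(code_str):
    """Very rough stylistic fingerprint of a submission."""
    lines = [ln for ln in code_str.splitlines() if ln.strip()]
    n = max(len(lines), 1)
    return {
        "num_lines": len(lines),
        "avg_line_len": sum(len(ln) for ln in lines) / n,
        "comment_ratio": sum(ln.lstrip().startswith("#") for ln in lines) / n,
    }

if __name__ == "__main__":
    submission = "def total(xs):\n    return sum(x)"  # buggy: NameError at call time
    tests = [(([1, 2, 3],), 6), (([],), 0)]
    print(functional_score(submission, tests))        # 0.0
    print(style_profile(submission))
```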