Daily Arxiv

This page curates papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper remains with its authors and their institutions; please cite the source when sharing.

Evaluating undergraduate mathematics examinations in the era of generative AI: a curriculum-level case study

Created by
  • Haebom

Author

Benjamin J. Walker, Nikoleta Kalaydzhieva, Beatriz Navarro Lameda, Ruth A. Reynolds

Outline

This study examines the impact of generative AI (GenAI) tools, such as OpenAI's ChatGPT, on traditional assessment practices in mathematics. As universities consider remote examinations, concerns about academic integrity and educational alignment arise, so the study asks whether traditional closed-ended mathematics examinations retain their educational relevance when taken unsupervised with access to GenAI. To this end, the authors empirically generated, transcribed, and blind-marked GenAI responses to eight undergraduate mathematics examinations spanning the entire first-year curriculum at a Russell Group university. GenAI performance was evaluated both at the module level and across the overall first-year curriculum, and was found to be comparable to that of a first-year degree student.
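For context, the curriculum-level evaluation amounts to aggregating blind-marked module scores into a single overall first-year mark. The minimal sketch below illustrates one way such an aggregation could be computed; the module names, marks, equal weighting, and UK-style classification boundaries are all illustrative assumptions, not values or methods taken from the paper.

```python
# Illustrative sketch only: aggregating module-level marks into a
# curriculum-level mark. All names, marks, weights, and boundaries are
# hypothetical placeholders, not results reported in the paper.

MODULE_MARKS = {            # hypothetical blind-marked scores (%)
    "Module A": 68.0,
    "Module B": 74.0,
    "Module C": 81.0,
    "Module D": 59.0,
    "Module E": 65.0,
    "Module F": 70.0,
    "Module G": 62.0,
    "Module H": 77.0,
}

# Typical UK classification boundaries, used here only for illustration.
BOUNDARIES = [
    (70.0, "First"),
    (60.0, "Upper second (2:1)"),
    (50.0, "Lower second (2:2)"),
    (40.0, "Third"),
]


def curriculum_mark(marks: dict) -> float:
    """Equal-weighted mean of module marks (one possible aggregation rule)."""
    return sum(marks.values()) / len(marks)


def classify(mark: float) -> str:
    """Map an overall mark to a classification band."""
    for threshold, label in BOUNDARIES:
        if mark >= threshold:
            return label
    return "Fail"


if __name__ == "__main__":
    overall = curriculum_mark(MODULE_MARKS)
    print(f"Overall first-year mark: {overall:.1f}% -> {classify(overall)}")
```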

Takeaways, Limitations

Takeaways:
GenAI achieved a substantial level of performance on undergraduate mathematics examinations.
The way mathematics is assessed in unsupervised settings needs to be redesigned.
In the era of generative AI, the educational value of current assessment formats may diminish.
Limitations:
Specific limitations are not stated in the paper.