Concerns about using AI to grade student work
콘텐주
1. Lack of understanding and qualitative judgment: GenAI generates probabilistic outputs based on training-data patterns and lacks true understanding and judgment.
2. Inconsistency and bias in grading: scores for the same piece of work can vary greatly, compromising fairness and reliability.
3. Potential for social prejudice and discrimination: biases in the training data may be reflected in the model and disadvantage certain groups.
4. Worsening equity gaps: unequal access to AI tools could widen the achievement gap.
5. Differences in accessibility between educational institutions: a gap may develop between institutions that can afford more powerful models and those that cannot.
6. No genuine comprehension: AI generates output from patterns in its training data, evaluating work without truly understanding it.
7. Issues of fairness, accountability, and transparency: the apparent objectivity of AI-generated grades can obscure gaps and impede fair evaluation.
8. Differences between humans and AI: AI cannot replace the judgment and expertise of human educators and should instead focus on assisting them.
9. Rethinking evaluation methods: we need to reconsider the purpose and methods of grading and explore other ways for students to demonstrate learning.
Generative AI (GenAI) may seem appealing for its potential to reduce the workload of assessment and reporting. As a former secondary English teacher, senior assessment evaluator, and initial teacher education instructor, I understand the appeal of using GenAI to grade student work. However, I am convinced that this technology is fundamentally unsuited to high-stakes student assessment.
At its core, GenAI generates probabilistic outputs based on patterns in its training data; it lacks true understanding and qualitative judgment. This leads to inconsistency and bias in grading, raising serious concerns about fairness and reliability. When the same ninth-grade persuasive writing assignment was fed into ChatGPT multiple times, the scores varied significantly, from 78 to 95, simply when the student's name was changed. Such inconsistency fundamentally undermines the fairness and reliability of the grading process.
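This instability can be illustrated with a toy simulation. The sketch below does not call ChatGPT; the grading function, the baseline score, and the noise range are all invented for illustration. It simply models the key property: a probabilistic grader produces a different score each run, even for a byte-identical essay.

```python
import random

def simulated_llm_grade(essay: str, seed: int) -> int:
    # Toy stand-in for a probabilistic grader: the score depends on
    # surface features of the text plus per-run sampling noise, not
    # on the essay's meaning. (Illustrative only.)
    rng = random.Random(sum(map(ord, essay)) + seed)
    base = 85                      # hypothetical underlying quality
    return base + rng.randint(-8, 8)

essay = "Persuasive essay arguing for later school start times..."
scores = [simulated_llm_grade(essay, seed=s) for s in range(5)]
print(scores)                      # the identical essay, five separate runs
print(max(scores) - min(scores))   # spread between runs
# Note also that changing even one character of the input (such as a
# student's name) reseeds the process and can shift every score.
```

The point of the sketch is that the spread between runs is a property of sampling itself, not of the essay; no amount of prompt polish removes it, only masks it.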
What is even more concerning is that GenAI models are trained on massive datasets collected from the internet, which can encode social bias and discrimination. From a student's writing, a model may infer attributes such as race, gender, and background, which could disadvantage certain groups. These biases are much harder to detect and address than those of human graders. For human assessment we have strategies like anonymization and moderation, but for AI, simply removing a student's name is not enough: the biases are embedded far deeper in the model, in its training data and the patterns it has learned.
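The limit of anonymization can be made concrete with a minimal sketch. Everything here is invented for illustration (the roster, the essay text, the redaction scheme): the snippet strips names, but the proxy signals that a model may have learned to associate with demographic groups survive untouched.

```python
import re

# Hypothetical class roster; longer names listed first so the regex
# alternation matches them before their shorter substrings.
ROSTER = ["Jordan Lee", "Jordan", "Priya"]

def anonymize(text: str) -> str:
    # Naive anonymization: redact known student names. This removes
    # only the most superficial identity signal in the text.
    pattern = re.compile("|".join(re.escape(n) for n in ROSTER))
    return pattern.sub("[STUDENT]", text)

essay = ("My name is Jordan. My essay is about my neighborhood, "
         "my church choir, and the bus I take to school.")
clean = anonymize(essay)
print(clean)
# Topic choices, dialect, and phrasing survive redaction, and a model
# may have learned to correlate such proxies with group membership.
```

With human graders, blind marking removes most of what the grader can see; with an LLM, the correlations are baked into the weights, so redacting the input cannot undo them.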
Using AI for grading also has the potential to exacerbate existing equity gaps in education. Access to AI tools is far from equal: students from low-income families rely on basic free tools, while wealthier students can pay for subscriptions to advanced models such as GPT-4. Students with better tools can generate, iterate on, and polish higher-quality work, and may even game safeguards such as AI-detection tools. Over time, this can widen achievement gaps and reinforce privilege.
It is also important to consider that schools, universities, and educators have access to different tiers of GenAI. Consider what happens to a student whose work is evaluated by GPT-4 rather than GPT-3.5. So far, efforts to "de-bias" AI have been only moderately successful, and mostly in the more powerful models: GPT-4, while not perfect, is less biased than GPT-3.5. Educators and institutions with the financial and technological resources to use more powerful models will therefore be able to provide more sophisticated, and potentially less biased, feedback.
Fundamentally, using large language models (LLMs) to assess student work is problematic, no matter how sophisticated the prompts or inputs. LLMs produce probabilistic outputs based on patterns in their training data, without genuine inference or understanding. This holds even when the inputs include detailed rubrics, specific grading criteria, or annotated sample student work. Such contextual elements can help anchor an LLM's responses and make them more persuasive, but the output remains a product of statistical inference rather than true understanding.
The use of AI in high-stakes assessments also raises concerns about fairness, accountability, and transparency. Human graders can exhibit bias, but they can be trained to recognize and mitigate it. In contrast, bias in LLMs is inherent in the training data and architecture, and can be more difficult to identify and address. The appearance of objectivity in AI-generated grades can mask fundamental gaps and hinder efforts to ensure fair assessment.
Using LLMs to assess student work is a misuse of the technology that fails to recognize the fundamental differences between humans and AI. Rather than trying to automate assessment, we should focus on using AI to support and enhance human educators, while preserving the essential role of human judgment and expertise in assessing student learning.
We need to start by questioning why we grade in the first place. Are assessments and grading simply a matter of competition, ranking, and placement? Or, as TEQSA recently asked in relation to the use of AI tools in assessments, are there other ways for students to demonstrate learning, and other ways for educators to assess whether students are learning?
At the very least, generative AI is forcing us to have these difficult and sometimes uncomfortable conversations.
#ChatGPT #StudentEvaluation #AIBias #EducationGap #RethinkEvaluationMethods