Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Assessing GPTZero's Accuracy in Identifying AI vs. Human-Written Essays

Created by
  • Haebom

Authors

Selin Dik, Osman Erdem, Mehmet Dik

Outline

As students increasingly use AI tools, AI detection tools such as GPTZero and QuillBot are also seeing wider use. This study evaluates the reliability of GPTZero, described as the most widely used AI detection tool. The authors measured GPTZero's success rate in identifying AI-generated text by randomly submitting essays of three lengths: short (40-100 words), medium (100-350 words), and long (350-800 words). Using a dataset of 28 AI-generated essays and 50 human-written essays, they found that GPTZero detected most of the AI-generated essays accurately (91-100% judged as AI-generated), but produced some false positives on the human-written essays. This suggests that GPTZero is effective at detecting purely AI-generated content but has limitations on human-written text, where it can wrongly flag human writing as AI-generated.
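The evaluation described above comes down to two rates per length bucket: the share of AI-generated essays flagged as AI (detection rate) and the share of human-written essays flagged as AI (false-positive rate). The following is a minimal Python sketch of that bookkeeping, not the authors' code. The function names, the record layout, and the toy records are all hypothetical, and because the summary's length ranges share their endpoints (100 and 350 words), placing those endpoints in the longer bucket is an assumption.

```python
from collections import defaultdict

# Length buckets from the summary, in words. The source ranges share their
# endpoints (100 and 350); assigning those to the longer bucket is an assumption.
BUCKETS = [("short", 40, 100), ("medium", 100, 350), ("long", 350, 801)]


def length_bucket(word_count):
    """Return the bucket name for a word count, or None if out of range."""
    for name, low, high in BUCKETS:
        if low <= word_count < high:
            return name
    return None


def rates_by_bucket(records):
    """records: iterable of (word_count, is_ai, flagged_as_ai) tuples.

    Returns {bucket: {"detection_rate": ..., "false_positive_rate": ...}},
    with None where a bucket has no essays of the relevant kind.
    """
    counts = defaultdict(
        lambda: {"ai": 0, "ai_flagged": 0, "human": 0, "human_flagged": 0}
    )
    for word_count, is_ai, flagged in records:
        bucket = length_bucket(word_count)
        if bucket is None:
            continue  # skip essays outside the 40-800 word range
        kind = "ai" if is_ai else "human"
        counts[bucket][kind] += 1
        if flagged:
            counts[bucket][kind + "_flagged"] += 1

    result = {}
    for bucket, c in counts.items():
        result[bucket] = {
            "detection_rate": c["ai_flagged"] / c["ai"] if c["ai"] else None,
            "false_positive_rate": (
                c["human_flagged"] / c["human"] if c["human"] else None
            ),
        }
    return result


# Made-up records for illustration only; these are not the study's data.
toy_records = [
    (60, True, True),
    (80, False, False),
    (200, True, True),
    (250, True, False),
    (300, False, True),
    (500, False, False),
]
toy_rates = rates_by_bucket(toy_records)
print(toy_rates)
```

Reporting the two rates separately matters here because a detector can score well on AI-generated essays while still misclassifying human-written ones, which is the pattern the summary describes.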

Takeaways, Limitations

Takeaways: GPTZero effectively detects purely AI-generated text.
Takeaways: GPTZero is less reliable on human-written text and can produce false positives, flagging human writing as AI-generated.
Takeaways: Evaluators should combine several assessment methods rather than rely solely on AI detection tools.
Limitations: The study examined only one AI detection tool, GPTZero, and did not compare its performance with other tools.
Limitations: The sample size (28 AI-generated and 50 human-written essays) may be limited. Research using larger datasets is needed.
Limitations: The analysis was based solely on essay length and did not take into account the influence of other factors (topic, style, etc.).