Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Exploring the Application of Visual Question Answering (VQA) for Classroom Activity Monitoring

Created by
  • Haebom

Authors

Sinh Trong Vu, Hieu Trung Pham, Dung Manh Nguyen, Hieu Minh Hoang, Nhu Hoang Le, Thu Ha Pham, Tai Tan Mai

Outline

This paper investigates how well state-of-the-art open-source visual question answering (VQA) models, such as LLaMA2, LLaMA3, QWEN3, and NVILA, apply to classroom behavior analysis. It uses the BAV-Classroom-VQA dataset, which is derived from real-world classroom video recordings from the Banking Academy of Vietnam. The study presents its data collection and annotation methodology and benchmarks the selected models, reporting promising performance on behavior-related visual questions and pointing to their potential as a basis for future classroom analysis and intervention systems.
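The summary does not say which metric the paper uses for its benchmark. As a rough illustration of how a VQA benchmark of this kind is often scored, the sketch below computes exact-match accuracy per question category. The record schema (category, answer, prediction), the category names, and the sample records are all hypothetical and are not taken from the BAV-Classroom-VQA dataset or the paper.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def accuracy_by_category(records):
    """Return exact-match accuracy for each question category.

    `records` is an iterable of dicts with the hypothetical keys
    'category', 'answer' (ground truth), and 'prediction' (model output).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        totals[record["category"]] += 1
        if normalize(record["prediction"]) == normalize(record["answer"]):
            hits[record["category"]] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Made-up records for illustration only; not real dataset entries.
records = [
    {"category": "counting", "answer": "3", "prediction": "3"},
    {"category": "counting", "answer": "2", "prediction": "4"},
    {"category": "behavior", "answer": "Yes", "prediction": " yes "},
]
scores = accuracy_by_category(records)
```

Exact match is a strict criterion; open-ended answers usually call for a softer comparison, which this sketch does not attempt.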

Takeaways, Limitations

Takeaways:
The study demonstrates that state-of-the-art VQA models can be effectively applied to classroom behavior analysis.
The BAV-Classroom-VQA dataset can be a valuable resource for classroom behavior analysis research.
The findings can contribute to the development of future classroom analysis and intervention systems.
Limitations:
To date, only initial experimental results have been presented, and more extensive and in-depth experiments are needed.
The dataset may be limited in size and diversity.
Factors that can degrade model performance, such as lighting and camera angle, need further consideration.
Additional verification and supplementation are needed for application in actual educational settings.