Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.

Dual-Stage Reweighted MoE for Long-Tailed Egocentric Mistake Detection

Created by
  • Haebom

Authors

Boyu Han, Qianqian Xu, Shilong Bao, Zhiyong Yang, Sicong Li, Qingming Huang

Outline

This paper addresses the problem of detecting mistakes in users' actions from egocentric video data. To handle subtle and rare mistakes, the authors propose a Dual-Stage Reweighted Mixture-of-Experts (DR-MoE) framework. In the first stage, features are extracted by a frozen ViViT model and a LoRA-tuned ViViT model and combined through a feature-level expert module. In the second stage, three classifiers are trained: one with reweighted cross-entropy to mitigate class imbalance, one with an AUC loss to improve ranking under skewed distributions, and one with a label-aware loss plus sharpness-aware minimization to improve calibration and generalization. Their predictions are fused by a class-level expert module. The proposed method performs particularly well at identifying rare and ambiguous mistakes.
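The two-stage design above can be sketched in a few lines. The following is a minimal, illustrative NumPy mock-up, not the authors' implementation: the shapes, gating networks, and weight matrices are all placeholders standing in for the frozen/LoRA-tuned ViViT features, the three specialized classifier heads, and the two expert-mixing modules; the reweighted cross-entropy uses a common inverse-frequency weighting as an assumed example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# --- Stage 1: feature-level expert mixing --------------------------------
# Stand-ins for clip features from a frozen ViViT and a LoRA-tuned ViViT.
feat_frozen = rng.normal(size=(4, 16))   # (batch, feature_dim)
feat_lora   = rng.normal(size=(4, 16))

# A gating network over the concatenated features picks per-sample weights
# for the two feature experts (illustrative random weights).
gate_w = rng.normal(size=(32, 2))
gates = softmax(np.concatenate([feat_frozen, feat_lora], axis=1) @ gate_w)
fused = gates[:, :1] * feat_frozen + gates[:, 1:] * feat_lora  # (4, 16)

# --- Stage 2: three classifiers + class-level expert fusion --------------
# In the paper each head is trained with a different objective (reweighted
# CE / AUC loss / label-aware loss with SAM); here they are just three
# placeholder linear heads sharing the fused features.
n_classes = 3
heads = [rng.normal(size=(16, n_classes)) for _ in range(3)]
logits = np.stack([fused @ W for W in heads])       # (3, batch, classes)

# Class-level expert module: mix the three heads' logits.
class_gate = softmax(rng.normal(size=(3,)))
final_logits = np.einsum('e,ebc->bc', class_gate, logits)
probs = softmax(final_logits)

# Reweighted cross-entropy: inverse-frequency class weights so that rare
# (long-tailed) mistake classes contribute more to the loss.
labels = np.array([0, 1, 2, 0])
counts = np.bincount(labels, minlength=n_classes)
weights = counts.sum() / (n_classes * np.maximum(counts, 1))
ce = -(weights[labels] * np.log(probs[np.arange(len(labels)), labels])).mean()
```

The key structural point the sketch shows is that mixing happens twice: once over feature extractors before classification, and once over classifier outputs before the final prediction.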

Takeaways, Limitations

Takeaways:
The DR-MoE framework effectively detects mistakes in user actions from egocentric video data.
It is robust at identifying rare and ambiguous mistake cases.
Combining a frozen ViViT model with LoRA tuning improves the efficiency of feature extraction.
Using multiple complementary loss functions and expert modules improves performance.
Limitations:
The summary lacks details on the specific experimental results and datasets.
No information is given about the computational complexity of the proposed method.
There is no comparative analysis against existing methods.