Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

SlotMatch: Distilling Temporally Consistent Object-Centric Representations for Unsupervised Video Segmentation

Created by
  • Haebom

Authors

Diana-Nicoleta Grigore, Neelu Madan, Andreas Mogelmose, Thomas B. Moeslund, Radu Tudor Ionescu

Outline

This paper proposes SlotMatch, a knowledge distillation framework that transfers object-centric representations from a teacher to a lightweight student model for unsupervised video segmentation. Existing slot attention-based models are computationally expensive; to address this, SlotMatch aligns teacher and student slots using cosine similarity and requires no additional distillation objectives or auxiliary supervision. Theoretical and experimental evidence shows that integrating additional loss functions is unnecessary. In experiments, the SlotMatch-based student model matches or exceeds its teacher, SlotContrast (the best-performing teacher model), while using 3.6x fewer parameters and running 1.9x faster. It also outperforms existing unsupervised video segmentation models.
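The cosine-similarity alignment described above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the pairing of teacher slot k with student slot k, and the use of 1 minus the mean similarity as the loss are all assumptions made for illustration, and a real model would compute this on batched tensors rather than Python lists.

```python
import math
from typing import Sequence

Slot = Sequence[float]


def cosine_similarity(a: Slot, b: Slot, eps: float = 1e-8) -> float:
    """Cosine similarity between two slot vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("slot vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero slots.
    return dot / max(norm_a * norm_b, eps)


def slot_alignment_loss(teacher_slots: Sequence[Slot],
                        student_slots: Sequence[Slot]) -> float:
    """Hypothetical distillation loss: 1 - mean cosine similarity.

    Assumption: slot k of the teacher corresponds to slot k of the
    student. The summary does not state how slots are paired.
    """
    if not teacher_slots or len(teacher_slots) != len(student_slots):
        raise ValueError("need the same non-zero number of slots")
    sims = [cosine_similarity(t, s)
            for t, s in zip(teacher_slots, student_slots)]
    return 1.0 - sum(sims) / len(sims)


# Toy slots: two slots of dimension 2.
teacher = [[1.0, 0.0], [0.0, 2.0]]
student_aligned = [[2.0, 0.0], [0.0, 1.0]]     # same directions, other scale
student_misaligned = [[0.0, 1.0], [1.0, 0.0]]  # orthogonal to the teacher

loss_aligned = slot_alignment_loss(teacher, student_aligned)
loss_misaligned = slot_alignment_loss(teacher, student_misaligned)
```

Because cosine similarity ignores vector magnitude, the aligned student incurs no penalty even though its slots are scaled differently from the teacher's; only the direction of each slot is matched.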

Takeaways, Limitations

Takeaways:
We present an efficient knowledge distillation framework in which a lightweight student model matches or exceeds the best-performing existing models.
We present a simple and effective method that uses only cosine similarity, with no additional loss functions or auxiliary supervision.
The work contributes to model compression and improved performance in unsupervised video segmentation.
Limitations:
The generalization performance of the proposed method needs further validation; additional experiments on more datasets and video types may be necessary.
Further research is needed to determine whether cosine similarity-based slot alignment is the optimal method in all situations.
Further analysis is needed to determine whether SlotMatch depends on a specific teacher model (SlotContrast) or can be applied to other teacher models.