Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Unsupervised Video Continual Learning via Non-Parametric Deep Embedded Clustering

Created by
  • Haebom

Author

Nattapong Kurpukdee, Adrian G. Bors

Outline

This paper considers a realistic scenario for video learning: unsupervised continual learning, where tasks arrive sequentially without labels or task boundaries. Despite its complexity and rich spatiotemporal information, video data has been understudied in unsupervised continual learning, and prior work has focused on supervised settings that rely on labels and task boundaries. The paper therefore studies unsupervised Video Continual Learning (uVCL) and presents a general benchmark experimental protocol for uVCL that accounts for the high computational and memory requirements of video processing. Deep embedded video features, extracted by an unsupervised Video Transformer network, are modeled non-parametrically using Kernel Density Estimation (KDE). A novelty detection criterion applied to incoming task data dynamically expands the memory clusters, capturing new knowledge, while transfer learning from previous tasks provides the initial state for the current task. The proposed methodology significantly improves performance when training across successive tasks. In-depth evaluations, without labels or class boundaries, are performed on three standard video action recognition datasets: UCF101, HMDB51, and Something-Something V2.
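The core mechanism (KDE over embedded features plus a novelty criterion that spawns new memory clusters) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `bandwidth` and `novelty_threshold` values, the Gaussian kernel choice, and the `NonParametricClusterMemory` class are all assumptions for the sketch, and the feature vectors would in practice come from the unsupervised Video Transformer.

```python
import numpy as np

def gaussian_kde_density(x, samples, bandwidth):
    """Average isotropic-Gaussian kernel density of point x under stored samples."""
    d = samples.shape[1]
    sq_dists = np.sum((samples - x) ** 2, axis=1)          # squared distances to each sample
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)     # Gaussian normalization constant
    return np.mean(np.exp(-sq_dists / (2.0 * bandwidth ** 2)) / norm)

class NonParametricClusterMemory:
    """Memory of clusters, each a growing set of embedded features modeled by KDE.
    Hyperparameter values here are illustrative, not from the paper."""

    def __init__(self, bandwidth=0.5, novelty_threshold=1e-3):
        self.bandwidth = bandwidth
        self.threshold = novelty_threshold
        self.clusters = []  # list of (n_i, d) arrays of feature vectors

    def densities(self, x):
        return [gaussian_kde_density(x, c, self.bandwidth) for c in self.clusters]

    def update(self, x):
        """Assign x to its densest cluster, or spawn a new cluster if x is novel."""
        dens = self.densities(x)
        if not dens or max(dens) < self.threshold:
            self.clusters.append(x[None, :])               # novelty detected: expand memory
        else:
            i = int(np.argmax(dens))                       # add to best-matching cluster
            self.clusters[i] = np.vstack([self.clusters[i], x])

# Toy usage with 2-D "features": nearby points share a cluster, a far point spawns a new one.
memory = NonParametricClusterMemory()
for feat in [np.array([0.0, 0.0]), np.array([0.1, 0.0]), np.array([10.0, 10.0])]:
    memory.update(feat)
```

In the toy run, the first two points fall under one KDE mode while the third lies far outside it, so the memory grows to two clusters without any labels or task boundaries.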

Takeaways, Limitations

Takeaways:
An unsupervised framework enabling continual learning of video data without labels or task boundaries.
A new benchmark protocol for continual learning on video data.
An efficient video continual learning methodology combining KDE-based clustering and transfer learning.
Performance improvements experimentally validated on multiple video action recognition datasets.
Limitations:
Further research is needed on the generalization of the proposed methodology.
Robustness across diverse video data types and complexities remains to be assessed.
The dynamic memory-cluster expansion strategy needs further optimization and refinement.
Computational cost and memory usage could be further reduced.