This paper proposes a realistic scenario for unsupervised video learning, in which a succession of tasks is learned continually without labels or task boundaries. Despite the complexity and rich spatiotemporal information of videos, video data remains understudied in unsupervised continual learning; previous research has focused almost exclusively on supervised settings that rely on labels and task boundaries. This paper therefore studies unsupervised video continual learning (uVCL) and presents a general benchmark experimental protocol for uVCL, taking into account the high computational and memory requirements of video processing. We use Kernel Density Estimation (KDE) over deep embedded video features, extracted by an unsupervised video transformer network, as a non-parametric probabilistic representation of the data. We introduce a novelty detection criterion for incoming task data that dynamically expands the memory clusters, thereby capturing new knowledge. We transfer knowledge from previous tasks by using it as the initial state for the current learning task, and find that the proposed methodology substantially improves model performance when learning multiple tasks in succession. We perform in-depth evaluations, without using any labels or class boundaries, on three standard video action recognition datasets: UCF101, HMDB51, and Something-Something V2.
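To make the KDE-based novelty criterion concrete, the sketch below shows one plausible way it could operate: each memory cluster keeps a KDE over its stored features, and a batch whose log-density under every existing cluster falls below a threshold is treated as novel and spawns a new cluster. This is a minimal illustration under stated assumptions, not the paper's implementation; the Gaussian kernel, the bandwidth, the log-density threshold, the `MemoryCluster` helper, and the synthetic features standing in for video transformer embeddings are all assumptions.

```python
# Minimal sketch of KDE-based novelty detection over deep embedded features.
# Assumptions (not from the paper): Gaussian kernel with a fixed bandwidth,
# a fixed log-density novelty threshold, and random vectors standing in for
# unsupervised video transformer embeddings.
import numpy as np
from sklearn.neighbors import KernelDensity

class MemoryCluster:
    """One memory cluster: stored features plus a KDE fitted over them."""
    def __init__(self, features, bandwidth=0.5):
        self.features = features
        self.kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(features)

    def log_density(self, x):
        # Average log-likelihood of the samples in x under this cluster's KDE.
        return self.kde.score_samples(x).mean()

def assign_or_expand(clusters, batch, threshold=-50.0, bandwidth=0.5):
    """Assign the batch to the best-matching cluster if it is well explained;
    otherwise flag it as novel and expand memory with a new cluster."""
    if clusters:
        scores = [c.log_density(batch) for c in clusters]
        best = int(np.argmax(scores))
        if scores[best] >= threshold:
            return best  # known concept: reuse the existing cluster
    clusters.append(MemoryCluster(batch, bandwidth))  # novelty: new cluster
    return len(clusters) - 1

# Usage: stream unlabeled feature batches with no task boundaries given.
rng = np.random.default_rng(0)
clusters = []
for shift in (0.0, 0.2, 5.0):  # the distant third batch should be novel
    batch = rng.normal(loc=shift, scale=0.3, size=(64, 16))
    idx = assign_or_expand(clusters, batch)
    print(f"batch at shift {shift}: cluster {idx}, total clusters {len(clusters)}")
```

In this toy run the first two batches overlap and share a cluster, while the third lies far from the stored density and triggers expansion; in practice the threshold and bandwidth would need to be set relative to the feature dimensionality and scale.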