This paper addresses continuous sign language segmentation, a critical challenge for sign language translation and data annotation. We propose a transformer-based architecture that models temporal dynamics and formulates frame-level segmentation as a sequence labeling problem using the Beginning-Inside-Outside (BIO) tagging scheme. We leverage HaMeR hand features and complement them with 3D angle features. Experimental results show that the proposed model achieves state-of-the-art performance on the DGS Corpus, and that the proposed features outperform existing baselines on the BSL Corpus.
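To make the BIO formulation concrete, the following is a minimal sketch (not the paper's implementation) of how annotated sign spans can be converted into per-frame BIO labels for sequence labeling; the function name, tag encoding, and inclusive-span convention are assumptions for illustration.

```python
# Minimal sketch (assumed names and conventions): converting sign span
# annotations into per-frame BIO labels for sequence labeling.
from typing import List, Tuple

B, I, O = 0, 1, 2  # Begin, Inside, Outside tags

def spans_to_bio(num_frames: int, spans: List[Tuple[int, int]]) -> List[int]:
    """Label each video frame: B at the first frame of a sign span,
    I for the remaining frames of that span, O elsewhere."""
    labels = [O] * num_frames
    for start, end in spans:          # spans given as inclusive frame indices
        labels[start] = B
        for t in range(start + 1, end + 1):
            labels[t] = I
    return labels

# Example: a 10-frame clip containing two annotated signs.
print(spans_to_bio(10, [(1, 3), (6, 8)]))
# -> [2, 0, 1, 1, 2, 2, 0, 1, 1, 2]
```

Under this formulation, a model predicts one of the three tags per frame, and contiguous B/I runs are decoded back into sign segments.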