Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Skeleton-based sign language recognition using a dual-stream spatio-temporal dynamic graph convolutional network

Created by
  • Haebom

Authors

Liangjin Liu, Haoyang Zheng, Zhengzhong Zhu, Pei Zhou

Outline

This paper proposes Dual-SignLanguageNet (DSLNet) for Isolated Sign Language Recognition (ISLR), where a central difficulty is distinguishing gestures that are morphologically similar but semantically distinct. DSLNet uses a dual-reference, dual-stream architecture that models hand shape and movement trajectory in separate coordinate systems: shape in wrist-centered coordinates and context-sensitive trajectory in face-centered coordinates. The shape stream is processed by a topology-aware graph convolution and the trajectory stream by a Finsler geometry-based encoder, and the two feature sets are then combined by a geometry-driven optimal transport fusion mechanism. In the reported experiments, DSLNet achieves state-of-the-art accuracy of 93.70%, 89.97%, and 99.79% on the WLASL-100, WLASL-300, and LSA64 datasets, respectively, with fewer parameters than competing models.
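
To make the dual-reference idea concrete, here is a minimal sketch of re-expressing one frame of 2D skeleton keypoints in two frames of reference. It is an illustration only, not the authors' implementation: the function names (`recenter`, `dual_reference`), the choice of index 0 as the wrist, the toy keypoint values, and the use of a single face-center point are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def recenter(points: Sequence[Point], origin: Point, scale: float = 1.0) -> List[Point]:
    """Express points relative to `origin`, divided by a positive `scale`."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    ox, oy = origin
    return [((x - ox) / scale, (y - oy) / scale) for x, y in points]


def dual_reference(
    hand: Sequence[Point], wrist_index: int, face_center: Point
) -> Tuple[List[Point], Point]:
    """Split one frame into (shape, position).

    shape:    hand keypoints relative to the wrist (hypothetical shape-stream input)
    position: wrist location relative to the face center
              (hypothetical trajectory-stream input)
    """
    wrist = hand[wrist_index]
    shape = recenter(hand, wrist)
    position = recenter([wrist], face_center)[0]
    return shape, position


# Toy data (assumed): the same hand shape performed at two different locations.
face = (128.0, 96.0)
hand_a = [(100.0, 200.0), (110.0, 190.0), (120.0, 180.0)]
hand_b = [(150.0, 170.0), (160.0, 160.0), (170.0, 150.0)]

shape_a, pos_a = dual_reference(hand_a, wrist_index=0, face_center=face)
shape_b, pos_b = dual_reference(hand_b, wrist_index=0, face_center=face)

print(shape_a == shape_b)  # the wrist-centered shape ignores where the hand is
print(pos_a, pos_b)        # the face-centered position keeps the location
```

In this toy case the wrist-centered coordinates are identical for both hands, while the face-centered positions differ. That is the kind of separation the summary describes: two signs with the same hand shape but different placement relative to the face remain distinguishable through the trajectory stream.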

Takeaways, Limitations

Takeaways:
Improves ISLR performance by modeling hand shape and movement trajectory separately.
Addresses limitations of existing methods by using topology-aware graph convolution and a Finsler geometry-based encoder.
Integrates the two feature streams effectively through a geometry-driven optimal transport fusion mechanism.
Achieves state-of-the-art performance with fewer parameters than competing models.
Limitations:
Further evaluation of the generalization performance of the proposed model is needed.
Performance has not been evaluated under varied lighting conditions or background environments.
Further research is needed on robustness in real-world environments.