Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Sign Spotting Disambiguation using Large Language Models

Created by
  • Haebom

Authors

JianHe Low, Ozge Mercanoglu Sincan, Richard Bowden

Outline

This paper presents a novel, training-free sign spotting and identification framework that integrates a large language model (LLM) to address the data-scarcity problem in sign language translation. Unlike existing approaches, it extracts global spatiotemporal and hand-shape features and matches them against a large-scale sign dictionary using dynamic time warping and cosine similarity. The LLM then performs context-aware lexical disambiguation via beam search, without any fine-tuning, mitigating the noise and ambiguity arising from the matching process. Experiments on synthetic and real sign language datasets demonstrate improvements in accuracy and sentence fluency over existing methods.
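The dictionary-matching step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the per-frame feature vectors, the gloss names, and the length normalisation are all assumptions, and real features would come from pose and hand-shape encoders.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def dtw_cost(seq_a, seq_b):
    # Dynamic time warping over per-frame feature vectors,
    # using (1 - cosine similarity) as the local distance.
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 1.0 - cosine(seq_a[i - 1], seq_b[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] / (n + m)  # length-normalised alignment cost

def top_candidates(query, dictionary, k=3):
    # Rank dictionary glosses by DTW alignment cost to the query clip.
    scored = sorted(
        (dtw_cost(query, ref), gloss) for gloss, ref in dictionary.items()
    )
    return [gloss for _, gloss in scored[:k]]

# Toy 2-D "features" per frame (hypothetical; real ones would be embeddings).
dictionary = {
    "HELLO": [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]],
    "THANKS": [[0.0, 1.0], [0.1, 0.9]],
}
query = [[1.0, 0.05], [0.85, 0.15]]
print(top_candidates(query, dictionary, k=2))  # "HELLO" ranks first
```

The ranked candidate list produced this way is what the LLM later disambiguates using sentence context.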

Takeaways, Limitations

Takeaways:
• Demonstrates that an LLM can improve sign identification accuracy and sentence fluency without any training.
• Dictionary-based matching increases lexical flexibility.
• Context-aware lexical interpretation effectively mitigates noise and ambiguity.
• Contributes to streamlining the annotation of large-scale sign language datasets.
Limitations:
• Performance may depend on the quality and size of the sign dictionary.
• Robustness to the complexities of real-world signing (e.g., varying signing styles, background noise) needs further verification.
• Results may be tied to the specific LLM used; performance may vary with a different model.
• The beam-search-based lexical disambiguation can be computationally expensive.
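The beam-search cost noted in the limitations comes from expanding every candidate gloss at every position. A minimal sketch of that expansion, with a hypothetical `toy_score` standing in for an LLM continuation likelihood (the gloss slots and scoring are invented for illustration):

```python
def beam_search(candidates_per_slot, score_fn, beam_width=3):
    # Keep the beam_width highest-scoring partial gloss sequences per step.
    beams = [([], 0.0)]
    for candidates in candidates_per_slot:
        expanded = [
            (seq + [g], s + score_fn(seq, g))
            for seq, s in beams
            for g in candidates
        ]
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]

# Hypothetical stand-in for an LLM log-probability: prefers hand-coded
# bigrams. A real system would query the LLM for the likelihood of each
# gloss continuing the sequence so far.
def toy_score(prefix, gloss):
    good_bigrams = {("I", "WANT"), ("WANT", "COFFEE")}
    prev = prefix[-1] if prefix else None
    return 1.0 if (prev, gloss) in good_bigrams else 0.0

slots = [["I"], ["WANT", "WASH"], ["COFFEE", "COPY"]]
print(beam_search(slots, toy_score))  # ['I', 'WANT', 'COFFEE']
```

Each step calls the scorer once per (beam, candidate) pair, so the cost grows with beam width, candidates per slot, and sequence length, which is why a large LLM scorer makes this step expensive.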