Pakistani Word-level Sign Language Recognition Based on Deep Spatiotemporal Network
DOI:
https://doi.org/10.1609/aaaiss.v6i1.36042
Abstract
Sign language is crucial for the Deaf and Hard-of-Hearing community because it facilitates visual, movement-based communication. Nevertheless, most hearing people are not familiar with it, which complicates interactions with the hearing impaired. While there has been significant work on languages such as American and Chinese Sign Language, Pakistani Sign Language (PSL) at the word level has received less attention and has been studied mostly on static images. To address this, we introduce a deep spatiotemporal network for word-level PSL recognition from video. The pipeline first applies top-k frame extraction to improve processing efficiency. Next, a ResNet-101 model extracts deep spatial features from each frame. We then introduce the Adaptive Motion Binary Pattern (AMBP), a new spatiotemporal feature descriptor that effectively captures motion dynamics across frames. These spatial and spatiotemporal features are fused and input into a transformer model that processes the combined representations for better recognition. Experimental evaluations confirm that our framework achieves state-of-the-art results.
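The top-k frame extraction step can be sketched as follows. The abstract does not specify the scoring criterion used to rank frames, so this minimal illustration assumes a simple motion-energy proxy (sum of absolute pixel differences from the previous frame) and keeps the k highest-scoring frames in temporal order; the `top_k_frames` helper and its scoring rule are illustrative assumptions, not the paper's method.

```python
def top_k_frames(frames, k):
    """Select the k frames with the highest motion score.

    Score is an assumed proxy: sum of absolute pixel differences
    from the previous frame. Each frame is a flat list of pixel
    intensities.
    """
    scores = [0.0]  # first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        scores.append(sum(abs(a - b) for a, b in zip(prev, cur)))
    # rank indices by score, keep the top k, restore temporal order
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    return [frames[i] for i in keep]

# toy clip of four 4-pixel "frames"; frame 2 carries the largest change
clip = [[0, 0, 0, 0], [0, 0, 0, 0], [9, 9, 9, 9], [9, 9, 8, 9]]
selected = top_k_frames(clip, 2)
```

With this scoring rule, the two frames around the large intensity change (indices 2 and 3) are retained, reducing the number of frames the ResNet-101 feature extractor must process.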
Published
2025-08-01
How to Cite
Naeem, S., Salam, H., & Uddin, M. A. (2025). Pakistani Word-level Sign Language Recognition Based on Deep Spatiotemporal Network. Proceedings of the AAAI Symposium Series, 6(1), 119–126. https://doi.org/10.1609/aaaiss.v6i1.36042
Section
Context-Awareness in Cyber-Physical Systems