Advancing Sign Language Recognition: A YOLO v.11-Based Deep Learning Framework for Alphabet and Transactional Hand Gesture Detection
DOI: https://doi.org/10.1609/aaaiss.v6i1.36055

Abstract
Sign language recognition is an essential tool that facilitates communication for people with hearing and speech disabilities. Conventional recognition techniques frequently encounter challenges in real-time performance, resilience, and accuracy owing to variations in hand positions, backgrounds, and lighting conditions. This paper presents a YOLOv11-based deep learning system for recognising American Sign Language (ASL), concentrating on both alphabetic and transactional hand gestures to mitigate these constraints. The model is engineered to operate in real time while maintaining high precision and resilience across varied contexts. The methodology follows a systematic pipeline, commencing with dataset collection and pre-processing, which includes image augmentation, normalisation, and scaling to ensure model generalisation. The YOLOv11 architecture utilises an improved backbone, neck, and detection head for effective feature extraction and classification. Training is enhanced by the AdamW optimiser, a carefully tuned learning rate, and a loss function that integrates box loss, classification loss, and distribution focal loss (DFL). Performance is assessed using precision, recall, mean Average Precision (mAP), and inference speed to guarantee the model's accuracy and efficiency. Experimental findings indicate that the proposed model attains 95.4% precision, 94.8% recall, and 98.1% mAP, markedly surpassing conventional methods. The combination of Grad-CAM with occlusion sensitivity significantly improves model interpretability. This research offers a robust and scalable approach for real-time sign language detection, facilitating enhanced accessibility in communication technologies, assistive devices, and interactive systems.
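The composite training objective described in the abstract (box loss + classification loss + distribution focal loss) can be sketched as a weighted sum of the three terms. This is a minimal illustration only: the weight values below follow common YOLO-style defaults and are assumptions, not hyperparameters reported by the authors.

```python
def composite_loss(box_loss: float, cls_loss: float, dfl_loss: float,
                   w_box: float = 7.5, w_cls: float = 0.5,
                   w_dfl: float = 1.5) -> float:
    """Weighted sum of the three loss terms named in the paper.

    The default weights are illustrative assumptions modelled on
    typical YOLO configurations, not values from this work.
    """
    return w_box * box_loss + w_cls * cls_loss + w_dfl * dfl_loss


# Hypothetical per-batch loss components, for illustration only
total = composite_loss(box_loss=0.12, cls_loss=0.30, dfl_loss=0.08)
print(round(total, 3))  # 1.17
```

In practice, the relative weights control the trade-off between localisation quality (box and DFL terms) and class discrimination (classification term) during training.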
Published
2025-08-01
How to Cite
Elgohr, A. T., Elhadidy, M. S., El-geneedy, M., Akram, S., & Mousa, M. A. A. (2025). Advancing Sign Language Recognition: A YOLO v.11-Based Deep Learning Framework for Alphabet and Transactional Hand Gesture Detection. Proceedings of the AAAI Symposium Series, 6(1), 209–217. https://doi.org/10.1609/aaaiss.v6i1.36055
Section
Human-AI Collaboration: Exploring Diversity of Human Cognitive Abilities and Varied AI Models for Hybrid Intelligent Systems