Sign Language Recognition Using the Fusion of Image and Hand Landmarks Through Multi-Headed Convolutional Neural Network

Pathan, Refat Khan; Biswas, Munmun; Yasmin, Suraiya; Khandaker, Mayeen Uddin; Salman, Mohammad; Youssef, Ahmed A. F.

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/13236

Date: 2023-10-09

Abstract:

Sign language recognition is a breakthrough for communication within the deaf-mute community and has been a critical research topic for years. Although some previous studies have successfully recognized sign language, they require costly instruments, including sensors, devices, and high-end processing power. Such drawbacks can be overcome by employing artificial intelligence-based techniques. Since capturing video or images with a camera is much easier in this era of advanced mobile technology, this study demonstrates a cost-effective technique to detect American Sign Language (ASL) using an image dataset. The “Finger Spelling, A” dataset is used, covering 24 letters (all except j and z, which involve motion). The main reason for choosing this dataset is that its images have complex backgrounds with varied environments and scene colors. Two layers of image processing are used: in the first layer, images are processed as a whole for training, and in the second layer, the hand landmarks are extracted. A multi-headed convolutional neural network (CNN) model is proposed to train on these two layers and is tested with 30% of the dataset. To avoid overfitting, data augmentation and dynamic learning rate reduction are used. The proposed model achieves 98.981% test accuracy. This study is expected to help in developing an efficient human–machine communication system for the deaf-mute community.
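
The abstract mentions dynamic learning rate reduction without giving details. As a rough illustration only, the sketch below shows one common form of the idea: the learning rate is lowered when validation loss stops improving. The class name `PlateauLRReducer`, the starting rate, the reduction factor, the patience, and the loss values are all assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PlateauLRReducer:
    """Hypothetical helper: lowers the learning rate when validation loss stalls."""
    lr: float = 1e-3        # assumed starting learning rate
    factor: float = 0.5     # assumed multiplicative reduction
    patience: int = 2       # assumed epochs without improvement before reducing
    min_lr: float = 1e-6    # assumed lower bound on the learning rate
    best_loss: float = float("inf")
    stale_epochs: int = 0

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the rate for the next epoch."""
        if val_loss < self.best_loss:
            # Improvement: remember the new best loss and reset the stall counter.
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                # Stalled for `patience` epochs: shrink the rate, but not below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.stale_epochs = 0
        return self.lr


# Made-up validation losses, for illustration only.
reducer = PlateauLRReducer()
schedule = [reducer.step(loss) for loss in [0.90, 0.70, 0.71, 0.72, 0.60, 0.61, 0.62]]
```

In this sketch the rate is halved each time two consecutive epochs fail to beat the best validation loss seen so far. Deep learning frameworks typically ship an equivalent callback or scheduler, and the authors may have used a different rule.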
