Abstract:
More than 430 million people, over 5% of the world's population, require rehabilitation for disabling hearing loss and the speech impairments that often accompany it. Many of them rely on sign language to communicate with others; hence, our project primarily targets the deaf and mute community. Around 5,000 images of hand gestures were collected and divided into 10 categories for live detection, corresponding to the first ten numbers of American Sign Language (ASL). Our model detects these ten hand gestures and classifies them correctly. We used the You Only Look Once version 5 (YOLOv5) algorithm. It uses the CSPDarknet53 backbone, in which a spatial pyramid pooling (SPP) block is applied to enlarge the receptive field and separate the most significant features without slowing down the network. The neck of the algorithm, a Path Aggregation Network (PAN), aggregates features from different backbone levels. The model is easy to use and understand and achieves an accuracy above 98%, which is why we chose YOLOv5 for object detection. This study therefore proposes a sign language detection system that incorporates deep learning and image processing methods. It also compares the two models to give a better understanding of why we judged YOLOv5 to be the better algorithm, even though both achieved accuracy above 98%. We believe that a hand gesture detection system will encourage individuals to communicate with people who cannot hear or speak, and in doing so we aim to improve the lives of the disabled.
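To illustrate the ease of use claimed above, the sketch below shows how a YOLOv5 model could be loaded through the public ultralytics/yolov5 PyTorch Hub interface and run on a single image; this is a minimal example under assumed names, not the authors' code, and the weights file and image path are hypothetical placeholders.

```python
# Minimal sketch: loading a custom-trained YOLOv5 model via PyTorch Hub
# and running inference on one hand-gesture image.
import torch

# "best.pt" is a hypothetical weights file trained on the 10 ASL number classes
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")

# Run detection on a single (hypothetical) image of an ASL number gesture
results = model("asl_number_sample.jpg")

results.print()               # summary of detected classes and confidences
detections = results.xyxy[0]  # tensor of [x1, y1, x2, y2, confidence, class]
print(detections)
```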