Abstract:
The development of automatic speech recognition systems for Arabic presents unique challenges, particularly because of the complexities of Classical Arabic, which has been underrepresented in research. Correct pronunciation of Arabic letters is crucial because mispronunciation can significantly change a word's meaning. This study introduces novel learning models for Arabic letter classification with an emphasis on accurate pronunciation. The task is divided into two stages: first, training the model to recognize Arabic letters, and second, evaluating the pronunciation quality of those letters. Given the scarcity of relevant audio datasets, I collected audio samples from both experts and novices for training. Pronunciation features were extracted from these audio samples as mel-spectrograms. I implemented a deep convolutional neural network (DCNN), a transfer learning model based on AlexNet, and an audio LSTM model using bidirectional LSTM (BLSTM) networks. For letter recognition, the DCNN, AlexNet, and BLSTM models achieved accuracies of 98.97%, 98.45%, and 91.39%, respectively. For pronunciation quality, the same models achieved accuracies of 97.87%, 99.14%, and 77.78%, respectively. For Quranic Ayat pronunciation correction, the models achieved a Word Error Rate (WER) of 8.34% and a Character Error Rate (CER) of 2.42%. This study underscores the effectiveness of advanced neural networks in enhancing Arabic speech recognition and pronunciation evaluation.
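
For readers unfamiliar with the two error metrics, WER and CER are conventionally defined as the edit distance between a recognized transcript and its reference, divided by the reference length in words or characters. The following is a minimal sketch of that standard definition, not the evaluation code used in this study. The function names are illustrative, and the choice to ignore spaces when computing CER is an assumption, since conventions vary.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length.
    Assumption: spaces are removed before comparison."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# Illustrative usage: the hypothesis drops one letter from the last word.
reference = "بسم الله الرحمن الرحيم"
hypothesis = "بسم الله الرحمن الرحم"
print(f"WER = {wer(reference, hypothesis):.2%}")
print(f"CER = {cer(reference, hypothesis):.2%}")
```

Because a single wrong letter counts as a whole-word error in WER but as only one character error in CER, CER is typically much lower than WER on the same output, which is consistent with the gap between the two figures reported above.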