Audio Speech Recognition in Bengali Language Male Female and Third Gender Based on Supervised Learning

Sheikh, Shoeb; Sejuti, Taranga Saha; Era, Isnat Jahan

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/10254

Date: 23-02-18

Abstract:

Human voice frequency is an integral part of vocal production, in which the vocal cords are the main source of sound. Using our hearing, we can easily classify voices as male, female or third gender, and with a machine learning approach a machine can also find the differences between these voices. Identifying gender from a natural voice is a very hard task, even when the recording is free of noise. We use Mel-frequency cepstral coefficients (MFCCs) to find features in voice signals; they are obtained by calculating discrete Fourier transforms, Mel-spaced filter bank energies and log filter bank energies for the voice signals. Recent work has shown that, to some extent, it is possible to identify gender from natural voice, and this is one of the most important aspects of voice recognition. A speaker's gender is irrelevant to voice-to-text conversion; nevertheless, gender identification cannot be left out of everyday applications. The process can be broken down into three steps. First, in pre-processing, we create features from the audio file. Next, in feature extraction, we use this set of features to train the model. Finally, we test the model with a CSV file containing other features in order to see how well it predicts their "true" values. This work falls within Artificial Intelligence (AI). We used Gradient Boosting, Random Forest, KNN, Decision Tree, Naive Bayes, XGBoost, SVC and Linear Regression. We achieved 96.02% accuracy on a dataset containing 3,200 data points from 250 different speakers: 850 male, 850 female and 1,500 third gender.
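The feature computation named in the abstract (discrete Fourier transform, Mel-spaced filter bank energies, log filter bank energies) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the Hamming window, the frame length of 256 samples, the 8000 Hz sample rate, the 10 filters and the 6 output coefficients are all assumptions chosen for illustration, and the input is a synthetic 200 Hz tone rather than a recording from the dataset. It uses only the Python standard library and a naive DFT, which is adequate for one short frame.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common HTK-style mel formula (an assumption; the source names no variant).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 of a Hamming-windowed frame."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    filters = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            row[k] = (right - k) / (right - centre)
        filters.append(row)
    return filters


def dct2(values, num_coeffs):
    """Unnormalised DCT-II, keeping the first num_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(num_coeffs)]


def mfcc_frame(frame, sample_rate, num_filters=10, num_coeffs=6):
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
    # Small constant avoids log(0) for filters that capture no energy.
    log_energies = [math.log(e + 1e-10) for e in energies]
    return dct2(log_energies, num_coeffs)


# Hypothetical input: one frame of a synthetic 200 Hz tone, not real speech.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice each recording would be split into many overlapping frames, and per-recording statistics of the coefficients (for example their means) would be written as one row of the CSV file that the classifiers are trained and tested on; that aggregation step is likewise an assumption here, since the abstract does not specify it.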
