Audio Speech Recognition in Bengali Language Male Female and Third Gender Based on Supervised Learning

Sheikh, Shoeb; Sejuti, Taranga Saha; Era, Isnat Jahan

dc.contributor.author	Sheikh, Shoeb
dc.contributor.author	Sejuti, Taranga Saha
dc.contributor.author	Era, Isnat Jahan
dc.date.accessioned	2023-05-03T04:40:14Z
dc.date.available	2023-05-03T04:40:14Z
dc.date.issued	23-02-18
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/10254
dc.description.abstract	Human voice frequency is an integral part of vocal production, in which the vocal cords are the main source of sound. Human listeners can easily classify voices as male, female, or third gender, and a machine learning approach can also detect the differences between these voices. Identifying a voice from natural speech without any kind of noise is a very hard task. We use Mel-frequency cepstral coefficients (MFCCs) to extract features from voice signals, which involves computing the discrete Fourier transform, Mel-spaced filter bank energies, and log filter bank energies of each signal. Recent work has shown that, to some extent, it is possible to identify gender from natural voice, and this is one of the most important aspects of voice recognition. Although the speaker's gender is irrelevant to voice-to-text conversion, gender identification remains necessary for everyday applications. The process has three steps. First, features are created from each audio file during preprocessing. Next, this set of features is used to train the model. Finally, the model is tested on a CSV file containing other features to see how well it predicts their true values. This work falls within artificial intelligence (AI). We used gradient boosting, random forest, KNN, decision tree, naive Bayes, XGBoost, SVC, and linear regression. We achieved 96.02% accuracy on a dataset of 3,200 data points from 250 different speakers: 850 male, 850 female, and 1,500 third gender.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	MFCC	en_US
dc.subject	MFCCS	en_US
dc.subject	Feature Extraction	en_US
dc.title	Audio Speech Recognition in Bengali Language Male Female and Third Gender Based on Supervised Learning	en_US
dc.type	Other	en_US
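
The abstract names three computations behind the MFCC features: a discrete Fourier transform, Mel-spaced filter bank energies, and log filter bank energies. The following is a minimal sketch of that chain for a single frame, using only the Python standard library. It is not the authors' code. The sample rate (8000 Hz), frame length (256 samples), Hamming window, filter count (20), and number of coefficients (12) are all assumptions chosen for illustration, and the final cosine transform is the usual last MFCC step rather than something the abstract states. The input is a synthetic tone, not recorded speech.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one frame via a direct DFT (O(n^2), fine for a sketch)."""
    n = len(frame)
    # Hamming window (an assumption) to reduce spectral leakage.
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak of 1.0 at center
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_ceps=12):
    """DFT -> mel filter bank energies -> log energies -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
    # Small offset avoids log(0) for filters that capture no energy.
    log_energies = [math.log(e + 1e-10) for e in energies]
    return [sum(le * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, le in enumerate(log_energies))
            for c in range(num_ceps)]


def make_tone(freq_hz, sample_rate, num_samples):
    """Hypothetical test signal: a pure sine tone standing in for a voiced frame."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(num_samples)]


if __name__ == "__main__":
    coeffs = mfcc(make_tone(120.0, 8000, 256), 8000)
    print([round(c, 3) for c in coeffs[:4]])
```

In a full pipeline, each recording would be split into many overlapping frames, and per-frame coefficients would typically be summarized (for example, averaged) into one feature row per recording before being written to a CSV file for the classifiers listed in the abstract.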