A Noble Deep Learning Approach to Recognize Speaker’s Identity from Bengali Speech

Hossain, Md. Fahad; Ali, Hasmot; Hasan, Md. Mehedi

dc.contributor.author	Hossain, Md. Fahad
dc.contributor.author	Ali, Hasmot
dc.contributor.author	Hasan, Md. Mehedi
dc.date.accessioned	2022-02-22T05:06:28Z
dc.date.available	2022-02-22T05:06:28Z
dc.date.issued	2021-06-01
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/7227
dc.description.abstract	Speech is the most suitable form of communication, and speech-based applications have played a vital role in modern technology over the last few decades, because the human voice carries many distinctive features from which its performance and behavior can be measured. Speech-based applications are not only a trend in modern, efficient technology but also a shift in the information technology paradigm. Much research has been carried out on voice-based applications because speech has more practical applications than any other form of communication. In this work, we try to recognize features of the voice in order to identify speakers from Bengali speech. We consider a speaker’s age, division, height, weight, gender, and occupation as the parameters that identify a speaker; here, however, we present the recognition of Bangladeshi speakers’ age and division from Bengali speech. We used our own dataset of 16,730 samples, each a WAV audio clip of 8-10 seconds. We considered MFCC, Delta, Delta-Delta, LSF, Spectral Bandwidth, and mel spectrogram features to train our models. We first tried some traditional machine learning algorithms but found that, with this large amount of data, deep learning algorithms perform better. We then tried different deep learning algorithms, namely Artificial Neural Network, Convolutional Neural Network, Region Based Convolutional Neural Network, and Long Short-Term Memory, with different types of features, and settled on an Artificial Neural Network with 85% accuracy for division recognition and a Convolutional Neural Network with 78% accuracy for age recognition.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Communication	en_US
dc.subject	Deep learning	en_US
dc.subject	Bengali speech	en_US
dc.title	A Noble Deep Learning Approach to Recognize Speaker’s Identity from Bengali Speech	en_US
dc.type	Article	en_US