DSpace Repository

Speech-Based Classification of Bengali Regional Accents using Machine Learning


dc.contributor.author Jahan, Naila Nushrat
dc.contributor.author Shomrat, Salman Mahmud
dc.date.accessioned 2026-06-25T03:46:58Z
dc.date.available 2026-06-25T03:46:58Z
dc.date.issued 2025-01-13
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17425
dc.description Project Report en_US
dc.description.abstract This research investigates the classification of Bengali regional accents using speech data and machine learning techniques. Accurate recognition of regional accents is important for improving natural language processing systems in linguistically diverse regions such as Bangladesh. Speech data was collected from several regions: 580 audio samples from Chandpur, 535 from General Bengali, 484 from Bogura, 456 from Chittagong, 420 from Sylhet, 413 from Barishal, and 28 from other areas. The dataset was preprocessed to extract key speech features, which were then used as inputs to machine learning models. Four machine learning algorithms were applied and evaluated: Random Forest, Decision Tree, K-Nearest Neighbors, and Logistic Regression. Among these, the Random Forest model achieved the highest accuracy, 98.12%. The Decision Tree model followed with 87.67%, while K-Nearest Neighbors and Logistic Regression attained 75.17% and 65.92%, respectively. These findings indicate that ensemble methods such as Random Forest handle this complex and diverse dataset better than the other models tested. The study also addresses the challenges in accent classification, particularly the variability in speech patterns and the limited data available for less-represented regions. The small "others" category further underlines the need for more comprehensive and balanced datasets to improve model generalizability. This work contributes to the fields of computational linguistics and speech recognition by demonstrating the effectiveness of machine learning in accent classification. The strong performance of the Random Forest model suggests its potential for real-world applications such as automated transcription, accent-based recommendations, and language learning systems. Future work may focus on enlarging the dataset and applying deep learning techniques to further improve accuracy. en_US
dc.description.sponsorship Daffodil International University en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject Bengali Regional en_US
dc.subject Speech Recognition en_US
dc.subject Machine Learning en_US
dc.subject Accent Recognition en_US
dc.subject Computational Linguistics en_US
dc.subject Natural Language Processing (NLP) en_US
dc.title Speech-Based Classification of Bengali Regional Accents using Machine Learning en_US
dc.type Other en_US
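
The abstract's remark about limited data for less-represented regions can be made concrete from the sample counts it reports. The following is a minimal sketch, not part of the original project: the helper names `class_shares` and `majority_baseline` are hypothetical, and the only inputs taken from the record are the seven per-region counts. It computes each region's share of the dataset and the accuracy a trivial classifier would reach by always predicting the largest class, which gives a reference point for reading the reported accuracies.

```python
# Per-region sample counts exactly as reported in the abstract.
samples = {
    "Chandpur": 580,
    "General Bengali": 535,
    "Bogura": 484,
    "Chittagong": 456,
    "Sylhet": 420,
    "Barishal": 413,
    "Others": 28,
}


def class_shares(counts):
    """Return each class's fraction of the total sample count (hypothetical helper)."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def majority_baseline(counts):
    """Accuracy of always predicting the most frequent class (hypothetical helper)."""
    return max(counts.values()) / sum(counts.values())


total = sum(samples.values())
shares = class_shares(samples)
baseline = majority_baseline(samples)

# Print a small table of counts and shares, then the baseline.
for label, share in shares.items():
    print(f"{label:16s} {samples[label]:4d}  {share:6.2%}")
print(f"total samples: {total}")
print(f"majority-class baseline accuracy: {baseline:.2%}")
```

The six named regions are fairly balanced, while the "others" category accounts for under 1% of the 2916 samples, which is consistent with the abstract's call for a more comprehensive and balanced dataset.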

