Regional accent classification in Bangladesh:

Salman; Afridi, Shahed

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/14050

Date: 2024-07-01

Abstract:

Recognizing regional accents in spoken language helps preserve regional dialects and makes technology more accessible to their speakers. This study presents our work on identifying spoken data from seven divisions of Bangladesh: Dhaka, Barisal, Sylhet, Chattogram, Rangpur, Rajshahi, and Mymensingh. Speech signals are fed into Automatic Language Identification systems, which apply mathematical operations to classify the signals by regional accent. The dataset was enlarged using a variety of data augmentation methods, including time stretching, noise addition, and pitch and speed modification, which yielded 916 samples. Several machine learning models were used for classification, including Gradient Boosting, Random Forest, K-Nearest Neighbors, Support Vector Machines, Decision Trees, Naive Bayes, XGBoost, and Logistic Regression. In the evaluation, XGBoost achieved the highest accuracy (93.91%), followed by Gradient Boosting and Random Forest at 93.16%, while Logistic Regression had the lowest (29.11%). With a reported maximum accuracy of 93.46%, this study highlights the potential of machine learning models to preserve and understand Bangladesh's rich linguistic variety, improve communication technologies, and support related social and technical advances.
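
The abstract names noise addition and speed modification among the augmentation methods but does not give their implementation. The following is a minimal sketch of those two operations only, using the Python standard library on a synthetic tone. The function names (add_noise, change_speed, augment), the noise level, and the speed rates are illustrative assumptions, not values taken from the report, which may well have used a dedicated audio library.

```python
import math
import random


def add_noise(signal, noise_level=0.005, seed=0):
    """Add zero-mean Gaussian noise to every sample (noise_level is an assumed value)."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_level) for s in signal]


def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens the clip.

    Plain resampling shifts pitch together with speed, so this is not a
    pitch-preserving time stretch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    n_out = max(1, int(round(len(signal) / rate)))
    last = len(signal) - 1
    out = []
    for i in range(n_out):
        pos = i * rate                    # fractional position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def augment(signal):
    """Return the original clip plus three augmented variants (hypothetical rates)."""
    return {
        "original": list(signal),
        "noisy": add_noise(signal),
        "fast": change_speed(signal, 1.25),
        "slow": change_speed(signal, 0.8),
    }


# Synthetic 440 Hz tone standing in for a speech clip (1000 samples at 8 kHz).
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(1000)]
variants = augment(tone)
```

Each source clip yields several training examples this way, which is how augmentation can raise a small recorded corpus to a larger sample count before feature extraction and classification.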