Abstract:
Research on recognizing regional accents in spoken language helps preserve
regional dialects and makes technology more accessible to their speakers.
In this study, we present our work on identifying regional accents in speech
data from seven divisions of Bangladesh (Dhaka, Barisal, Sylhet, Chattogram,
Rangpur, Rajshahi, and Mymensingh). Speech signals are fed into Automatic Language
Identification systems, which then use mathematical operations to categorize the
signals into different regional accents. The dataset was enlarged using a variety
of data augmentation methods, including time stretching, noise addition, and pitch
and speed modifications, yielding 916 samples. Several machine
learning models, such as Gradient Boosting, Random Forest, K-Nearest Neighbors,
Support Vector Machines, Decision Trees, Naive Bayes, XGBoost, and Logistic
Regression, were used for classification. In the evaluation, XGBoost achieved
the highest accuracy (93.91%), followed by Gradient Boosting and Random Forest at
93.16%, while Logistic Regression had the lowest accuracy (29.11%).
With a maximum accuracy of 93.46%, this study highlights the potential
of employing modern machine learning models to preserve and understand
Bangladesh's rich linguistic diversity, improving communication technologies and
fostering social and technical advances.
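
The augmentation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (add_noise, change_speed, augment), the noise level, and the speed factors are all hypothetical, and a synthetic tone stands in for a real recording. The speed change here is plain resampling, which alters pitch and duration together; stretching that preserves pitch would need a dedicated audio library.

```python
import math
import random


def add_noise(signal, noise_std=0.005, seed=0):
    """Return a copy of the signal with Gaussian noise added (assumed noise level)."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens the signal.

    Simple resampling shifts pitch and duration together. It is only a
    stand-in for the separate pitch and stretch operations in the abstract.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not signal:
        return []
    last = len(signal) - 1
    n_out = max(1, int(round(len(signal) / rate)))
    out = []
    for i in range(n_out):
        pos = min(i * rate, last)      # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def augment(signal):
    """Produce several variants of one recording (hypothetical parameter choices)."""
    return {
        "original": list(signal),
        "noisy": add_noise(signal),
        "fast": change_speed(signal, 1.25),
        "slow": change_speed(signal, 0.8),
    }


# Synthetic one-second 220 Hz tone at 8 kHz, used in place of real speech.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
variants = augment(tone)
```

Each variant would then go through the same feature extraction as the original recording before being passed to the classifiers.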