DSpace Repository

Classification of Bangladeshi Regional Language Using Machine Learning And Deep Learning

dc.contributor.author Mia, Yeasin
dc.contributor.author Tanvir, Sakhaoyat Ullah
dc.date.accessioned 2025-09-14T07:24:40Z
dc.date.available 2025-09-14T07:24:40Z
dc.date.issued 2024-07-24
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/14490
dc.description Project Report en_US
dc.description.abstract The goal of this project is to reliably identify linguistic variants and dialects by classifying regional languages spoken in Bangladesh using machine learning (ML) and deep learning (DL) techniques. The dataset has approximately 3000 entries covering five major regional languages (Chattogram: 655, Dhaka: 608, Rangpur: 621, Sylhet: 553, Noakhali: 562). Data collection involved developing a survey form, obtaining and preparing text samples, and cleaning the data using natural language processing methods. The ML and DL models assessed were Bernoulli Naive Bayes (BNB), Support Vector Machines (SVM), Random Forest, Logistic Regression (LR), Bi-directional Long Short-Term Memory (Bi-LSTM), and Convolutional Neural Networks (CNN). According to the results, the DL models (Bi-LSTM: 95.24%, CNN: 98.48%) score far higher at classifying regional languages than the classic ML methods (Random Forest: 70.00%, SVM: 67.78%, LR: 66.22%, BNB: 64.44%). Overall, this study highlights how well DL methods capture the complex linguistic patterns that regional language classification depends on. It also emphasizes the cultural importance of Bangladesh's language diversity and promotes ethical research methods that help preserve languages and promote social inclusion. Future work includes increasing model sophistication through syntactic and semantic analysis and examining the wider sociocultural implications of language classification technology. en_US
dc.description.sponsorship DIU en_US
dc.publisher Daffodil International University en_US
dc.subject Deep Learning en_US
dc.subject Regional Language Classification en_US
dc.subject Dialect Identification en_US
dc.subject Speech Processing en_US
dc.title Classification of Bangladeshi Regional Language Using Machine Learning And Deep Learning en_US
dc.type Other en_US
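
The class balance reported in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the per-region counts given in the record. It shows that the listed counts sum to 2999, one short of the stated 3000, and that the five classes are close to balanced. The majority-class baseline is an illustration added here, not a figure from the report, and it assumes the evaluation data follow the same class distribution as the full dataset.

```python
# Per-region sample counts exactly as listed in the abstract.
counts = {
    "Chattogram": 655,
    "Dhaka": 608,
    "Rangpur": 621,
    "Sylhet": 553,
    "Noakhali": 562,
}

# Sum of the listed counts (the abstract states 3000 entries).
total = sum(counts.values())

# Share of each region as a percentage of the listed total.
shares = {name: 100 * n / total for name, n in counts.items()}

# Score of a trivial classifier that always predicts the largest class
# (assumption: the evaluation split mirrors this distribution).
majority_class = max(counts, key=counts.get)
majority_baseline = shares[majority_class]

for name, share in shares.items():
    print(f"{name:<11}{counts[name]:>5}{share:>7.1f}%")
print(f"total: {total}")
print(f"majority baseline: {majority_baseline:.1f}% ({majority_class})")
```

Under that assumption, always guessing the largest class would score roughly 22%, which gives a floor for reading the reported model scores.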

