| dc.description.abstract |
This study develops and assesses neural methods for automatic transformation
of sentences from the Barishal dialect to Standard Bangla. We create a parallel
Barishal–Standard Bangla resource, perform uniform preprocessing and
tokenization, and implement a set of sequence-to-sequence models: a simple GRU
encoder–decoder, an LSTM encoder–decoder, an attention-enhanced GRU, an
ensemble that averages the logits from the three RNNs, and a lightly fine-tuned
mT5-small Transformer. The recurrent models were all trained with mixed
precision and decoded greedily, while mT5 used its built-in generation
utilities; model performance was assessed with BLEU, TER, and chrF to
capture n-gram precision, edit distance, and character-level fidelity. On the
entire dataset the ensemble performed best (BLEU 0.7023, TER 1.32, chrF
99.05), the LSTM and vanilla GRU worked well and comparably (LSTM: BLEU
0.6713, TER 4.15, chrF 96.41; GRU: BLEU 0.6486, TER 5.10, chrF 95.55), the
attention-augmented GRU underperformed the basic RNNs (BLEU 0.6078,
TER 13.24, chrF 94.34), and light fine-tuning of mT5-small on the limited
in-domain data yielded markedly lower scores (BLEU 0.1075, TER 92.80,
chrF 36.27). Our analysis indicates that model heterogeneity and ensembling
bring strong benefits to low-resource dialect mapping, whereas large pre-trained
Transformers need more in-domain data and careful training schedules to be
competitive. We discuss limitations, including evaluation on the same data
without a held-out split, and sketch future directions such as corpus
enlargement, subword modeling, attention regularization, and improved
Transformer fine-tuning. |
en_US |