| dc.description.abstract |
This study develops and assesses neural methods for automatic transformation
of sentences from the Barishal dialect to Standard Bangla. We create a parallel
Barishal–Standard Bangla resource, perform uniform preprocessing and
tokenization, and implement a set of sequence-to-sequence models: a simple GRU
encoder–decoder, an LSTM encoder–decoder, an attention-enhanced GRU, an
ensemble that averages the logits from the three RNNs, and a lightly fine-tuned
mT5-small Transformer. The recurrent models were all trained with mixed
precision and decoded greedily, while mT5 used its built-in generation
utilities; model performance was assessed with BLEU, TER, and chrF to
capture n-gram precision, edit distance, and character-level fidelity. On the
entire dataset the ensemble performed best (BLEU 0.7023, TER 1.32, chrF
99.05), the LSTM and vanilla GRU worked well and comparably (LSTM: BLEU
0.6713, TER 4.15, chrF 96.41; GRU: BLEU 0.6486, TER 5.10, chrF 95.55), the
attention-augmented GRU underperformed the basic RNNs (BLEU 0.6078,
TER 13.24, chrF 94.34), and light fine-tuning of mT5-small on the limited
in-domain data yielded markedly lower scores (BLEU 0.1075, TER 92.80,
chrF 36.27). Our analysis indicates that model heterogeneity and ensembling
bring strong benefits to low-resource dialect mapping, whereas large pre-trained
Transformers need more in-domain data and careful training schedules to be
competitive. We discuss limitations, including evaluation on the same data
without a held-out split, and sketch future directions such as corpus
enlargement, subword modeling, attention regularization, and improved
Transformer fine-tuning. |
en_US |