| dc.description.abstract |
This paper presents a Bangla Question Answering (QA) system built on transformer-based
models to address the challenges of Bangla language processing. Specifically, it compares
BanglaT5, a model fine-tuned for Bangla, with mT5, a multilingual variant of the T5 model.
Both models were evaluated on a dataset of over 7,500 Bangla news articles, focusing on
factoid question answering.
The results show that BanglaT5 outperforms mT5 on ROUGE, BLEU, Character Error Rate
(CER), and Word Error Rate (WER), indicating that it handles Bangla morphology and syntax
more effectively. BanglaT5 achieved a ROUGE-1 F1 score of 0.6979, an Exact Match Accuracy
of 0.49, and a CER of 0.4054 (lower is better for CER and WER), demonstrating its ability
to generate accurate, contextual answers. In contrast, mT5 performed far worse, with an
Exact Match Accuracy of 0.0008 and a WER of 0.9996.
This comparison highlights the importance of language-specific fine-tuning for languages
like Bangla and the limitations of multilingual models in tasks requiring deep linguistic
understanding. The system developed in this research offers a scalable solution for
Bangla QA, with potential applications in education, public services, and digital
literacy, and contributes to the growing field of Bangla NLP. Future work will focus on
deploying the model for real-time use, expanding the dataset, and exploring multimodal
capabilities to broaden its use in real-world applications. |
en_US |
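
The error-rate and exact-match figures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the abstract does not state which tokenization or normalization was used, and the sketch assumes whitespace tokenization for WER and Unicode code points for CER (Bangla combining marks are counted as separate characters under that assumption).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_tokens = reference.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_tokens, hypothesis.split()) / len(ref_tokens)


def cer(reference, hypothesis):
    """Character Error Rate: code-point-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def exact_match(references, hypotheses):
    """Fraction of predictions identical to their reference after stripping."""
    pairs = list(zip(references, hypotheses))
    if not pairs:
        raise ValueError("need at least one reference/prediction pair")
    return sum(r.strip() == h.strip() for r, h in pairs) / len(pairs)
```

Under these definitions both error rates are 0 for a perfect prediction and can exceed 1 when the prediction is much longer than the reference, which is why a WER close to 1 signals that almost no reference words were reproduced.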