A Transformer Based Question Answering System for Answering Open Domain Questions from Bengali Reference Text


dc.contributor.author Shakil, S M Khasrul Alam
dc.contributor.author Ahmed, Md. Foysal
dc.contributor.author Sholi, Rubaiya Tasnim
dc.date.accessioned 2023-04-01T03:20:20Z
dc.date.available 2023-04-01T03:20:20Z
dc.date.issued 2023-01-29
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/10073
dc.description.abstract As a Natural Language Processing task, question answering (Q/A) systems are becoming increasingly popular, particularly in the research community, and several notable works have been published in recent years. The majority of Q/A systems have been designed with the English language in mind, and linguistic resources for English are widely available. However, despite Bengali being a widely spoken language, particularly in the South Asian region, very few works on Q/A systems exist for it. As a result, the following research was carried out in the domain of Bangla question answering by fine-tuning BERT pre-trained models for the Bangla language. The fine-tuned models were trained on reference texts paired with questions whose answers lie inside the text, and tested under several scenarios. The expected output of the work is to find answers in the corresponding context. To improve the models' efficiency, a new dataset based on the widely used SQuAD dataset has been proposed. Our proposed dataset attempts to overcome the constraints of the SQuAD dataset, since the contexts were collected manually from sources such as Wikipedia and Banglapedia and processed to eliminate grammatical errors while retaining the true sense of each sentence. Preprocessing was carried out with the "csebuetnlp/banglabert" tokenizer from the Hugging Face library, which was built exclusively for Bangla sentences and words. Although our dataset had limitations, our fine-tuned models managed to produce satisfactory results; among the five models we chose to work on, bert-base-cased and distilbert-base-cased produced the most promising results, with F1 scores of 0.6542 and 0.60901 and accuracies of 0.69 and 0.83, respectively. en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject Natural language en_US
dc.subject Linguistics en_US
dc.subject English language en_US
dc.subject Popular language en_US
dc.title A Transformer Based Question Answering System for Answering Open Domain Questions from Bengali Reference Text en_US
dc.type Other en_US
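
For readers of the abstract, the following is a minimal Python sketch of the extractive question answering setup described there, using the Hugging Face transformers library. The checkpoint name "csebuetnlp/banglabert" is taken from the abstract; the question and context strings are illustrative placeholders, and the span-extraction head is only meaningful after fine-tuning on a SQuAD-style dataset such as the one the thesis proposes.

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Tokenizer and encoder named in the abstract. The QA head loaded on
# top of the encoder is randomly initialized here; it must first be
# fine-tuned on a SQuAD-style Bangla dataset before its answers mean
# anything.
model_name = "csebuetnlp/banglabert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "..."  # a Bangla question (placeholder)
context = "..."   # Bangla reference text containing the answer (placeholder)

# Encode the (question, context) pair as a single sequence, as in
# SQuAD-style extractive QA.
inputs = tokenizer(question, context, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# For every token the model scores how likely the answer span starts
# or ends there; take the argmax of each and decode the span between.
start = torch.argmax(outputs.start_logits)
end = torch.argmax(outputs.end_logits) + 1
answer = tokenizer.decode(inputs["input_ids"][0][start:end])
print(answer)

The same pattern applies to the other checkpoints evaluated in the thesis, such as bert-base-cased and distilbert-base-cased, by swapping model_name.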

