DSpace Repository

Predicting Novel Authorship in Bangla Literature Using Large Language Model's

dc.contributor.author Anwar, Tabassum
dc.date.accessioned 2026-04-26T09:27:52Z
dc.date.available 2026-04-26T09:27:52Z
dc.date.issued 2025-12-27
dc.identifier.citation SWT en_US
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17053
dc.description Thesis Report en_US
dc.description.abstract Large Language Models (LLMs) have opened new avenues for literary analysis, authorship attribution, and computational linguistics. So far, however, little research has applied these models to Bangla literature, even though Bangla has a vast literary tradition and most of its works have been digitized. This research aims to close that gap by designing an authorship prediction model for Bangla novels that builds on recent advances in LLMs. A dataset was built by collecting, pre-processing, and segmenting texts by renowned Bangla writers such as Rabindranath Tagore, Kazi Nazrul Islam, and Sarat Chandra Chattopadhyay. The segments were then tokenized to produce suitable input for training transformer models. Several pre-trained LLMs, namely BanglaBERT, mBERT, and XLM-RoBERTa, were fine-tuned for authorship classification based on characteristics identified in the texts. The trained models were evaluated using accuracy, precision, recall, and F1-score. The results indicate that LLMs can effectively identify the distinctive writing styles of different authors and the subtle differences between them. The models performed well in terms of accuracy and compared favorably with traditional machine-learning techniques. This work offers a feasible approach to author identification in Bangla and shows the potential of LLMs for the digital humanities and for authorship analysis of literary works in low-resource languages. It can pave the way for further research on plagiarism detection, literary forensics, and the preservation of Bangla language identity in the domain of AI. en_US
dc.description.sponsorship DIU en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject Bangla Authorship Attribution en_US
dc.subject Large Language Models (LLMs) en_US
dc.subject Text Classification en_US
dc.subject Computational Linguistics en_US
dc.title Predicting Novel Authorship in Bangla Literature Using Large Language Model's en_US
dc.type Working Paper en_US

