DSpace Repository

A Method for Bengali Author Detection Using Supervised Classification Models


dc.contributor.author Hamid, Md. Abdul
dc.contributor.author Rahman, Md. Tanjil
dc.contributor.author Islam, Md. Fahim
dc.date.accessioned 2023-04-05T08:24:41Z
dc.date.available 2023-04-05T08:24:41Z
dc.date.issued 23-01-29
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/10152
dc.description.abstract Text classification is an important area of study in natural language processing (NLP). Intellectual property, including digitally written ideas, blogs, poems, novels, and posts, is widely valued, and bad actors attempt to steal or pirate such work and claim it as their own. To address this problem, we built several models based on state-of-the-art supervised methods for determining the author of a given Bangla text. Because the task is multi-class classification, the approach can be used to determine who wrote articles, news items, or messages. Authorship detection can be used to identify anonymous authors as well as to detect plagiarism. This work focuses on classifying Bengali text among five authors, all well-known figures in Bengali literature and poetry: Humayun Ahmed, Rabindranath Tagore, Muhammad Zafar Iqbal, Kazi Nazrul Islam, and Sarat Chandra Chattopadhyay. A dataset of over 4500 paragraphs was collected for the experimental evaluation, and the Bengali text was preprocessed for training. Seven supervised classification methods were used: logistic regression, naive Bayes, decision trees, SVM, random forest, XGBoost, and KNN. We also trained a deep learning Bi-LSTM model, which outperforms the seven supervised models in terms of accuracy, and a transformer-based BERT base uncased model, which learns context well. The Bi-LSTM and BERT base uncased models give the best classification reports in our experiments: the Bi-LSTM model reaches a loss of 0.3789 with a maximum accuracy of 88%, and the BERT base uncased model achieves an F1-score of 91%. en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject Bengali literature en_US
dc.subject Classification en_US
dc.subject Logistic regression en_US
dc.title A Method for Bengali Author Detection Using Supervised Classification Models en_US
dc.type Other en_US
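
To make the multi-class setup described in the abstract concrete, here is a minimal sketch of one of the seven listed methods, naive Bayes, applied to author attribution. It is not the authors' implementation: the class name NaiveBayesAuthorClassifier, the whitespace tokenizer, the add-one smoothing value, the labels author_a and author_b, and the tiny training sentences are all hypothetical placeholders chosen for illustration, and the record does not describe the actual preprocessing or features used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace tokenization. The record does not
    # describe the real Bengali preprocessing pipeline.
    return text.split()


class NaiveBayesAuthorClassifier:
    """Multinomial naive Bayes over word counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # additive smoothing (assumed value)
        self.word_counts = defaultdict(Counter)  # author -> token counts
        self.doc_counts = Counter()              # author -> number of paragraphs
        self.vocab = set()

    def fit(self, texts, authors):
        for text, author in zip(texts, authors):
            tokens = tokenize(text)
            self.word_counts[author].update(tokens)
            self.doc_counts[author] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_author, best_score = None, -math.inf
        for author in self.doc_counts:
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[author] / total_docs)
            total_words = sum(self.word_counts[author].values())
            for tok in tokens:
                count = self.word_counts[author][tok]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_author, best_score = author, score
        return best_author


# Placeholder paragraphs and labels, invented for this sketch.
train_texts = [
    "নদী জল নদী মাছ",
    "মাছ জল নদী",
    "শহর পথ বাজার",
    "বাজার শহর শহর পথ",
]
train_authors = ["author_a", "author_a", "author_b", "author_b"]

model = NaiveBayesAuthorClassifier().fit(train_texts, train_authors)
prediction_a = model.predict("নদী জল")
prediction_b = model.predict("শহর বাজার")
```

The same fit and predict shape extends from two placeholder labels to the five authors named in the abstract; only the training data and label set would change.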

