Suffix Based Automated Parts of Speech Tagging for Bangla Language

dc.contributor.author Roy, Monjoy Kumar
dc.contributor.author Paul, Pinto Kumar
dc.contributor.author Noori, Sheak Rashed Haider
dc.contributor.author Mahmud, S.M. Hasan
dc.date.accessioned 2021-12-29T03:42:39Z
dc.date.available 2021-12-29T03:42:39Z
dc.date.issued 2019-04-04
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/6591
dc.description.abstract Natural language processing (NLP) is the set of techniques by which computers process human language. Parts-of-speech (POS) tagging is a fundamental requirement for some NLP applications. It is considered a solved problem for some languages, such as English and Chinese, where taggers reach high accuracy (97%), but it remains unsolved for Bangla because of the language's ambiguity. POS taggers for Bangla already exist, but each of the available ones has its own limitations. We chose to develop an unsupervised system rather than a supervised one, because a supervised system needs a large amount of training data and the resources available for Bangla are scarce. We develop a POS tagger based mainly on Bangla grammar, especially suffixes, because Bangla is a highly inflectional language in which a single word has many variants depending on its suffix. The tagger assigns 8 base POS tags, using rules based on Bangla grammar and suffixes together with a verb root dataset. To handle words without suffixes, a dataset of almost 14,500 Bangla words with their default POS tags is added to the system, which helps to increase the tagger's efficiency. A modified version of a previously used suffix-analysis algorithm is applied, resulting in a satisfactory accuracy of about 94.2%. en_US
dc.language.iso en_US en_US
dc.publisher 2nd International Conference on Electrical, Computer and Communication Engineering, ECCE 2019, IEEE en_US
dc.subject Bengali language en_US
dc.subject Natural language processing en_US
dc.subject Automatic language processing en_US
dc.title Suffix Based Automated Parts of Speech Tagging for Bangla Language en_US
dc.type Article en_US
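
The approach outlined in the abstract (suffix rules checked against a verb root dataset, with a default-tag word list for words that carry no suffix) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tag names, the suffix lists, the order in which the rules are tried, and the tiny datasets below are all assumptions chosen for illustration, and the real system uses 8 base tags and a list of almost 14,500 words.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed Bangla vowel signs compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical toy data, standing in for the paper's verb root dataset,
# suffix rules, and default-tag word list.
VERB_ROOTS = {nfc(r) for r in ("কর", "খা", "যা")}          # do, eat, go
VERB_SUFFIXES = [nfc(s) for s in ("ছি", "বে")]              # assumed verb endings
NOUN_SUFFIXES = [nfc(s) for s in ("গুলো", "টি")]            # assumed noun markers
DEFAULT_TAGS = {nfc("আমি"): "PRON", nfc("এবং"): "CONJ"}    # I, and


def tag_word(word):
    """Return an illustrative POS tag for a single Bangla word."""
    word = nfc(word)

    # Rule 1: a verb suffix counts only if the remaining stem is a known verb root.
    for suffix in sorted(VERB_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and word[: -len(suffix)] in VERB_ROOTS:
            return "VERB"

    # Rule 2: a noun suffix attached to a non-empty stem marks a noun.
    for suffix in sorted(NOUN_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return "NOUN"

    # Rule 3: words without a recognized suffix fall back to the default-tag list.
    return DEFAULT_TAGS.get(word, "UNK")


if __name__ == "__main__":
    for w in ("করছি", "খাবে", "বইটি", "আমি", "এবং"):
        print(w, tag_word(w))
```

Requiring the stripped stem to appear in the verb root set is what keeps a verb-like ending from mislabeling an unrelated word; a word that matches no rule and is missing from the default list is reported as `UNK` rather than guessed.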

