DSpace Repository

Suffix Based Automated Parts of Speech Tagging for Bangla Language

dc.contributor.author Roy, Monjoy Kumar
dc.contributor.author Paul, Pinto Kumar
dc.date.accessioned 2019-07-06T04:42:58Z
dc.date.available 2019-07-06T04:42:58Z
dc.date.issued 2018-12-11
dc.identifier.uri http://hdl.handle.net/123456789/2711
dc.description.abstract Natural language processing (NLP) is the set of techniques by which computers process human language. Part-of-speech (POS) tagging is a fundamental requirement for some NLP applications. It is considered a solved problem for some languages, such as English and Chinese, where taggers reach high accuracy (97%), whereas it remains unsolved for Bangla because of the language's ambiguity. Although building a POS tagger for Bangla is not new, each of the available taggers has its own limitations. We chose to develop an unsupervised system rather than a supervised one, because a supervised system needs a large amount of training data and the resources available for Bangla are scarce. We develop a POS tagger based mainly on Bangla grammar, especially suffixes, because Bangla is a highly inflectional language in which a single word has many variants depending on its suffix. The tagger assigns 8 base POS tags, applying rules based on Bangla grammar and suffixes, together with a dataset of verb roots, to identify the tag of each word. To handle words without suffixes, a dataset of almost 14,500 Bangla words with their default POS tags is added to the system, which helps increase the tagger's efficiency. A modified version of a previously used suffix-analysis algorithm is applied, which results in a satisfactory accuracy level of about 94.2%. en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.relation.ispartofseries ;P12220
dc.subject Computer Science en_US
dc.subject Language Automation en_US
dc.subject Language Processing en_US
dc.title Suffix Based Automated Parts of Speech Tagging for Bangla Language en_US
dc.type Other en_US
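
The abstract describes three ingredients: suffix rules, a verb root dataset, and a lexicon of default tags for words without suffixes. The following is a minimal sketch of how such a lookup could fit together, not the authors' algorithm. Everything in it is an assumption made for illustration: the tag names, the tiny word lists, the order in which the checks run, and the function names `tag_word` and `tag_sentence`.

```python
import unicodedata


def _nfc(text):
    """Normalize to NFC so visually identical Bangla strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical toy resources; the real system uses a verb root dataset and
# a lexicon of almost 14,500 words, neither of which is reproduced here.
VERB_ROOTS = {_nfc(r) for r in ("কর", "খা", "যা")}
VERB_SUFFIXES = [_nfc(s) for s in ("ছি", "ছে", "বে", "লাম")]
NOUN_SUFFIXES = [_nfc(s) for s in ("দের", "রা", "টি", "কে")]
DEFAULT_TAGS = {
    _nfc("এবং"): "CONJUNCTION",
    _nfc("আমি"): "PRONOUN",
    _nfc("সুন্দর"): "ADJECTIVE",
    _nfc("বই"): "NOUN",
    _nfc("ভাত"): "NOUN",
}


def tag_word(word):
    """Return an illustrative POS tag for a single Bangla word."""
    word = _nfc(word)

    # 1. Words listed in the lexicon get their default tag (assumed order).
    if word in DEFAULT_TAGS:
        return DEFAULT_TAGS[word]

    # 2. Verb rule: a known verb suffix attached to a known verb root.
    for suffix in sorted(VERB_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and word[: -len(suffix)] in VERB_ROOTS:
            return "VERB"

    # 3. Noun rule: a known noun suffix attached to a non-empty stem.
    for suffix in sorted(NOUN_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return "NOUN"

    # 4. No rule applied.
    return "UNKNOWN"


def tag_sentence(sentence):
    """Tag each whitespace-separated token of a sentence."""
    return [(token, tag_word(token)) for token in sentence.split()]


if __name__ == "__main__":
    print(tag_sentence("ছেলেরা ভাত খাবে"))
```

Checking the verb suffixes against a root list before trying the noun suffixes is one way to reduce false matches on short endings. A full tagger would need many more rules and a strategy for ambiguous words.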

