Bangla Guruchandali Dosh Sentence Detection Using Machine Learning Techniques

Das, Rozanee Kanta; Tinni, Alaya Refat; Rinvee, Tanjina Zaman

dc.contributor.author	Das, Rozanee Kanta
dc.contributor.author	Tinni, Alaya Refat
dc.contributor.author	Rinvee, Tanjina Zaman
dc.date.accessioned	2022-10-27T03:09:56Z
dc.date.available	2022-10-27T03:09:56Z
dc.date.issued	2022-01-04
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/8780
dc.description.abstract	Technology surrounds our lives and improves day by day. Natural Language Processing (NLP) techniques allow computers to understand human language, and researchers are increasingly interested in applying them to text document classification. Bangla text document classification and sentiment analysis are active research topics. In this work we classify Guruchandali Dosh in Bangla sentences. Bangla has two familiar forms, Sadhu and Colito: the Colito form is used in daily life, while the Sadhu form is used in written Bangla literature such as novels and poems. When the two forms are mixed in one sentence, the result is called Guruchandali Dosh. We detect Guruchandali Dosh sentences using supervised learning techniques. Text documents are comparatively easy to preprocess in NLP work, so we collected Sadhu and Colito text from various Bangla textbooks, novels, poems and newspapers. We then built our dataset by altering the sentences according to Bangla grammatical rules, which gave us 1,712 Bangla text samples. Before applying the machine learning algorithms, we preprocessed the raw text by removing unwanted data, stop words, etc. We then applied six classification techniques: Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), Extreme Gradient Boosting (XGB), Support Vector Machine (SVM) and K-Nearest Neighbors (KNN). All algorithms performed well on our dataset, and the Multinomial Naive Bayes (MNB) algorithm achieved the highest accuracy, 85%. Given Bangla text as input, the MNB model is able to predict whether it contains Guruchandali Dosh.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Automatic language processing	en_US
dc.subject	Computational linguistics	en_US
dc.subject	Machine learning	en_US
dc.title	Bangla Guruchandali Dosh Sentence Detection Using Machine Learning Techniques	en_US
dc.type	Article	en_US
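
The abstract names Multinomial Naive Bayes as the best-performing classifier but gives no implementation details. The following is only a minimal sketch of that kind of classifier, written in plain Python under stated assumptions: the class `TinyMultinomialNB`, the helper `featurize`, the labels `clean` and `dosh`, and the placeholder tokens (`s_*` standing in for Sadhu-form words, `c_*` for Colito-form words) are all hypothetical and do not come from the report. Using token pairs as extra features is also an assumption, made here because single-word counts alone cannot express that two forms co-occur in one sentence.

```python
import math
from collections import Counter
from itertools import combinations


def featurize(tokens):
    """Return unigrams plus unordered token pairs (an assumed feature design)."""
    pairs = ["|".join(p) for p in combinations(sorted(set(tokens)), 2)]
    return list(tokens) + pairs


class TinyMultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.feat_counts = {c: Counter() for c in self.classes}
        for feats, label in zip(docs, labels):
            self.feat_counts[label].update(feats)
        # Vocabulary = every feature seen in any class during training.
        self.vocab = set().union(*self.feat_counts.values())
        return self

    def log_score(self, feats, c):
        # log P(c) + sum over features of log P(feature | c), smoothed.
        score = math.log(self.doc_counts[c] / sum(self.doc_counts.values()))
        denom = sum(self.feat_counts[c].values()) + self.alpha * len(self.vocab)
        for f in feats:
            if f in self.vocab:  # features never seen in training are skipped
                score += math.log((self.feat_counts[c][f] + self.alpha) / denom)
        return score

    def predict(self, feats):
        return max(self.classes, key=lambda c: self.log_score(feats, c))


# Placeholder training data: NOT real Bangla and NOT the report's dataset.
train_sentences = [
    (["s_a", "s_b"], "clean"),  # only Sadhu-form placeholders
    (["c_a", "c_b"], "clean"),  # only Colito-form placeholders
    (["s_a", "c_b"], "dosh"),   # forms mixed in one sentence
    (["c_a", "s_b"], "dosh"),
]
model = TinyMultinomialNB().fit(
    [featurize(tokens) for tokens, _ in train_sentences],
    [label for _, label in train_sentences],
)

print(model.predict(featurize(["s_a", "c_b"])))
print(model.predict(featurize(["c_b", "c_a"])))
```

A real pipeline would replace the placeholder tokens with tokenized, stop-word-filtered Bangla sentences and would measure accuracy on a held-out split rather than on sentences resembling the training data.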