Bangla News Article Categorization Using Machine Learning

Haque, MD Al Shahriar; Shawda, Umme

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/10340

Date: 23-02-12

Abstract:

The Bangla language became widely known around the world many years ago, and the number of online Bangla news portals is growing day by day. With their help, we can get news within a few seconds. Some media outlets broadcast news by live stream, and others publish it through online news portals, so a large volume of news is published every day. We learn about many new things by watching or reading the news. However, this news is not separated into specific categories, which is a problem because not every reader is interested in every category. As a result, readers find it inconvenient to read the news. Yet very few researchers are working on Bangla news, and the data gap is growing rapidly. In this paper, we try to solve this problem with machine learning. We collect data with a web crawler. Our dataset has 408,470 rows, and 120 thousand data samples are collected. We use label mapping for category labeling and a tokenizer to obtain sequences; for data preprocessing, we use a slicer to get the same number of samples in every category. We use flatten, embedding, and dense layers with the ‘adam’ optimizer and the ‘sparse categorical cross entropy’ loss function. For visualization, we use a heatmap and a confusion matrix. For classification, we use several classifiers: SVM, KNN, decision tree, random forest, naive Bayes, and Gradient Boosting Classifier. Using the decision tree and random forest, we obtain a training accuracy of 98.08%.
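
The abstract does not give implementation details, so the following is only a minimal sketch of the preprocessing steps it names: label mapping, slicing to an equal number of samples per category, and tokenizing text into integer sequences. The function names (`build_label_map`, `slice_balanced`, `fit_vocabulary`, `to_sequence`), the whitespace tokenizer, and the toy rows are all assumptions made for illustration; the thesis may well have used a library tokenizer and different slicing logic.

```python
from collections import Counter, defaultdict


def build_label_map(categories):
    """Map each category name to an integer id (sorted for a stable order)."""
    return {name: idx for idx, name in enumerate(sorted(set(categories)))}


def slice_balanced(rows, per_category):
    """Keep at most `per_category` rows per category, preserving input order."""
    kept, counts = [], defaultdict(int)
    for text, category in rows:
        if counts[category] < per_category:
            kept.append((text, category))
            counts[category] += 1
    return kept


def fit_vocabulary(texts):
    """Give each whitespace-separated token an id from 1 (0 is padding)."""
    vocab = {}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def to_sequence(text, vocab, max_len):
    """Convert text to a fixed-length list of token ids, zero-padded."""
    ids = [vocab[t] for t in text.split() if t in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


# Toy rows invented for this example; they are not from the thesis dataset.
rows = [
    ("ফুটবল ম্যাচ আজ", "sports"),
    ("ক্রিকেট দল জয়ী", "sports"),
    ("ক্রিকেট ম্যাচ কাল", "sports"),
    ("নির্বাচন আজ", "politics"),
    ("সংসদ অধিবেশন শুরু", "politics"),
    ("বাজার দর বৃদ্ধি", "economy"),
    ("বাজার আজ বন্ধ", "economy"),
]

# Slice every category down to the size of the smallest one.
smallest = min(Counter(category for _, category in rows).values())
balanced = slice_balanced(rows, smallest)

label_map = build_label_map(category for _, category in balanced)
vocab = fit_vocabulary(text for text, _ in balanced)

sequences = [to_sequence(text, vocab, max_len=4) for text, _ in balanced]
labels = [label_map[category] for _, category in balanced]
```

Integer labels of this kind are what a sparse categorical cross-entropy loss expects, and the zero-padded sequences are the usual input to an embedding layer. The same arrays could also be turned into features for the classical classifiers listed in the abstract.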
