dc.description.abstract |
Question answering (QA) is a branch of natural language processing (NLP) research that aims to give human users a simple and practical information retrieval application. Despite being one of the most widely spoken languages in the world, Bengali still faces challenges in computational linguistics. Question classification is the first step in developing a factoid question answering system, and classifying questions correctly is essential because the performance of the whole system depends on it. This paper demonstrates question classification for Bengali-language questions as a step toward a Bengali factoid question answering system. We collected data from different Bangla books, newspapers, novels and other sources, and built a dataset of 1400 questions in four categories: HUM, NUM, LOC and ENTY. We preprocessed the dataset before applying any model. Because machine learning is frequently used for classification and prediction, we evaluate ten machine learning algorithms: Naive Bayes (Multinomial), Naive Bayes (Gaussian), Logistic Model, K-Nearest Neighbor Model, Random Forest, Decision Tree, Support Vector Machine (kernel = linear), Support Vector Machine (kernel = rbf), Support Vector Machine (kernel = sigmoid) and Support Vector Machine (kernel = poly). Among these models, the Logistic Model gave the best performance, with 88.57% accuracy. |
en_US |