Abstract:
Emotion recognition involves identifying and interpreting an individual's emotions, which can be conveyed through facial expressions, speech, or written text. The volume of written Bengali text has grown sharply in recent years, making emotion classification in Bengali text increasingly important for applications in sports, e-commerce, entertainment, and security. Bengali remains a low-resource language, however, and the lack of suitable language processing tools and benchmark corpora makes this task challenging. In this study, we propose a deep learning approach that classifies Bengali text into one of six basic emotion categories: anger, fear, disgust, sadness, joy, and surprise. To this end, we develop a Bengali emotion corpus comprising 29,846 sentences and 40,718 unique words. We explore several word embedding techniques, namely Word2Vec, FastText, and the Keras Embedding Layer, to identify the most effective features for Bengali text emotion classification. We then evaluate several machine learning and deep learning models on the corpus with different feature extraction and word embedding techniques: multinomial naive Bayes (MNB), logistic regression (LR), support vector machine (SVM), convolutional neural network (CNN), long short-term memory (LSTM), and the proposed Keras Embedding+CNN. The results show that the CNN-based method with the Keras Embedding Layer achieves the highest accuracy, 74.40%, on the test dataset.
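To make the embedding-plus-CNN idea concrete, the following is a minimal sketch of the forward pass of such a classifier: token IDs are mapped to embedding vectors, one-dimensional convolution filters slide over the sequence, each filter is max-pooled over time, and a dense softmax layer scores the six emotion categories. It is written in plain Python rather than Keras, and it is not the architecture evaluated in this study: the function names (`build_params`, `predict_proba`), the layer sizes, the kernel width, the omission of bias terms, and the randomly initialised (untrained) weights are all assumptions chosen for illustration.

```python
import math
import random

EMOTIONS = ["anger", "fear", "disgust", "sadness", "joy", "surprise"]


def build_params(vocab_size, embed_dim=8, n_filters=4, kernel_size=3, seed=0):
    """Create random, untrained weights. All sizes here are illustrative assumptions."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-0.5, 0.5)

    return {
        # One embed_dim-sized vector per vocabulary index.
        "embedding": [[rand() for _ in range(embed_dim)] for _ in range(vocab_size)],
        # Each filter spans kernel_size consecutive token vectors.
        "filters": [
            [[rand() for _ in range(embed_dim)] for _ in range(kernel_size)]
            for _ in range(n_filters)
        ],
        # One weight row per emotion class (bias terms omitted for brevity).
        "dense": [[rand() for _ in range(n_filters)] for _ in range(len(EMOTIONS))],
    }


def predict_proba(token_ids, params):
    """Embedding lookup -> 1D convolution + ReLU -> global max pooling -> softmax."""
    kernel_size = len(params["filters"][0])
    if len(token_ids) < kernel_size:
        raise ValueError("sequence is shorter than the convolution kernel")

    vectors = [params["embedding"][t] for t in token_ids]

    pooled = []
    for filt in params["filters"]:
        activations = []
        for start in range(len(vectors) - kernel_size + 1):
            window = vectors[start:start + kernel_size]
            total = sum(
                w * x
                for w_row, x_row in zip(filt, window)
                for w, x in zip(w_row, x_row)
            )
            activations.append(max(0.0, total))  # ReLU
        pooled.append(max(activations))  # global max pooling over time

    logits = [sum(w * p for w, p in zip(row, pooled)) for row in params["dense"]]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


# Hypothetical usage: the token IDs stand in for an already tokenised sentence.
params = build_params(vocab_size=20)
probs = predict_proba([3, 7, 1, 12, 5], params)
predicted = EMOTIONS[probs.index(max(probs))]
```

Because the weights are random, the predicted label carries no meaning; in practice the embedding, filter, and dense weights would be learned jointly from the labelled corpus, which is what distinguishes a trainable embedding layer from pretrained vectors such as Word2Vec or FastText.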