Abstract:
This paper investigates the efficacy of machine learning, deep learning, and hybrid models for text classification in English and Bangla. The study focuses on sentiment analysis of customer comments from "DARAZ," a popular Bengali e-commerce site; the dataset comprises both Bangla reviews and reviews translated into English. The primary objective is a comparative analysis of how well these models perform on sentiment analysis. The methodology includes implementing seven machine learning models as well as deep learning models, including Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM), one-dimensional convolutional (Conv1D), and a combined Conv1D-LSTM. Preprocessing techniques are applied to a modified text set to improve model accuracy. The main finding is that Support Vector Machine (SVM) models outperform the other models, achieving an accuracy of 82.56% for English text and 86.43% for Bangla text using the Porter stemming algorithm. Among the deep learning models, the Bi-LSTM-based model performs best, achieving an accuracy of 78.10% for English text and 83.72% for Bangla text using Porter stemming. The study advances natural language processing research, particularly for Bangla, by improving text classification models and methodologies. Its results contribute to the field of sentiment analysis and offer insights for future research and practical applications.