Abstract:
Most comments posted in public online spaces are constructive, but a sizeable share are toxic. In this work, a dataset of online comments is collected and cleaned of noise. Because the comments contain many errors, which greatly increases the number of features, the raw comments are first transformed and then vectorized with the term frequency-inverse document frequency (TF-IDF) approach before being fed to the classification models. Six machine learning techniques are applied to the processed dataset: logistic regression, decision tree, random forest, XGBoost, AdaBoost, and k-nearest neighbors (KNN). The decision tree classifier is also used to visualize the data, and the predictions of each model are evaluated with confusion matrices. The six models achieve accuracies of 0.95, 0.99, 0.99, 0.96, 0.95, and 0.92, respectively. The decision tree and random forest classifiers both reach 0.99, the highest accuracy among all classifiers.
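To make the TF-IDF weighting and the accuracy computation concrete, the following is a minimal sketch in plain Python. It is not the pipeline used in this work: the tokenizer, the unsmoothed weighting tf × log(N / df), the three example comments, and the confusion-matrix counts are all assumptions chosen for illustration. Library implementations typically apply smoothing and normalization on top of this.

```python
import math
from collections import Counter


def tokenize(comment):
    # Simplistic cleaning (an assumption): lowercase, replace non-letters with spaces.
    return "".join(c.lower() if c.isalpha() else " " for c in comment).split()


def tfidf(comments):
    """Return one {term: weight} dict per comment, using tf * log(N / df)."""
    docs = [tokenize(c) for c in comments]
    n_docs = len(docs)
    # Document frequency: in how many comments each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty comments
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def accuracy(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Made-up comments, for illustration only.
comments = [
    "You are an idiot",
    "Thanks, you are helpful",
    "Great article, thanks",
]
vectors = tfidf(comments)

# Hypothetical 2x2 confusion matrix (rows: true class, columns: predicted class).
example_confusion = [[90, 10], [5, 95]]
example_accuracy = accuracy(example_confusion)
```

In this toy corpus, a term that appears in only one comment (such as "idiot") receives a higher weight than a term shared across comments (such as "you"), which is the property that makes TF-IDF features useful to the classifiers. Accuracy is the sum of the diagonal of the confusion matrix divided by the total number of samples.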