Abstract:
Humanity has benefited enormously from the exchange of information and the expanding use of social media, but this growth has also raised a number of challenges, such as the persistence of hate speech. To address this growing problem on social media platforms, recent studies have used different feature engineering techniques and machine learning algorithms to automatically detect hate comments across numerous data sets. Several studies have compared feature engineering strategies in combination with machine learning algorithms to determine which is the most effective. This investigation examines the performance of multiple feature engineering approaches with five machine learning algorithms. The data sets contain the class labels hate speech, offensive, and neither (not hate speech), and the social media posts are divided into these classes. To recognize the particular traits of hate speech text messages, appropriate n-gram feature sets are extracted, and n-gram TF-IDF weights provide the foundation for these feature models. The main aim of this research is to analyze this problem and to compare the algorithms and features used in machine learning to automatically detect hate speech and label posts with classes such as hate speech, offensive, and neither. Among the classifiers evaluated, Random Forest achieved better accuracy, precision, and recall than SVM (Support Vector Machine), Naive Bayes, Logistic Regression, AdaBoost, and Gradient Boosting. The system achieved an accuracy of 90.26% using Random Forest. The experimental results show that Random Forest provided the best overall accuracy among the models built, and that it is more accurate than other recent work on this task. Finally, based on the results obtained from the model, the intensity of the comments can be extracted.
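
As an illustration of the n-gram TF-IDF features mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenization, word unigrams and bigrams, relative term frequency, and an unsmoothed IDF of log(N / df); the abstract does not state the exact weighting used. The function names and the toy posts are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return all word n-grams of length n_min..n_max from a lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Map each document to a dict {n-gram: TF-IDF weight}.

    TF is the relative frequency of the n-gram in the document.
    IDF is log(N / df), where df counts the documents containing the n-gram.
    This is one common variant, assumed here for illustration only.
    """
    doc_grams = [word_ngrams(d, n_min, n_max) for d in docs]
    n_docs = len(docs)

    # Document frequency: count each n-gram once per document.
    df = Counter()
    for grams in doc_grams:
        df.update(set(grams))

    vectors = []
    for grams in doc_grams:
        counts = Counter(grams)
        total = len(grams)
        vectors.append({
            g: (c / total) * math.log(n_docs / df[g])
            for g, c in counts.items()
        })
    return vectors


# Hypothetical toy posts, not taken from the paper's data sets.
posts = [
    "have a nice day",
    "have a terrible day",
    "nice weather today",
]
vectors = tfidf_vectors(posts)
```

In a pipeline like the one described, sparse vectors of this kind would be passed to a classifier such as Random Forest. Library implementations may use a different weighting; for example, scikit-learn's TfidfVectorizer applies a smoothed IDF by default, so its values would differ from this sketch.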