dc.description.abstract |
Sentiment Analysis is the automated mining of user-generated opinionated text data such as reviews, comments, and feedback. It classifies such text into positive, negative, or neutral sentiment. Most researchers in this domain have used one of three classifiers: SVM, Naïve Bayes, or maximum entropy. In machine learning, a number of classifier models are available. The proposed approach focuses more on Mathematical Analysis and Natural Language Processing. The combinational difference between two subsets determines whether a movie review is positive or negative. For Natural Language Processing, three algorithms are used in the proposed model: Co-occurrence Matrix, Knowledge Graph, and Naïve Bayes. To measure the combinational ratio of two subsets, Jaccard Distance is used. Jaccard Distance is a common technique in mathematical and big data analysis. For feature selection, the proposed model uses Jaccard Distance and a lexicon-based approach. The Co-occurrence Matrix is used to extract the selected features, and Knowledge Graph and Naïve Bayes are used for classification. To determine the accuracy of the model, k-fold cross validation is performed with k = 50. There is much research on the Naïve Bayes algorithm and the Knowledge Graph algorithm, but none of it has focused on the importance of merging these techniques to perform Sentiment Analysis. The proposed model shows how these two techniques can be merged together as a classifier. The co-occurrence frequency of each pair of words is taken through the Knowledge Graph.
The occurrence frequency of each word is taken through the Co-occurrence Matrix. The co-occurrence and occurrence frequencies are combined in a probabilistic equation, together with the traditional Naïve Bayes algorithm, to measure the ratio of a context. Each context yields two ratios, positive and negative. If the positive ratio is higher than the negative ratio, the context is positive; otherwise it is negative. The proposed approach provides very comprehensive results on standard datasets. Of two standard movie review datasets, the proposed model outperformed all previous results on one with an accuracy of 88.56%, and on the other it achieves an accuracy of 91.82%. |
en_US |