Abstract:
The academic community has recently become interested in classifying medical datasets using machine learning, even though it is a challenging task. Applying several machine learning algorithms to a dataset can help in reaching a diagnostic conclusion. Many past studies have used machine learning to predict disease, but there is still room for improvement. This work examines alternative machine learning-based models for disease prediction using pre-processing techniques, classical classifiers, and ensemble classifiers. Heart disease, also referred to as cardiovascular disease, has been the leading cause of death worldwide over the past few decades. It covers a range of disorders that affect the heart. Predicting heart disease is currently one of the hardest problems in the medical field. Heart disease has numerous risk factors, and accurate, trustworthy, and practical methods are urgently needed for early diagnosis and prompt disease management. As a result of recent technological advances, machine learning approaches have contributed to progress in healthcare across numerous studies. The purpose of this study is to develop an ML model for heart disease prediction using the relevant factors. For this research project, we collected a dataset from Kaggle that consists of 13 different parameters related to heart disease. Machine learning methods such as Random Forest, Logistic Regression, Naive Bayes, and Decision Tree were used in the model's design. Using conventional machine learning techniques, we also attempted to identify correlations between the features in the dataset in order to predict the risk of heart disease effectively. The results indicate that Random Forest provides better prediction accuracy in less time than the other ML approaches.
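The comparison described above can be illustrated with a minimal sketch, assuming scikit-learn is available. This is not the study's actual pipeline: the Kaggle dataset is replaced here by synthetic data with 13 features from `make_classification`, and the hyperparameters, the 80/20 split, and the names `models` and `results` are illustrative assumptions. The sketch fits the four classifiers named in the abstract and records test accuracy and training time for each.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the Kaggle data
# (13 numeric features, binary target). Not the study's dataset.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Assumption: a stratified 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The four classifiers named in the abstract, with illustrative settings.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Map each model name to (test accuracy, training time in seconds).
results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    accuracy = accuracy_score(y_test, model.predict(X_test))
    results[name] = (accuracy, elapsed)
    print(f"{name}: accuracy={accuracy:.3f}, fit time={elapsed:.4f}s")
```

On synthetic data the ranking of the models need not match the one reported in the abstract. To reproduce the study's comparison, `X` and `y` would have to be loaded from the actual dataset instead.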