Abstract:
Heart disease and other cardiovascular disorders have become the leading cause of
mortality worldwide over the last several decades. Because heart disease has many
potential causes, effective and efficient methods are needed for early diagnosis and
prompt treatment. In the healthcare industry, data mining has become an increasingly
popular way to evaluate massive datasets.
Researchers apply a variety of machine learning and data mining techniques to large,
complex medical datasets to help healthcare practitioners predict heart disease. This
research proposes a model that uses several supervised learning methods: Decision Tree,
Random Forest, K-Nearest Neighbor, XGBoost, Support Vector Machine, Gaussian Naive
Bayes, Bernoulli Naive Bayes, and Logistic Regression. It also uses two hyper-parameter
optimization strategies, Grid Search CV and Randomized Search CV, and three feature
selection strategies, including univariate selection and a model-based method. The study
uses the existing UCI heart disease dataset, specifically the Cleveland data. The
dataset has 1,025 samples with 14 attributes, all of which are needed for the algorithms
to work properly. The goal of this research is to predict how likely participants are to
develop heart disease. The findings suggest that univariate selection gives the most
reliable results.
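
As a rough illustration of what univariate feature selection does, the sketch below scores each feature on its own against the class label and keeps the top-scoring ones. It is not the implementation used in this research. The scoring function (a one-way ANOVA F statistic), the feature names, the helper names f_score and select_k_best, and the six toy rows are all assumptions made for illustration; none of the values come from the UCI data.

```python
from statistics import mean


def f_score(values, labels):
    """One-way ANOVA F statistic of a single feature against the class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    n, k = len(values), len(groups)
    # Between-class variability: distance of each class mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-class variability: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(rows, labels, names, k):
    """Score every feature independently and return the k highest-scoring names."""
    scores = {
        name: f_score([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data (made-up values, NOT taken from the UCI dataset).
names = ["age", "cholesterol", "max_heart_rate"]
rows = [
    [63, 233, 150],
    [41, 204, 172],
    [67, 286, 108],
    [45, 236, 178],
    [62, 268, 120],
    [39, 250, 182],
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = heart disease, 0 = no heart disease

selected, scores = select_k_best(rows, labels, names, k=2)
print(selected)
```

Because each feature is scored independently, this kind of selection is cheap to compute, but it cannot detect features that are only informative in combination with others.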