Abstract:
Physical illnesses, especially cardiovascular disease, are increasingly prevalent, which
creates demand for efficient methods to predict the disease and improve early detection.
This study recommends using effective machine learning models to predict the risk of
cardiovascular disease. The three datasets
used in the study were obtained from Kaggle and the UCI Machine Learning Repository.
Six conventional classifiers were evaluated: Gradient Boosting
(GB), K-Neighbors Classifier (KNN), XGB Classifier (XGB), Random Forest (RF),
Logistic Regression (LR), and Decision Tree (DT). To compare performance, five ensemble
methods, namely Bagging, Boosting, Random Subspace, Stacking, and Voting, were also
used. Among the conventional models, KNN achieved the highest accuracy on the Hungarian
dataset (80.26%), LR on the Cleveland dataset (88.52%), and DT on the Cardio dataset
(72.99%). With the five ensemble methods applied to all three datasets, KNN achieved the
highest accuracy on the Hungarian dataset (81.26%), Bagged GB on the Cleveland dataset
(91.8%), and GB on the Cardio dataset (73.11%). By
applying hyperparameter tuning, we assigned the best parameters to each classifier,
which yielded more accurate cardiovascular disease predictions. Compared with earlier
research, the experimental results were better, with the best ensemble model
reaching an accuracy of 91.8%. The study underscores the importance of
ensemble methods and advanced machine learning models for predicting cardiovascular
disease accurately and improving real-world outcomes.
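
To make the Voting ensemble and the accuracy metric concrete, the following is a minimal
sketch of hard (majority) voting over three classifiers. It is an illustration only and
not the study's implementation: the helper names majority_vote and accuracy are
hypothetical, and the label lists are toy values invented for the example, not
predictions on the Hungarian, Cleveland, or Cardio datasets.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard (majority) voting."""
    combined = []
    # zip(*...) groups the votes that all models cast for the same sample.
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy labels (1 = disease, 0 = no disease); NOT data from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
knn_pred = [1, 0, 0, 1, 0, 1, 1, 0]
lr_pred = [1, 0, 1, 1, 1, 0, 1, 0]
dt_pred = [0, 0, 1, 1, 0, 0, 0, 0]

ensemble_pred = majority_vote([knn_pred, lr_pred, dt_pred])

for name, pred in [("KNN", knn_pred), ("LR", lr_pred),
                   ("DT", dt_pred), ("Voting", ensemble_pred)]:
    print(f"{name}: {accuracy(y_true, pred):.2%}")
```

In this toy case the three classifiers err on different samples, so the vote corrects
each individual mistake. On real data the gain depends on how correlated the base
models' errors are, and an improvement is not guaranteed.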