Abstract:
Heart disease is among the leading causes of death globally. Thus, early
identification and treatment are indispensable to prevent the disease. In this
work, we propose a framework based on machine learning algorithms to
tackle this problem by identifying risk variables associated with the
disease. To ensure the success of the proposed model, effective data
pre-processing and data transformation strategies are used to prepare
accurate data for training the model on the five most popular datasets
(Hungarian, Statlog, Switzerland, Long Beach VA, and Cleveland)
from the UCI repository. A univariate feature selection technique is applied
to identify essential features, and during the training phase five
classifiers, namely extreme gradient boosting (XGBoost), support vector
machine (SVM), random forest (RF), gradient boosting (GB), and decision
tree (DT), are deployed.
Subsequently, several performance metrics are measured to assess the
predictions of these algorithms. With univariate feature selection included,
the results indicate that the DT classifier achieves a higher accuracy than
the other classifiers, around 97.75%. Thus, a machine learning approach is
identified that can predict heart disease with high accuracy.
Furthermore, the 10 selected attributes are used to analyze the
explainability of the model's outcomes, indicating which attributes
contribute most to its predictions.
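
The univariate feature selection step mentioned above can be illustrated
with a minimal sketch. It assumes a one-way ANOVA F statistic as the scoring
function, which the abstract does not specify, so this is one plausible
choice rather than the method actually used. The function names, feature
names, and toy values are hypothetical and are not taken from the UCI
datasets.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature against class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of the class means around the overall mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def select_k_best(rows, labels, names, k):
    """Score each feature independently and keep the k highest-scoring names."""
    scores = {
        name: anova_f_score([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    selected = sorted(scores, key=scores.get, reverse=True)[:k]
    return selected, scores


# Hypothetical toy data: three features per patient, label 1 = disease present.
names = ["age", "oldpeak", "noise"]
rows = [
    [63, 1.0, 5],
    [67, 1.2, 3],
    [41, 0.1, 4],
    [45, 0.2, 5],
    [60, 1.1, 3],
    [39, 0.0, 4],
]
labels = [1, 1, 0, 0, 1, 0]

selected, scores = select_k_best(rows, labels, names, k=2)
print(selected)
```

Because each feature is scored on its own, this kind of selection is cheap
and easy to interpret, but it ignores interactions between features. In
the setting described above, k would be 10 and the selected columns would
then be passed to the classifiers.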