Abstract:
Cardiovascular diseases (CVDs) remain a significant public health burden globally,
and coronary artery disease (CAD) is a major cause of heart attacks. In this paper,
we explore the use of machine learning (ML) methods as tools for improving the
early detection of CVD. Fifteen supervised ML and deep learning algorithms were
implemented in Python 3.10. Model training and evaluation were performed with
scikit-learn, descriptive statistics were computed with Pandas, and plots were
generated with Matplotlib and Seaborn. The dataset comprises 920 instances
merged from four sites (Cleveland, Hungary, Switzerland, and VA Long Beach)
and was obtained from the UCI Machine
Learning Repository. The Histogram-based Gradient Boosting Classifier performed
best among the models tested, with an accuracy of 93.5% and a 5-fold
cross-validation accuracy of 92.4%. It also achieved high precision, recall, and
F1 scores for both classes, indicating that it could distinguish people with and
without heart disease. These results highlight the promise of ML for early CVD detection
and timely clinical interventions.