Abstract:
Cardiovascular disease (CVD), which affects the heart and blood vessels, is a leading global cause of death and disease, and its impact is particularly noticeable in Bangladesh. Its serious consequences include heart attacks, strokes, and heart failure. Recognizing the seriousness of the condition, identifying it early, and taking preventive action are essential to reducing its negative impact. In an effort to lower CVD mortality, many researchers have attempted to predict CVD with machine learning models, but very few have used real-world data. This study compares several machine learning methods for reliably predicting the occurrence of CVD from actual patient records drawn from Bangladesh's healthcare system. I collected the data from two well-known hospitals in Dhaka, Bangladesh; the dataset contains 1019 instances and 9 attributes. Six machine learning models were used: K-Nearest Neighbor, XGBoost, Decision Tree, Random Forest, Support Vector Machine, and Logistic Regression. Under stratified 5-fold cross-validation, XGBoost showed the strongest performance, scoring 84.22% in training and 86.05% in testing. The models were assessed with several performance measures, including the ROC curve, precision, recall, and F1-score.
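
For readers unfamiliar with stratified 5-fold cross-validation, the following is a minimal, standard-library-only sketch of how such a split can be built. It is not the code used in this study: the function name `stratified_k_fold`, the random seed, and the 60/40 class split of the 1019 labels are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class ratio."""
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold
        # receives a near-equal share of that class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical labels: 1019 records with an ASSUMED 60/40 class split
# (the real class distribution is not stated in the abstract).
labels = [1] * 611 + [0] * 408
splits = list(stratified_k_fold(labels, k=5))

for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    positives = sum(labels[i] for i in test_idx)
    share = positives / len(test_idx)
    print(f"fold {fold_no}: test size={len(test_idx)}, positive share={share:.3f}")
```

Each record appears in exactly one test fold, and every fold keeps roughly the same share of positive cases as the full dataset, which is what makes the per-fold scores comparable. In practice, a library routine such as scikit-learn's `StratifiedKFold` would normally be used in place of a hand-written split.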