Abstract:
Heart failure (HF) is a leading cause of morbidity and mortality worldwide, and diagnosing medical conditions such as HF is difficult and time-consuming. Machine learning (ML) techniques can help reduce HF mortality by providing early warnings, and they become more promising and accurate when substantial data and informative features are available. In this paper, we combine several ML methods with the most significant features so that the resulting models can serve as early-stage warnings. First, we apply general preprocessing techniques to the Kaggle heart failure dataset and introduce the SMOTETOMEK-BOOST method to handle the class imbalance problem. Then we apply two well-known feature selection techniques, Feature Importance (FI) by Random Forest and Information Gain, to reduce the dimensionality of the data and select the most significant features. Each resulting feature set is used to train Decision Tree (DT), Extra Tree (ET), Gradient Boost (GB), and Support Vector Machine (SVM) classifiers, and we also present a hybrid classifier named CBCEC that combines the best-performing classifier with two ensemble methods. Experimental results show that the proposed CBCEC model achieves the highest accuracy, 93.67%, with FI-based feature selection. Finally, we explain the global behavior of the best-performing feature set using an explainability method, the Partial Dependence Plot (PDP).
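
To make the Information Gain criterion named above concrete, the following is a minimal sketch that ranks discrete features by how much they reduce the entropy of the class labels. The helper functions `entropy` and `information_gain`, the feature names, and the toy records are all assumptions made for illustration; they are not the implementation or the data used in this paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting
    on each distinct value of a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy, made-up records (hypothetical; NOT the Kaggle heart failure data).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
features = {
    # Matches the labels exactly, so it removes all label uncertainty.
    "hypothetical_feature_a": [1, 1, 1, 0, 0, 0, 1, 0],
    # Each value sees an even class mix, so it removes none.
    "hypothetical_feature_b": [1, 0, 1, 0, 1, 0, 0, 1],
}

# Rank features from most to least informative.
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

In this sketch a feature that separates the classes perfectly receives the full label entropy as its gain, while a feature whose values each contain an even class mix receives zero. Continuous clinical measurements would first need to be discretized or handled with a mutual-information estimator, which this sketch does not attempt.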