Abstract:
Diabetes is a highly prevalent health concern worldwide, and early prediction is key to controlling it. In this paper, we introduce DiaGuard, a hybrid machine learning model that predicts diabetes from patients' health indicators. Its structure combines two base models, Random Forest and Support Vector Machine (SVM), with a Logistic Regression final estimator to improve prediction performance. The dataset includes features such as glucose, BMI, age, and blood pressure. As shown in the results table, DiaGuard outperformed each of the classical machine learning models (Logistic Regression, Random Forest, and SVM) used on its own in accuracy, precision, recall, and F1-score. The hybrid DiaGuard model achieved an accuracy of 98.9% on the test set, with a precision of 0.99 and a recall of 0.98, indicating strong capability in diabetes prediction [14]. The hybrid method accounts for both linear and non-linear relationships among the features, which supports strong performance on unseen data. The paper also highlights the efficacy of ensemble models in improving predictive reliability and generalizability. DiaGuard offers a promising approach to early diabetes diagnosis that could help healthcare providers make more informed decisions about timely treatment options. In future work, we plan to apply the model to larger and more diverse datasets and to study its potential in real-time healthcare applications. In addition, incorporating deep learning models may improve predictive ability, and research on explainable AI methods would help make predictions more transparent and comprehensible. Overall, DiaGuard contributes to advancing healthcare through data-driven methods.
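The architecture described above (Random Forest and SVM as base models, Logistic Regression as final estimator) resembles a stacking ensemble. The following is a minimal sketch of that idea using scikit-learn's `StackingClassifier`, not the authors' implementation. The synthetic data from `make_classification`, the 80/20 split, the feature scaling before the SVM, the 5-fold internal cross-validation, and all hyperparameters are assumptions made only for illustration, so the scores it produces say nothing about the results reported in this paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for a patient table with 8 numeric
# features (e.g. glucose, BMI, age, blood pressure) and a binary label.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Base models: a tree ensemble and a kernel SVM (scaled inputs for the SVM).
base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
]

# Final estimator: Logistic Regression trained on out-of-fold outputs of
# the base models. SVC has no predict_proba by default, so the stack
# falls back to its decision_function output for that model.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# The four metrics named in the abstract, computed on the held-out split.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Fitting each base model alone on the same split and comparing the same four metrics would reproduce the kind of single-model versus hybrid comparison the abstract describes.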