Abstract:
Gestational diabetes mellitus (GDM) is a significant health concern affecting maternal
and fetal well-being, necessitating early and accurate predictive models. This study
presents a novel hybrid machine learning model integrating Random Forest, Support
Vector Machine, and Gradient Boosting Machine through a stacking ensemble approach.
The hybrid model achieved accuracy scores of 92.7% and 89.02% on two datasets,
outperforming each of the individual base models. The
integration of diverse data sources, including clinical, biochemical, and demographic
variables, enhanced the model's robustness and generalizability. Precision
(91.5% and 86.05%), F1-score (92.3% and 73.18%), and ROC-AUC (0.94 and 0.91)
indicate strong discrimination on both datasets, although the lower F1-score on the
second dataset implies that recall trails precision there.
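
Because the F1-score is the harmonic mean of precision and recall, the recall implied by a reported precision and F1 pair can be recovered with a little arithmetic. The sketch below is illustrative only: the function name `implied_recall` is hypothetical, the inputs are the precision and F1 values quoted above, and the result is meaningful only if both figures were computed for the same class on the same test set.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, with P and F1 given as fractions."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom


# Precision and F1 pairs as quoted in the abstract (percentages as fractions).
reported = {
    "dataset 1": (0.915, 0.923),
    "dataset 2": (0.8605, 0.7318),
}

for name, (p, f1) in reported.items():
    print(f"{name}: implied recall = {implied_recall(p, f1):.3f}")
```

Under these assumptions the implied recall is roughly 0.931 for the first dataset and 0.637 for the second, which is why the second dataset's F1-score sits well below its precision.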
The study addresses key research gaps, including generalizability issues, data integration,
and scalability. Hyperparameter tuning, model pruning, and quantization optimize the
hybrid model for deployment in resource-constrained settings, supporting scalability
and efficiency. Despite this promise, challenges remain, including the need for
external validation across diverse populations and the mitigation of biases in
training data. Future research should focus on fairness-aware algorithms and
longitudinal studies to ensure equitable healthcare outcomes.
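
The stacking ensemble and hyperparameter tuning described in this abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the study's implementation: the logistic-regression meta-learner, the parameter grid, the fold counts, and every dataset parameter are assumptions, since the abstract does not specify them. Model pruning and quantization are not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for clinical, biochemical, and demographic features (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three base learners named in the abstract; the SVM is scaled first.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("gbm", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Meta-learner trained on out-of-fold predictions of the base learners (assumed choice).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Tiny illustrative grid over the meta-learner's regularization strength.
search = GridSearchCV(stack, {"final_estimator__C": [0.1, 1.0]}, scoring="roc_auc", cv=3)
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
y_prob = search.predict_proba(X_test)[:, 1]
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(search.best_params_, metrics)
```

The scores printed here describe only the synthetic data and say nothing about the results reported in the study.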
The hybrid model shows potential as a reliable tool for early GDM detection,
enabling timely interventions and improving maternal and fetal health outcomes. Its
suitability for integration into clinical workflows and its adaptability across
healthcare settings make it a step forward in precision medicine.