Early prediction and detection of diabetes mellitus using various machine learning approaches

Shopnil, Nafis Rayat

dc.contributor.author	Shopnil, Nafis Rayat
dc.date.accessioned	2025-09-14T07:45:14Z
dc.date.available	2025-09-14T07:45:14Z
dc.date.issued	2024-07-13
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/14522
dc.description	Project Report	en_US
dc.description.abstract	Diabetes mellitus (DM) is a chronic metabolic disease characterized by excess blood glucose levels. Early detection and management of the condition are needed to improve patient outcomes and minimize complications. In this work, we propose a dataset-based machine learning method for the early identification of diabetes. The dataset goes through several preparatory steps, including division into training and testing sets, missing-value handling, and feature scaling. Each algorithm is then trained on the processed data, and cross-validation is used to evaluate its performance. We investigate how effectively Random Forest (RF), Extreme Gradient Boosting (XGB), Logistic Regression (LR), and Gradient Boosting (GB) classify people as diabetic or non-diabetic based on their clinical and demographic characteristics. Random Forest is an ensemble approach that combines many decision trees to reduce overfitting and increase predictive accuracy. Gradient boosting, another ensemble approach, improves model performance by building trees sequentially, each correcting the errors of the previous ones. Logistic regression is a statistical model that is useful for classification tasks because it estimates the probability of a binary outcome. The Support Vector Classifier constructs hyperplanes to separate classes and is well known for its effectiveness in high-dimensional spaces. Random Forest performed best, with an accuracy of 85% and an F1 score of 0.86. The proposed machine learning approach shows encouraging results for early diabetes detection, which may help medical professionals identify people who are at risk.	en_US
dc.description.sponsorship	DIU	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Diabetes Mellitus	en_US
dc.subject	Early Prediction	en_US
dc.subject	Disease Detection	en_US
dc.subject	Machine Learning	en_US
dc.title	Early prediction and detection of diabetes mellitus using various machine learning approaches	en_US
dc.type	Other	en_US
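
The abstract reports an accuracy and an F1 score for the best-performing model. As a minimal sketch of how these two metrics are conventionally computed for a binary diabetic/non-diabetic classifier, the Python snippet below derives both from true and predicted labels. The function name `binary_metrics` and the example labels are illustrative assumptions; they are not taken from the report and do not reproduce its data or results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, f1) for binary labels.

    `positive` is the label treated as the diabetic class (an assumption
    made for this sketch, not a convention stated in the report).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Accuracy: fraction of all predictions that match the true label.
    accuracy = correct / len(pairs)

    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    # Returned as 0.0 when there are no true or predicted positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels (1 = diabetic, 0 = non-diabetic), made up for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```

Reporting F1 alongside accuracy is a common choice when the two classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the positive class.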