Abstract:
Diabetes is a serious chronic disease in which the blood sugar level rises abnormally
because the body produces too little insulin or cannot use it effectively. Left untreated,
diabetes can be fatal. In Bangladesh, diabetes is a growing concern that affects people
of all ages and genders. My research focused on young
people under the age of 36 in Bangladesh. Using machine learning, I built a model
that predicts whether a person is likely to have diabetes. The model was trained on
historical records of diabetic and non-diabetic patients collected in Bangladesh.
In the experiments, these data were processed and analyzed
using various data pre-processing techniques. Classic machine learning algorithms
such as Logistic Regression, Random Forest, K-Nearest Neighbors, and Naive Bayes
were then used to build candidate models, and the performance of each was measured
using prediction accuracy on the training and testing data, the confusion matrix,
sensitivity (recall), specificity, precision, F1 score, and the ROC curve with its AUC. Overall, Random
Forest performed better than the others, so the Random Forest model was chosen for
predicting the disease. To demonstrate its use, I built a web application and an
Android application that collect the input data the model requires from a user and
predict whether the user has diabetes.
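
Most of the evaluation metrics named above can be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of that relationship, not the code used in this work; the function name classification_metrics and the example counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp / fn: diabetic patients predicted as diabetic / non-diabetic.
    tn / fp: non-diabetic patients predicted as non-diabetic / diabetic.
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only; these are not results from this study.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and recall are two names for the same quantity, which is why the sketch computes it once. ROC and AUC are not included because they require the model's predicted probabilities rather than a single confusion matrix.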