Abstract:
This study aims to predict the depression level of students at Daffodil
International University (DIU) from a set of factors using machine learning
models. Data were gathered with the Patient Health Questionnaire-9 (PHQ-9), which
captures the symptoms of depression and their
consequences in daily life. The dataset employed in this study contains 1027 records
and 13 attributes. The target attribute, "Depression," is categorized into five classes:
“None-minimal,” “Mild,” “Moderate,” “Moderately
Severe,” and “Severe.” Data preparation involved missing value handling,
label encoding, and feature scaling.
Exploratory Data Analysis (EDA) was performed to reveal relationships and trends. The
following machine learning algorithms were applied: Gaussian Naive Bayes, Decision
Tree Classifier, Voting Classifier, AdaBoost Classifier, and Support Vector Classifier. In
evaluating each model’s performance, we used accuracy, precision, recall, and
F1 score. Of all the classifiers built, the Voting Classifier, which combines the predictions of
the different classifiers, achieved the highest prediction accuracy. These findings suggest
that machine learning can help identify students who are most at risk
of different levels of depression so that steps can be taken to assist them.
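
The five target classes match the standard PHQ-9 severity bands. As a minimal sketch, assuming the "Depression" label is derived from the PHQ-9 total score with the conventional cutoffs (the abstract does not state how the label was assigned), the mapping could look like the following. The function name `phq9_severity` and the constant `PHQ9_BANDS` are hypothetical and used for illustration only.

```python
from typing import Sequence

# Conventional PHQ-9 severity bands: (upper bound of total score, label).
# Assumption: the study's "Depression" classes follow these cutoffs.
PHQ9_BANDS = [
    (4, "None-minimal"),
    (9, "Mild"),
    (14, "Moderate"),
    (19, "Moderately Severe"),
    (27, "Severe"),
]


def phq9_severity(item_scores: Sequence[int]) -> str:
    """Map nine PHQ-9 item scores (each 0-3) to a severity label."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly 9 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")

    total = sum(item_scores)  # total score ranges from 0 to 27
    for upper_bound, label in PHQ9_BANDS:
        if total <= upper_bound:
            return label
    raise AssertionError("unreachable: total score is bounded by 27")
```

For example, a respondent who answers 2 on every item has a total of 18, which falls in the "Moderately Severe" band under these cutoffs.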