DSpace Repository

Predicting Absenteeism of Employees at Workplace Using Tree-Based Algorithms


dc.contributor.author WAHID, ZAMAN
dc.date.accessioned 2019-07-16T09:34:22Z
dc.date.available 2019-07-16T09:34:22Z
dc.date.issued 2018-12-24
dc.identifier.uri http://hdl.handle.net/123456789/2965
dc.description.abstract Absenteeism in the workplace is a crucial factor in a company's productive and profitable capacity. Knowledge of employee absenteeism is therefore essential to an organization across its multiple dimensions, because properly determining employees' profiles allows it to identify excess occurrences of certain morbidities. Early absenteeism research focused primarily on predicting the characteristics of employees, and the categories of disease, associated with higher absenteeism at the workplace. However, predicting employees' absenteeism time with tree-based machine learning classifiers, and thereby identifying the factors that should be addressed to reduce absenteeism, has yet to be explored. In this thesis, we apply three prominent machine learning algorithms, namely Decision Tree, Gradient Boosted Tree, and Random Forest, to predict the absenteeism time of employees and to identify the factors that lead to higher absenteeism at work. We also compare the algorithms to find the classifier with the highest prediction accuracy. We use an existing dataset from a courier company in Brazil. The dataset contains 21 categories of reason for absence that are attested by the International Classification of Diseases (ICD) and 7 other categories, not covered by the ICD, that have proved effective in detecting absenteeism at work. We classify absenteeism time into four categories: NOT ABSENT, HOURS, DAYS, and WEEKS. We evaluate model performance in predicting absenteeism at work using seven evaluation metrics: True Positive, True Negative, False Positive, False Negative, Sensitivity, Specificity, and Accuracy.
Our comparative analysis found that Gradient Boosted Tree produces the best result, with an accuracy of 84.46%, whereas Decision Tree performs worst, with an accuracy of 80.41%. The Random Forest classifier falls in between, with an accuracy of 82.43%. Using the tree model, we found that reasons for absence classed as diseases attested by the International Classification of Diseases (ICD) and the transportation expense from home to work are the top factors behind higher absenteeism at the workplace. en_US
dc.language.iso en en_US
dc.publisher Daffodil International University en_US
dc.relation.ispartofseries ;P12403
dc.subject Machine Learning en_US
dc.subject Computer Science en_US
dc.subject Absenteeism en_US
dc.subject Classification en_US
dc.title Predicting Absenteeism of Employees at Workplace Using Tree-Based Algorithms en_US
dc.type Thesis en_US
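
The seven evaluation metrics named in the abstract can be illustrated with a short sketch. The abstract does not say how the metrics were computed for the four-class problem, so the one-vs-rest treatment below is an assumption, the function name `one_vs_rest_metrics` is hypothetical, and the labels are toy values invented for illustration, not data or results from the thesis.

```python
# Toy illustration only: the labels below are made up, not taken from the thesis dataset.
CLASSES = ["NOT ABSENT", "HOURS", "DAYS", "WEEKS"]


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Compute the seven metrics, treating `positive` as the positive class
    and every other class as negative (an assumed one-vs-rest scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "TP": tp,
        "TN": tn,
        "FP": fp,
        "FN": fn,
        # Sensitivity: share of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Accuracy of the binary (positive vs. rest) decision.
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical true and predicted absenteeism-time categories.
y_true = ["HOURS", "HOURS", "HOURS", "DAYS", "DAYS", "WEEKS", "NOT ABSENT", "HOURS"]
y_pred = ["HOURS", "HOURS", "DAYS", "DAYS", "HOURS", "WEEKS", "NOT ABSENT", "HOURS"]

for cls in CLASSES:
    print(cls, one_vs_rest_metrics(y_true, y_pred, cls))
```

Note that the one-vs-rest accuracy of a single class differs in general from the overall multi-class accuracy (the fraction of samples whose predicted category matches the true one); which of the two the reported accuracy figures refer to is not stated in the abstract.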

