Machine Learning-based Statistical Analysis for Early Stage Detection of Cervical Cancer

dc.contributor.author Ali, Md Mamun
dc.contributor.author Ahmed, Kawsar
dc.contributor.author Bui, Francis M.
dc.contributor.author Paul, Bikash Kumar
dc.contributor.author Ibrahim, Sobhy M.
dc.contributor.author Quinn, Julian M.W.
dc.contributor.author Moni, Mohammad Ali
dc.date.accessioned 2022-03-12T09:53:05Z
dc.date.available 2022-03-12T09:53:05Z
dc.date.issued 2021-12
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/7488
dc.description.abstract Cervical cancer (CC) is one of the most common types of cancer in women and remains a significant cause of mortality, particularly in less developed countries, although it can be effectively treated if detected at an early stage. This study aimed to find efficient machine-learning-based classification models to detect early stage CC using clinical data. We obtained a CC dataset from the Kaggle data repository that contains four class attributes: biopsy, cytology, Hinselmann, and Schiller. The dataset was split into four datasets, one per class attribute. Three feature transformation methods (log, sine function, and Z-score) were applied to these datasets. Several supervised machine learning algorithms were assessed for their classification performance. A Random Tree (RT) algorithm provided the best classification accuracy for the biopsy (98.33%) and cytology (98.65%) data, whereas Random Forest (RF) and instance-based k-nearest neighbor (IBk) provided the best performance for the Hinselmann (99.16%) and Schiller (98.58%) data, respectively. Among the feature transformation methods, the logarithmic transformation gave the best performance for the biopsy dataset, whereas the sine function was superior for cytology. Both the logarithmic and sine functions performed best for the Hinselmann dataset, while Z-score was best for the Schiller dataset. Various feature selection techniques (FSTs) were applied to the transformed datasets to identify and prioritize important risk factors. The outcomes of this study indicate that, with appropriate system design and tuning, machine learning classification methods are able to detect CC accurately and efficiently in its early stages using clinical data. en_US
dc.language.iso en_US en_US
dc.publisher Computers in Biology and Medicine, Elsevier en_US
dc.subject Cervical cancer en_US
dc.subject Biopsy en_US
dc.subject Cytology en_US
dc.subject Hinselmann en_US
dc.subject Schiller en_US
dc.subject Random tree en_US
dc.title Machine Learning-based Statistical Analysis for Early Stage Detection of Cervical Cancer en_US
dc.type Article en_US
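
The abstract names three feature transformations (log, sine function, and Z-score) without giving their exact form. The following is a minimal sketch of what such transformations might look like, not the authors' implementation. The use of `log1p` (so that zero-valued features stay finite), the column-wise Z-score, and the small feature matrix `X` are all assumptions made for illustration; the values are not taken from the dataset.

```python
import numpy as np


def log_transform(X):
    """Logarithmic transform. Assumption: log(1 + x), which requires x >= 0."""
    return np.log1p(X)


def sine_transform(X):
    """Sine-function transform applied element-wise."""
    return np.sin(X)


def zscore_transform(X):
    """Column-wise Z-score: subtract the mean, divide by the standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Guard against constant columns to avoid division by zero.
    std = np.where(std == 0, 1.0, std)
    return (X - mean) / std


# Hypothetical feature matrix: rows are patients, columns are numeric features.
X = np.array([
    [18.0, 1.0, 0.0],
    [35.0, 3.0, 2.0],
    [52.0, 5.0, 4.0],
])

X_log = log_transform(X)
X_sin = sine_transform(X)
X_z = zscore_transform(X)
```

Each transformed matrix keeps the shape of `X`, so it could be passed to any of the classifiers mentioned in the abstract in place of the raw features.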

