DSpace Repository

Adaptive Feature Selection and Classification of Colon Cancer from Gene Expression Data

dc.contributor.author Islam, Ashraful
dc.contributor.author Rahman, Mohammad Masudur
dc.contributor.author Ahmed, Eshtiak
dc.contributor.author Arafat, Faisal
dc.contributor.author Rabby, Md Fazle
dc.date.accessioned 2021-12-11T10:33:23Z
dc.date.available 2021-12-11T10:33:23Z
dc.date.issued 2020-01
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/6544
dc.description.abstract Cancer research is one of the major areas of medical research. A substantial amount of research has been performed in this area, and several methods have been employed. However, the accuracy of cancer prediction is yet to approach perfection, as conventional classification methods have several limitations. In recent times, microarray-processed gene expression data have been used to predict cancer with significant accuracy. Gene expression data are usually high-dimensional and comprise a relatively small number of samples, which makes them difficult to classify. To achieve higher accuracy, ensemble methods, which combine multiple classification methods, can be deployed. In this study, we used the public colon cancer gene expression data set, which consists of 62 instances with 2,000 attributes. An adaptive pre-processing procedure including Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) was conducted to cope with the high dimensionality of the data. This was followed by building an ensemble learning model with k-Nearest Neighbors (kNN), Random Forest (RF), Kernel Support Vector Machines (KSVM), eXtreme Gradient Boosting (XGBoost), and Bayes Generalized Linear Model (GLM). Compared with other classifiers, our ensemble learning model gives higher accuracy than previously employed classification techniques. The obtained accuracy is 91.67%, with precision, recall, and Matthews correlation coefficient (MCC) values of 0.75, 1.00, and 0.85, respectively. en_US
dc.language.iso en_US en_US
dc.publisher ACM International Conference Proceeding Series en_US
dc.subject Colon Cancer en_US
dc.subject Gene expression data en_US
dc.title Adaptive Feature Selection and Classification of Colon Cancer from Gene Expression Data en_US
dc.title.alternative an Ensemble Learning Approach en_US
dc.type Article en_US
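
The abstract reports accuracy, precision, recall, and the Matthews correlation coefficient (MCC). As a reading aid, the following minimal sketch shows how these four measures are conventionally derived from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the study's data and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # MCC uses all four cells, so it stays informative on imbalanced classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the study's results).
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Because MCC accounts for true negatives as well as the other three cells, it can differ noticeably from accuracy when the classes are imbalanced, which is common in small gene expression data sets.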