iDTi-CSsmoteB

Mahmud, S. M. Hasan; Chen, Wenyu; Jahan, Hosney; Liu, Yongsheng; Sujan, Nasir Islam; Ahmed, Saeed

dc.contributor.author	Mahmud, S. M. Hasan
dc.contributor.author	Chen, Wenyu
dc.contributor.author	Jahan, Hosney
dc.contributor.author	Liu, Yongsheng
dc.contributor.author	Sujan, Nasir Islam
dc.contributor.author	Ahmed, Saeed
dc.date.accessioned	2022-03-06T04:16:47Z
dc.date.available	2022-03-06T04:16:47Z
dc.date.issued	2019-04-11
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/7438
dc.description.abstract	Identifying interactions between drugs and proteins is a crucial challenge in drug discovery, and can lead researchers to develop novel drug compounds or to find new target proteins for existing drugs. Determining drug-target interactions (DTIs) with wet-lab experiments is extremely time-consuming, costly, and tedious. To date, multiple computational techniques have been presented to simplify the drug discovery process, but a huge number of interactions remain undiscovered. Furthermore, class imbalance is a critical challenge in this task: it can significantly degrade classification accuracy and has not yet been effectively addressed. In this paper, we propose a novel high-throughput computational model, called iDTi-CSsmoteB, for the identification of DTIs based on drug chemical structures and protein sequences. More specifically, features are extracted from the protein sequence using position-specific scoring matrix (PSSM)-Bigram, amphiphilic pseudo amino acid composition (AM-PseAAC), and dipeptide PseAAC descriptors, which represent evolutionary and sequence information. The drug chemical structure is represented as a molecular substructure fingerprint (MSF), which describes the presence of functional fragments or groups. Finally, we use the SMOTE over-sampling technique to overcome the class imbalance in the datasets and apply the XGBoost algorithm as a classifier to predict DTIs. To evaluate the performance of iDTi-CSsmoteB, several experiments were conducted on four benchmark datasets, namely enzyme, ion channel, GPCR, and nuclear receptor, using fivefold cross-validation. The experimental analysis shows that our model outperforms similar methods in terms of area under the ROC curve (auROC). In addition, our results indicate the effectiveness of the feature extraction techniques, balancing methods, and classifier for predicting DTIs, which can provide a basis for new drug development.
The iDTi-CSsmoteB webserver is available online at http://idticssmoteb-uestc.me/	en_US
dc.language.iso	en_US	en_US
dc.publisher	IEEE Access	en_US
dc.subject	Drugs	en_US
dc.subject	Chemicals	en_US
dc.subject	Feature extraction	en_US
dc.subject	Protein sequence	en_US
dc.subject	Predictive models	en_US
dc.subject	Standards	en_US
dc.title	iDTi-CSsmoteB	en_US
dc.title.alternative	Identification of Drug-Target Interaction Based on Drug Chemical Structure and Protein Sequence Using XGBoost with Over-sampling Technique SMOTE	en_US
dc.type	Article	en_US
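
The abstract names SMOTE as the over-sampling step used to balance the datasets. This record does not include the authors' implementation, so the following is only a minimal, generic sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the toy two-dimensional data are assumptions made for illustration. In the setting the abstract describes, each row would instead be a concatenated drug and protein descriptor vector, and the balanced data would then be passed to a classifier such as XGBoost, which is not shown here.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate: the new point lies on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy example (assumed data): 4 interacting pairs versus 10 non-interacting pairs.
interacting = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
non_interacting_count = 10
new_points = smote(interacting, n_new=non_interacting_count - len(interacting), k=2)
balanced_minority = interacting + new_points
```

After this call, `balanced_minority` has as many rows as the majority class. Because every synthetic point is an interpolation between two real minority samples, it stays inside the region those samples span. In a cross-validation setting such as the fivefold scheme the abstract mentions, over-sampling is usually applied to the training folds only, so that synthetic points do not leak into the evaluation fold; whether the paper does this is not stated in this record.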