DSpace Repository

Sequence-Based Prediction of Amyloid Proteins Using a Hybrid CNN-GRU Deep Learning Architecture


dc.contributor.author Anu, Muhsana Saima
dc.date.accessioned 2026-04-25T09:38:04Z
dc.date.available 2026-04-25T09:38:04Z
dc.date.issued 2025-12-27
dc.identifier.citation SWT en_US
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17041
dc.description Thesis Report en_US
dc.description.abstract Amyloid fibrils formed by misfolded proteins are central to the pathology of several neurodegenerative disorders, including Alzheimer’s and Parkinson’s disease. Reliable in silico prediction of amyloidogenic proteins and peptides can greatly reduce experimental burden and guide mechanistic studies. Existing computational tools are dominated by hand-crafted sequence descriptors coupled with shallow machine-learning classifiers or ensemble models. While these approaches have achieved high accuracy, they often struggle to capture long-range residue dependencies and contextual patterns that underlie aggregation propensity. This study proposes iAmyloid_PepCG, a sequence-based predictor that integrates multiple engineered features with a hybrid Convolutional Neural Network–Gated Recurrent Unit (CNN–GRU) architecture. Protein/peptide sequences were collected from publicly available benchmark datasets and encoded into a diverse feature space including amino-acid composition, composition–transition–distribution (CTD/CTDC/CTDD), dipeptide composition, pseudo amino-acid composition, physicochemical property (PCP) vectors, and contextual embeddings from transformer models (ESM, ProtBERT, ProtALBERT).
A two-stage evaluation was performed: (i) 10-fold cross-validation on the training set and (ii) assessment on an independent hold-out test set. The proposed hybrid CNN–GRU model (iAmyloid_PepCG) achieved an independent-test accuracy of 95.45%, sensitivity of 100%, F1-score of 0.9333, Matthews correlation coefficient (MCC) of 0.9037, Cohen’s kappa of 0.8991, and area under the ROC curve (AUC) of 0.9714, outperforming classical ML baselines and several state-of-the-art amyloid predictors on the same benchmarks. Cross-validation accuracy reached 78.18% with an AUC of 0.8861, indicating stable generalisation. These findings demonstrate that combining local pattern extraction by CNN with long-range dependency modelling by GRU, applied to a rich multi-view feature representation, yields a powerful framework for amyloid protein prediction. en_US
dc.description.sponsorship DIU en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject Amyloid Protein Prediction en_US
dc.subject Sequence-Based Analysis en_US
dc.subject Hybrid CNN-GRU Model en_US
dc.subject Deep Learning in Bioinformatics en_US
dc.title Sequence-Based Prediction of Amyloid Proteins Using a Hybrid CNN-GRU Deep Learning Architecture en_US
dc.type Software en_US
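
Illustrative note: the abstract lists amino-acid composition and dipeptide composition among the engineered sequence features. The sketch below shows one common way such descriptors can be computed; it is not taken from the thesis. The function names, the choice to skip non-standard residues, and the example peptide are assumptions made for illustration only.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so vector positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in the same fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return a 20-dim vector of residue frequencies (hypothetical helper).

    Assumption: characters outside the 20 standard residues are ignored.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-dim vector of adjacent-pair frequencies (hypothetical helper).

    Assumption: a pair is counted only if both residues are standard.
    """
    seq = sequence.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        return [0.0] * len(DIPEPTIDES)
    total = len(pairs)
    return [pairs.count(dp) / total for dp in DIPEPTIDES]


# Example input chosen for illustration; not drawn from the thesis datasets.
peptide = "KLVFFAE"
aac = amino_acid_composition(peptide)
dpc = dipeptide_composition(peptide)
# Concatenating the two descriptors gives a 420-dim feature vector.
features = aac + dpc
```

In a pipeline like the one the abstract describes, vectors of this kind would be combined with the other descriptors and embeddings before being passed to the classifier; how the thesis actually combines them is not stated in this record.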

