Identifying the research field of a scientific paper from the abstract using deep learning approaches

Sarker, Sudipto

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/13155

Date: 2024-01-25

Abstract:

This study explores the application of deep learning to automatically identify the research field of a scientific paper from its abstract. The goal is a robust model that categorizes the primary subject matter of an abstract accurately and efficiently. The dataset is preprocessed, tokenized, and transformed into sequences suitable as input to several models, including Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Bidirectional Long Short-Term Memory (BLSTM) networks. The trained models incorporate techniques such as word embedding and dropout, and their performance is evaluated using metrics such as accuracy and the AUC-ROC score. The research addresses the challenges of identifying research fields from English-language abstracts by employing language-specific preprocessing and data augmentation. The results indicate that deep learning can accurately categorize diverse research fields from English abstracts and suggest that the approach may also be applicable beyond English. The findings contribute to advancing automated techniques for recognizing research themes, streamlining the comprehension and classification of scientific papers. In total, the following algorithms were employed: ANN, CNN, BLSTM, DT, GB, ABC, RF, SVC, XGB, MNB, PA, RC, and LR. Notably, the Gradient Boosting (GB) model achieved an accuracy of 83.82%, and Support Vector Classification (SVC) achieved an accuracy of 83.50%. These outcomes were achieved through careful hyperparameter tuning, which improved the robustness of the models.
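
The tokenization and sequence-transformation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's pipeline: the function names, the reserved ids for padding and unknown words, the vocabulary size, the sequence length, and the two sample abstracts are all assumptions made for illustration.

```python
import re
from collections import Counter

# Reserved ids (an assumed convention, not taken from the report).
PAD, UNK = 0, 1


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(texts, max_words=10000):
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def to_padded_sequences(texts, vocab, max_len=8):
    """Convert each text to a fixed-length list of ids (truncate, then right-pad)."""
    sequences = []
    for text in texts:
        ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:max_len]
        sequences.append(ids + [PAD] * (max_len - len(ids)))
    return sequences


# Hypothetical sample abstracts, for illustration only.
abstracts = [
    "Deep learning for image classification.",
    "Gradient boosting for text classification of abstracts.",
]
vocab = build_vocab(abstracts)
sequences = to_padded_sequences(abstracts, vocab, max_len=8)
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer in front of a CNN or BLSTM. The classical models listed in the abstract are more commonly fed bag-of-words or TF-IDF features, though the abstract does not say which representation was used for them.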
