| dc.description.abstract |
Early and accurate detection of retinal disease is essential for preventing vision loss and guiding patient treatment. This study focused on integrating deep learning models with large language models (LLMs) to improve retinal disease classification and diagnostic interpretation. The OCTDL dataset was preprocessed through augmentation, noise removal, sharpening, contrast enhancement, and outlier removal to improve image quality and model generalization. Four state-of-the-art convolutional neural networks (CNNs), namely MobileNetV2, ResNet50, VGG16, and DenseNet121, were evaluated. Among them, MobileNetV2 performed best. To improve performance further, we proposed a Fusion MobileNetV2 model, which combines global and local feature extraction through a fusion mechanism. Across 10 folds, the model achieved 100% training accuracy, 97.27% validation accuracy, and 99.57% testing accuracy on the fifth fold. We used Grad-CAM to visualize the model predictions. The Grad-CAM outputs with predicted classes were validated by GPT-4o, while GPT-4 answered user questions from its knowledge base using a retrieval-augmented generation (RAG) pipeline. Finally, the overall framework was deployed as a web application, providing an accessible tool to assist in early retinal disease diagnosis. |
en_US |