AI-Powered Radiology Report Generation Using Multi-Model

Roy, Ridam

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/16775

Date: 2025-09-17

Abstract:

The rapid development of artificial intelligence (AI) in medical imaging has created new opportunities for automated diagnosis and clinical decision support, but building systems that are both accurate and clinically reliable remains a serious challenge. This paper presents an AI-based system for chest X-ray classification and automated radiology report generation that integrates deep learning models with large language models (LLMs) to produce reliable, interpretable, and clinically meaningful results. The methodology has four major steps. First, fine-tuned architectures (InceptionV3, ResNet, and VGG) classify chest X-rays as COVID-19, pneumonia, or no disease, using transfer learning, adaptive training strategies, and extensive data augmentation to improve predictive accuracy and generalizability across a wide range of imaging conditions. Second, Grad-CAM and Grad-CAM++ visualizations highlight the image regions that most influence each prediction, which improves interpretability and helps clinicians trust the model by showing that it attends to pertinent anatomical structures. Third, the classification outputs and the explainable-AI (XAI) heatmaps are fed into a domain-adapted LLM (Gemini-Flash 2.5), which produces structured, radiology-style reports that are clinically coherent, readable, and consistent with standard reporting formats. Lastly, evaluation combines quantitative measures (accuracy, precision, recall, and F1-score) with expert-in-the-loop validation, in which radiologists assess the generated reports for factual correctness, structural quality, clinical completeness, interpretability, and utility. In the experiments, InceptionV3 achieved the best results (AUROC = 0.9988, accuracy = 97.7%), outperforming VGG16, VGG19, ResNet50, and ResNet101.
Comparative analysis shows that the proposed system yields not only highly accurate predictions but also clinically consistent and interpretable reports, which are essential for addressing the hallucinations, cross-modal misalignment, and limited grounding found in previous research. Expert validation supports the practical reliability and safety of the system and its potential for use in radiology workflows, where it could reduce diagnostic workload and make patient care faster, more standardized, and more streamlined.
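
The quantitative measures named in the abstract (accuracy, precision, recall, and F1-score) can be illustrated for the three-class setting with a minimal sketch. This is not the report's evaluation code: the label strings, the function name `per_class_metrics`, and the choice of macro averaging are assumptions made for illustration, since the abstract does not say how the per-class scores were aggregated.

```python
from collections import Counter

# The three classes named in the abstract; the label strings are hypothetical.
CLASSES = ("covid19", "pneumonia", "normal")


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return accuracy plus per-class and macro-averaged precision, recall, F1.

    Macro averaging is an assumption; the abstract does not specify it.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    report = {}
    for c in classes:
        predicted = tp[c] + fp[c]  # images the model assigned to class c
        actual = tp[c] + fn[c]     # images that truly belong to class c
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}

    # Unweighted mean over classes, so each class counts equally.
    report["macro"] = {
        key: sum(report[c][key] for c in classes) / len(classes)
        for key in ("precision", "recall", "f1")
    }
    report["accuracy"] = sum(tp.values()) / len(y_true)
    return report
```

Calling `per_class_metrics(y_true, y_pred)` with two equal-length lists of labels returns a dictionary keyed by class name, plus `"macro"` and `"accuracy"` entries. Macro averaging weights every class equally, which can matter when a dataset holds far fewer images of one class than of the others.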