Abstract:
The rapid development of artificial intelligence (AI) in medical imaging has created
unprecedented opportunities for automated diagnosis and clinical decision support, but
building systems that are both accurate and clinically reliable remains a serious
challenge. In this paper, we present an AI-based framework for chest X-ray classification
and automated radiology report generation that integrates state-of-the-art deep
learning models with large language models (LLMs) to produce reliable, interpretable,
and clinically meaningful results. The methodology comprises four major steps. To
optimize predictive accuracy and generalizability across a wide range of imaging
conditions, we first fine-tune architectures such as InceptionV3, ResNet, and VGG
to classify chest X-rays as COVID-19, pneumonia, or no disease using transfer learning,
adaptive training strategies, and extensive data augmentation. Second, Grad-CAM
and Grad-CAM++ visualizations highlight the image regions that most influence each
prediction, improving interpretability and helping clinicians trust the model by
showing that it attends to pertinent anatomical structures. Third, the classification
outputs and the explainable AI (XAI) heatmaps are fed into a domain-adapted LLM
(Gemini-Flash 2.5), which produces structured, radiology-style reports
that are clinically coherent, readable, and consistent with standard reporting formats.
Lastly, evaluation combines quantitative measures, such as accuracy, precision,
recall, and F1-score, with expert-in-the-loop validation, in which radiologists assess the
generated reports for factual correctness, structural quality,
clinical completeness, interpretability, and utility. Experimental results show that
InceptionV3 achieved the best performance (AUROC = 0.9988, accuracy = 97.7%),
outperforming VGG16, VGG19, ResNet50, and ResNet101. Comparative analysis shows that
the proposed system yields not only highly accurate predictions but also clinically
consistent and interpretable reports, which are essential for addressing the
hallucinations, cross-modal misalignment, and limited grounding reported in previous
research. Expert validation underscores the practical reliability and safety of the
system and supports its potential to be integrated into radiology workflows, reduce
diagnostic workload, and accelerate, standardize, and streamline patient care in
clinical practice.