Abstract:
Lung cancer remains one of the leading causes of cancer-related mortality worldwide. This paper presents an AI-driven framework for early and accurate lung cancer detection on the LIDC-IDRI dataset that integrates explainable AI (XAI) techniques and large language model (LLM)-generated clinical narratives to improve trust and interpretability. The system preprocesses DICOM series and XML annotations to build pseudo-3D inputs from three adjacent CT slices centered on each radiologist-annotated nodule, and stores each nodule's malignancy score as an averaged floating-point value. Three deep learning models (EfficientNetV2-S, DenseNet201, and MobileViT-XXS) are trained using 5-fold stratified cross-validation with binary cross-entropy loss and label smoothing. A Multi-Attention Stacked Ensemble (MASE) fuses the base models' predictions to improve on the individual models. Grad-CAM explanations are generated for each model and aggregated into a more robust visualization, and an LLM converts the model outputs and CAM data into concise, radiologist-style justifications.
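
As an illustration of the preprocessing and loss described above, the following minimal NumPy sketch stacks three adjacent slices into a pseudo-3D input, averages a set of malignancy ratings, and computes binary cross-entropy against smoothed labels. It is a sketch under stated assumptions rather than the paper's implementation: the function names, the boundary clamping for edge slices, the smoothing factor `eps = 0.1`, and the cutoff `threshold = 3.0` used to binarize the averaged score are all hypothetical choices not given in the text.

```python
import numpy as np


def make_pseudo_3d(volume: np.ndarray, center: int) -> np.ndarray:
    """Stack slices center-1, center, center+1 as three channels.

    `volume` has shape (num_slices, height, width). Indices are clamped at
    the volume boundary (an assumption; edge handling is not specified).
    """
    last = volume.shape[0] - 1
    idx = [min(max(i, 0), last) for i in (center - 1, center, center + 1)]
    return volume[idx]  # shape: (3, height, width)


def mean_malignancy(ratings) -> float:
    """Average a nodule's malignancy ratings into one floating-point score."""
    return float(np.mean(ratings))


def smooth_binary_labels(labels: np.ndarray, eps: float = 0.1) -> np.ndarray:
    """One common form of binary label smoothing: 0 -> eps/2, 1 -> 1 - eps/2."""
    return labels * (1.0 - eps) + 0.5 * eps


def binary_cross_entropy(probs: np.ndarray, targets: np.ndarray,
                         tiny: float = 1e-7) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    probs = np.clip(probs, tiny, 1.0 - tiny)
    loss = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return float(loss.mean())


if __name__ == "__main__":
    # Synthetic stand-in for a CT volume: 5 slices of 4x4 pixels.
    volume = np.random.default_rng(0).normal(size=(5, 4, 4))
    x = make_pseudo_3d(volume, center=2)
    print(x.shape)

    # Hypothetical ratings for one nodule, averaged to a float score.
    score = mean_malignancy([3, 4, 5, 4])

    # Hypothetical cutoff for turning the averaged score into a binary label.
    threshold = 3.0
    label = np.array([1.0 if score > threshold else 0.0])

    target = smooth_binary_labels(label, eps=0.1)
    print(score, target, binary_cross_entropy(np.array([0.8]), target))
```

Keeping the averaged score as a float and deferring binarization to training time leaves the cutoff adjustable without re-running preprocessing; whether the paper binarizes this way is not stated in the abstract.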