Abstract:
Accurate and timely identification of Acute Lymphoblastic Leukemia (ALL) from microscopic blood smear images is critical for clinical decision-making. This thesis presents an evaluation of deep feature extraction and dimensionality reduction schemes for the automated classification of ALL from peripheral blood smear images. The experiments were carried out on a publicly available ALL image dataset (CNMC), whose natural class imbalance was addressed with ADASYN oversampling to ensure a robust learning process. High-dimensional feature embeddings were generated and compared using seven pretrained architectures: five convolutional neural networks (ResNet50, DenseNet121, InceptionV3, Xception, and EfficientNetB0) and two Vision Transformers (ViT-B16 and ViT-L16). Two dimensionality reduction techniques, Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE), were each applied at five feature sizes (1024, 900, 700, 500, and 300), yielding seventy extractor-reducer-size configurations (7 × 2 × 5). All configurations were evaluated with a multilayer perceptron (MLP) classifier under stratified 5-fold cross-validation, and performance was measured using AUC, accuracy, F1-score, recall, precision, and the Matthews correlation coefficient (MCC). PCA outperformed RFE for most extractors. Among the CNNs, DenseNet121-PCA(700) performed best, but the discriminative ability of the transformer models was significantly higher. ViT-L16-PCA(1024) achieved the highest AUC, 96.24, which was the best overall performance. To validate the choice of classifier, additional comparisons were made with Support Vector Machine (SVM), Random Forest, XGBoost, and Logistic Regression classifiers; the MLP classifier outperformed all of these alternatives.
Model explainability was assessed using Vision Transformer attention maps, which confirmed that the model focused on clinically important structures such as chromatin distribution and nuclear boundaries. The results imply that the global attention of transformers, combined with variance-based dimensionality reduction, is a very promising pipeline for hematological image classification. The proposed model has strong potential for use in automated diagnostic assistance systems.
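
As an illustration of the experimental design summarized above, the following minimal sketch enumerates the seventy extractor, reducer, and feature-size configurations and shows how MCC can be computed from binary confusion-matrix counts. It is not the thesis code: the names `EXTRACTORS`, `REDUCERS`, `FEATURE_SIZES`, `build_grid`, and `mcc` are hypothetical, the configuration labels simply follow the naming used in this abstract, and returning 0 for a zero denominator is a common convention assumed here.

```python
from itertools import product
from math import sqrt

# Names taken from the abstract; the variable names themselves are hypothetical.
EXTRACTORS = [
    "ResNet50", "DenseNet121", "InceptionV3", "Xception", "EfficientNetB0",
    "ViT-B16", "ViT-L16",
]
REDUCERS = ["PCA", "RFE"]
FEATURE_SIZES = [1024, 900, 700, 500, 300]


def build_grid():
    """Return one label per (extractor, reducer, feature size) combination."""
    return [
        f"{extractor}-{reducer}({size})"
        for extractor, reducer, size in product(EXTRACTORS, REDUCERS, FEATURE_SIZES)
    ]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Assumption: when any marginal count is zero the denominator vanishes,
    and 0.0 is returned by convention.
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


grid = build_grid()
print(len(grid))  # 7 extractors x 2 reducers x 5 sizes = 70
```

Each label in `grid` would correspond to one feature set passed to the classifier under stratified 5-fold cross-validation; the feature extraction, reduction, and training steps themselves are omitted from this sketch.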