Abstract:
Maintaining consistent leather quality is essential for enhancing product value, minimizing waste, and complying with international standards. Traditional inspection depends on subjective expert judgment, which leads to inconsistent results and slow processing. This study presents a deep learning approach for the automated detection and classification of defects in animal leather, covering four categories: cuts, folds, scratches, and normal (defect-free) leather. We built a balanced dataset from tanneries in Dhaka, comprising 1,600 original images and an augmented set of 6,400 images. The images were captured with an iPhone 16 under varied lighting conditions to reflect real-world inspection scenarios. We fine-tuned five architectures, Xception, InceptionResNetV2, LeViT, MaxViT, and MobileViT, and evaluated them using accuracy, precision, recall, F1 score, and the Matthews correlation coefficient (MCC). Data augmentation with rotation, flipping, and color jitter improved accuracy by 1.42% for cow leather, 2.15% for goat leather, 1.68% for sheep leather, and 2.21% for buffalo leather. Among the models, MaxViT performed best after augmentation, while MobileViT achieved competitive accuracy at lower computational cost, making it well suited to resource-limited environments. To improve model transparency, we generated Grad-CAM heatmaps, which localize the image regions that drive each prediction and thereby indicate where defects lie. Finally, we developed a Flask-based web application for real-time defect classification with accompanying visual explanations. The findings indicate that targeted data augmentation improves classification robustness and offers an effective solution for industrial leather quality control...
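The evaluation metrics named in the abstract can all be derived from a confusion matrix. The following is a minimal sketch in plain Python, not the evaluation code used in the study: the class label strings, the function names, the toy predictions, and the choice of macro averaging for precision, recall, and F1 are assumptions made for illustration. The MCC uses the standard multiclass generalization of the binary coefficient.

```python
from math import sqrt

# Hypothetical label strings for the four categories named in the abstract.
CLASSES = ["cut", "fold", "scratch", "normal"]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def classification_metrics(matrix):
    """Accuracy, macro precision/recall/F1, and multiclass MCC."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    true_counts = [sum(matrix[k]) for k in range(n)]                       # t_k
    pred_counts = [sum(matrix[r][k] for r in range(n)) for k in range(n)]  # p_k

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        p = tp / pred_counts[k] if pred_counts[k] else 0.0
        r = tp / true_counts[k] if true_counts[k] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Multiclass MCC: (c*s - sum p_k*t_k) / sqrt((s^2 - sum p_k^2)(s^2 - sum t_k^2))
    numerator = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    denominator = sqrt(
        (total**2 - sum(p * p for p in pred_counts))
        * (total**2 - sum(t * t for t in true_counts))
    )
    mcc = numerator / denominator if denominator else 0.0

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the study.
    y_true = ["cut", "cut", "fold", "fold", "scratch", "scratch", "normal", "normal"]
    y_pred = ["cut", "fold", "fold", "fold", "scratch", "scratch", "normal", "cut"]
    print(classification_metrics(confusion_matrix(y_true, y_pred, CLASSES)))
```

On a balanced dataset such as the one described, accuracy and MCC tend to move together; MCC becomes the more informative of the two when class frequencies diverge, because it accounts for every cell of the confusion matrix.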