Abstract:
This research develops a deep learning approach for detecting human emotions from facial images and makes its predictions interpretable through Explainable Artificial Intelligence (XAI). The method begins with the acquisition of emotion-labeled facial images, followed by preprocessing steps that prepare the images for deep learning. The processed dataset is divided into training, validation, and test sets to support reliable model assessment. Four architectures are compared for emotion classification: ResNet50, a CNN, EfficientNetB2, and DenseNet169. ResNet50 performed best, with an accuracy of 92.33%, which we attribute to its deep feature extraction capabilities. DenseNet169, by contrast, achieved only 51.43% accuracy, possibly because its architecture did not match the dataset characteristics or because of overfitting. XAI techniques add transparency by helping explain how the models reach their decisions. Such transparency matters particularly in emotional state monitoring software, where it enhances user trust and supports accountability. The analysis shows that ResNet50 is an effective architecture for this task and also underscores the need to choose network structures that match the task at hand. Including XAI strengthens the ethical foundation of AI systems by addressing the opacity of deep learning methods. Our top-performing model has been deployed online, and we are presently analyzing the output generated by Visual Emotion. Future work will study real-time emotion perception, multi-modal integration of audio and text data, and practical deployment with an emphasis on privacy protection, fairness, and ethical use.
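
The division into training, validation, and test sets can be illustrated with a minimal sketch. The abstract does not state the split ratios or whether the split was stratified by emotion class, so the 70/15/15 percentages, the per-class stratification, the function name `stratified_split`, and the toy labels below are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, percents=(70, 15, 15), seed=0):
    """Split sample indices into (train, val, test) lists, class by class.

    `percents` are assumed integer percentages for train and validation;
    whatever remains in each class goes to the test set.
    """
    rng = random.Random(seed)

    # Group sample indices by their emotion label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(indices) * percents[0] // 100
        n_val = len(indices) * percents[1] // 100
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Hypothetical toy dataset: 20 samples for each of three emotion labels.
labels = ["happy"] * 20 + ["sad"] * 20 + ["angry"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting within each class keeps the emotion proportions roughly equal across the three sets, which is one common way to make accuracy figures comparable between models; the study itself may have used a different procedure.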