Recognizing Emotion from Speech using Machine learning and Deep learning

Roy, Tonny

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17296

Date: 2025-01-13

Abstract:

Speech recognition plays an essential role in the analysis of psychological disorders, behavioral decision making, and human-machine interaction applications. A speech emotion recognition system detects emotions from live audio. People all over the world use words to express their emotions, regardless of their origin. In this project, we focus on machine learning (ML), which uses a dataset and algorithms to predict or detect future outcomes. The datasets consist of audio files in WAV format covering 8 emotional states: anger, disgust, fear, happiness, pleasant, surprise, sadness, and neutral. Features were extracted from the audio files using the librosa library, applied to multiple machine learning models, and the results were compared. Speech emotion recognition is a popular research topic with numerous applications, and it has also become a challenge in the field of speech recognition. Overall, a CNN model appears to be a good method for human speech emotion recognition, with an accuracy of 85%, because of its capacity to extract complicated patterns and characteristics from input data. The other two models, SVM and MLP, reached accuracies of 82% and 83%, respectively. However, a model's success depends on the quality of the preprocessed data, the model architecture used, and the efficacy of the data augmentation strategies employed.