dc.description.abstract |
Gender identification presents a notable challenge in signal processing, with recent
attention shifting towards vocal feature analysis over traditional image classification
methods. Because gender classification depends on more than fundamental frequency and
pitch, this research examines the importance of feature selection, which is akin to
dimensionality reduction and crucial for identifying gender-specific traits. This study delves
into the efficacy and significance of machine learning algorithms in addressing voice-based
gender identification, examining the relationship between vocal fold thickness,
wavelength, and pitch perception, particularly in distinguishing male and female voices.
Using machine learning techniques, the study demonstrates the feasibility of gender
identification from voice signals, applying the Discrete Fourier Transform, a Mel-spaced
filter bank, and log filter-bank energies to extract MFCC features. Gender
identification from natural voice holds considerable implications, especially in practical
applications where gender differentiation is vital. While conventional voice-to-text
conversion may not require gender detection, real-world scenarios demand it, which ties the
task to Natural Language Processing, a subfield of artificial intelligence. The methodology entails a
systematic workflow, encompassing input audio files, pre-processing, feature extraction,
model training, and testing with separate datasets. The study yields a dataset of
2384 samples from more than 50 speakers, including 607 male, 663 child, 451 third-gender,
and 663 female voices. This research underscores machine learning's potential in
addressing gender identification from voice, emphasizing its significance across various
artificial intelligence applications. Our dataset comprises individuals aged 0 to 42+ from
diverse locations, meticulously categorized into age groups and gender categories.
Using a smartphone and audio recording software, we collected a comprehensive database,
which was divided into training and testing sets. Notably, Support Vector Machine (SVM) and Random Forest
emerge as top performers, achieving accuracies of 96.28% and 97.72%, respectively. |
en_US |