Abstract:
Voice-based age estimation is an emerging field of study with significant
applications in biometric security, healthcare, and personalized services. Our work focuses
on the development and evaluation of a Long Short-Term Memory (LSTM) based solution
trained on the Common Voice dataset, specifically targeting English-speaking demographics
across various regions.
The primary goal of this study is to estimate a speaker's age accurately from voice data.
Our model extracts spectral features and Mel-Frequency Cepstral Coefficients (MFCCs) from
voice samples and takes the speaker's gender and accent into account to improve the age
estimate. We implemented the solution in Python in the Jupyter Notebook environment,
using Keras to build the model and Librosa for audio processing. The results are very
encouraging.