Speech Emotion Recognition Using MFCC and SVM Classification

Authors

  • G.Ramesh Babu
  • W.Mercy Grace
  • T.Siva Sankar Rao
  • V.J.S. Rajkumar
  • S.Ramaraju

Keywords:

Speech Emotion Recognition, MFCC Features, CNN-SVM Hybrid Model, Deep Learning, Support Vector Machine, Audio Signal Processing, Librosa Python

Abstract

Speech Emotion Recognition (SER), the analysis of human emotions from speech signals, plays a major role in human-computer interaction. The proposed system extracts Mel-frequency cepstral coefficients (MFCCs) from the audio, passes them to a convolutional neural network (CNN) for deep feature learning, and classifies the learned features with a support vector machine (SVM) to improve emotion recognition accuracy. Audio signal processing and feature extraction are performed with the Python-based Librosa library. The CNN extracts high-level representations of the speech data, which the SVM then classifies into emotion categories. Evaluated on benchmark emotional speech datasets, the proposed method achieves higher accuracy than traditional MFCC-SVM systems. Combining deep learning with an SVM yields better generalization and robustness, making the approach suitable for practical applications such as virtual assistants, sentiment analysis, and medical diagnostic systems. Experimental findings show that the model identifies emotional patterns efficiently, which supports its use in advanced SER applications.
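
The MFCC front end named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract states that Librosa performs the feature extraction, whereas this sketch re-implements the usual MFCC steps (framing, power spectrum, mel filterbank, log, DCT) in plain NumPy so that each stage is visible. All function names and parameter values (13 coefficients, 512-sample frames, 26 mel bands) are assumptions chosen for illustration, and the mean/standard-deviation pooling in `utterance_vector` is a hypothetical way to obtain a fixed-length input of the kind a baseline MFCC-SVM classifier would need.

```python
import numpy as np

def hz_to_mel(hz):
    # Common (HTK-style) mel-scale formula.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centres evenly spaced on the mel scale, 0..sr/2.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mfcc(signal, sr, n_mfcc=13, n_fft=512, hop=256, n_mels=26):
    # Parameter defaults are illustrative assumptions, not values from the paper.
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        signal = np.pad(signal, (0, n_fft - len(signal)))
    # 1. Split the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 3. Mel-band energies, then log compression.
    log_mel = np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    # 4. Unnormalized DCT-II over the mel axis; keep the first n_mfcc coefficients.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct_basis.T  # shape: (n_frames, n_mfcc)

def utterance_vector(signal, sr):
    # Hypothetical pooling: per-coefficient mean and standard deviation over time.
    coeffs = mfcc(signal, sr)
    return np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)  # synthetic stand-in for a speech clip
    print(mfcc(tone, sr).shape)
    print(utterance_vector(tone, sr).shape)
```

In the pipeline the abstract describes, the frame-by-coefficient matrix returned by `mfcc` would be the input to the CNN, and the SVM would classify the CNN's learned features rather than the pooled vector shown here.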

Published

2025-05-19

How to Cite

1.
Babu G, Grace W, Rao TS, Rajkumar V, Ramaraju S. Speech Emotion Recognition Using MFCC and SVM Classification. J Neonatal Surg [Internet]. 2025 May 19 [cited 2025 Oct 22];14(24S):943-50. Available from: https://mail.jneonatalsurg.com/index.php/jns/article/view/6089