
Original scientific paper

https://doi.org/10.1080/00051144.2024.2371249

Data augmentation using a 1D-CNN model with MFCC/MFMC features for speech emotion recognition

Thomas Mary Little Flower ; Department of ECE, St.Xavier’s Catholic College of Engineering, Chunkankadai, India *
Thirasama Jaya ; Department of ECE, Saveetha College of Engineering, Chennai, India
Sreedharan Christopher Ezhil Singh ; Department of Mechanical Engineering, Vimal Jyothi Engineering College, Kannur, India

* Corresponding author.


Full text: English pdf 4,346 Kb

pp. 1325-1338



Abstract

Speech emotion recognition (SER) is attractive in several domains, such as automated translation,
call centres, intelligent healthcare, and human–computer interaction. Deep learning models for
emotion identification need considerable labelled data, which is not always available in the
SER field. A database needs enough speech samples, good features, and a better classifier to
identify emotions efficiently. This study uses data augmentation to enhance the amount of input
voice samples and address the data shortage issue. The augmentation adds white noise to the
speech signals, which increases the size of the database. In this work, the Mel-frequency Cepstral
Coefficient (MFCC) and Mel-frequency Magnitude Coefficient (MFMC) features, along with a
one-dimensional convolutional neural network (1D-CNN), are used to classify speech emotions. The
datasets used to evaluate the model’s performance were AESDD, CAFE, EmoDB, IEMOCAP, and
MESD. The data augmentation with the 1D-CNN (MFMC) model performed best, with an average
accuracy of 99.2% for AESDD, 99.5% for CAFE, 97.5% for EmoDB, 92.4% for IEMOCAP and 96.9%
for the MESD database. The proposed 1D-CNN (MFMC) with data augmentation outperforms the
1D-CNN (MFCC) without data augmentation in emotion recognition.
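
The white-noise augmentation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the noise level or how many noisy copies are generated, so the function names `add_white_noise` and `augment`, the 20 dB signal-to-noise ratio, and the choice of one noisy copy per utterance are all assumptions made here for illustration.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB SNR.

    Hypothetical helper: the SNR-based scaling is an assumption, not a
    detail taken from the paper.
    """
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # Noise power such that 10 * log10(signal_power / noise_power) == snr_db.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def augment(dataset, snr_db=20.0, seed=0):
    """Return the original (signal, label) pairs plus one noisy copy of each.

    The label is unchanged because added noise does not alter the emotion.
    """
    noisy = [
        (add_white_noise(sig, snr_db, seed + i), label)
        for i, (sig, label) in enumerate(dataset)
    ]
    return list(dataset) + noisy
```

Under these assumptions, calling `augment` on a list of `(samples, emotion_label)` pairs doubles the number of training examples; features such as MFCC or MFMC would then be extracted from both the clean and the noisy signals.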

Keywords

Neural networks; affective computing; emotion recognition; audio database; accuracy

Hrčak ID:

326329

URI

https://hrcak.srce.hr/326329

Publication date:

3.7.2024.
