Exploring Speech Emotion Recognition in Tribal Language with Deep Learning Techniques

Kumar Nayak, Subrat; Kumar Nayak, Ajit; Mishra, Smitaprava; Mohanty, Prithviraj; Tripathy, Nrusingha; Surjeet Chaudhury, Kumar

doi:10.32985/ijeces.16.1.6

International Journal of Electrical and Computer Engineering Systems, Vol. 16 No. 1, 2025.

Original scientific paper

Subrat Kumar Nayak orcid.org/0000-0002-7438-9085 ; Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India *
Ajit Kumar Nayak orcid.org/0000-0003-2302-9458 ; Department of Computer Science and Information Technology, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
Smitaprava Mishra ; Department of Computer Science and Information Technology, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
Prithviraj Mohanty ; Department of Computer Science and Information Technology, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
Nrusingha Tripathy orcid.org/0000-0002-0272-7479 ; Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
Kumar Surjeet Chaudhury orcid.org/0009-0005-4607-3488 ; Department of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India

* Corresponding author.

Pages 53-64

Abstract

Emotion is fundamental to interpersonal interaction because it supports mutual understanding, and the development of human-computer interaction and related digital products depends heavily on emotion recognition. Deep learning models for speech emotion recognition are therefore an essential area of research. Most speech emotion recognition systems have been developed only for European and a few Asian languages, and no dataset is available for a low-resource tribal language such as KUI. We therefore created a KUI speech dataset and applied augmentation techniques to increase its size. The dataset was created using a studio platform to obtain better-quality speech data, and the recordings are labeled with six perceived emotions: ସଡାଙ୍ଗି (angry), ରେହା (happy), ଆଜି (fear), ବିକାଲି (sad), ବିଜାରି (disgust), and ଡ଼େକ୍ (surprise). Mel-frequency cepstral coefficients (MFCCs) are used for feature extraction. As an alternative to traditional methods, this study uses a hybrid architecture of Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNNs) for classification, and compares the results obtained with and without augmentation of the dataset. Compared with existing benchmark models, the proposed hybrid model achieved an accuracy of 96% without augmentation and 97% with augmentation.
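The abstract names MFCCs as the feature-extraction step without giving implementation details. The following is a minimal illustrative sketch of how MFCCs can be computed for a single audio frame, using only the Python standard library. It is not the authors' code: the function names, the Hamming window, the 26 mel filters, the 13 coefficients, and the 16 kHz sample rate are all assumptions chosen for illustration.

```python
import math

def hz_to_mel(hz):
    # Common mel-scale formula (one of several conventions).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    mel_points = [i * max_mel / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    n_bins = n_fft // 2 + 1
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * n_bins
        for k in range(left, center):      # rising slope
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope, peak at center
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, n_filters=26, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies so the logarithm is always defined.
    log_energies = [math.log(max(sum(f * p for f, p in zip(filt, spectrum)), 1e-10))
                    for filt in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]

# Usage: one 256-sample frame of a synthetic 440 Hz tone at an assumed 16 kHz rate.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

In practice, a recording would be split into overlapping frames and the per-frame coefficient vectors stacked into a sequence, which is the kind of input a CNN-LSTM classifier can consume. A library implementation with a fast Fourier transform would normally replace the naive DFT used here.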

Keywords

KUI Dataset; Speech Emotion Recognition; Deep Learning; Long Short-Term Memory; Data Augmentation

Hrčak ID:

326075

URI

https://hrcak.srce.hr/326075

Publication date:

2 January 2025
