
Original scientific paper

https://doi.org/10.32985/ijeces.16.3.3

A Deep Learning Framework with Optimizations for Facial Expression and Emotion Recognition from Videos

Ranjit Kumar Nukathati ; Department of Computer Science and Engineering, JNTUA, Anantapur, Andhra Pradesh, India *
Uday Bhaskar Nagella ; Government College (A), Anantapur, Andhra Pradesh, India
AP Siva Kumar ; Department of Computer Science and Engineering, JNTUA, Anantapur, Andhra Pradesh, India

* Corresponding author.



page 217-229



Abstract

Human emotion recognition has many real-time applications in the healthcare and psychology domains. Due to the widespread use of smartphones, large volumes of video content are being produced. A video contains both audio and video frames in the form of images. With the advancements in Artificial Intelligence (AI), the development of computer vision applications has improved significantly. Accurately recognizing human emotions from audio-visual content remains a very challenging problem. However, improvements in deep learning techniques make it possible to analyze audio-visual content for emotion recognition. Existing deep learning methods have focused on either audio content or video frames for emotion recognition. An integrated approach that handles both audio and video frames in a single framework is needed to improve efficiency. This paper proposes a deep learning framework with specific optimizations for facial expression and emotion recognition from videos. We propose an algorithm, Learning-based Human Emotion Recognition (LbHER), which exploits hybrid deep learning models that process audio and video frames for emotion recognition. Our empirical study with a benchmark dataset, IEMOCAP, shows that the proposed framework and the underlying algorithm achieve human emotion recognition performance competitive with the state of the art. In our experiments, the proposed algorithm outperformed many existing models, with the highest average accuracy of 94.66%. Our framework can be integrated into existing computer vision applications to recognize emotions from videos automatically.

Keywords

Emotion Recognition; Spatial Expression Analysis; Deep Learning; Artificial Intelligence; Hyperparameter Tuning;

Hrčak ID: 329277

URI: https://hrcak.srce.hr/329277

Publication date: 17.3.2025.
