Please use this identifier to cite or link to this item:
http://localhost:8081/jspui/handle/123456789/19726
| Title: | MULTIMODAL EMOTION ANALYSIS USING DEEP LEARNING TECHNIQUES |
| Authors: | Kumar, Puneet |
| Issue Date: | Oct-2022 |
| Publisher: | IIT Roorkee |
| Abstract: | The need to develop computational systems that can recognize the emotions portrayed in various modalities such as speech, text, and image is rapidly increasing. The experience of emotion, feeling, cognition, and behavioral processes is known as ‘Affect.’ Affective Computing uses three fundamental methods to analyze affect: self-feedback-based analysis, behavior observation, and physiological studies. Affect analysis approaches are further divided into two types: the intangible (directly observable) approach, which uses computer vision, natural language processing, and speech processing techniques, and the tangible (not directly observable but perceptible by touch) approach, which uses sensors and other physiological monitoring tools. This thesis analyses intangibly expressed emotions through behavior observation, predominantly associated with the speech, text, and image modalities. The thesis starts by introducing emotion analysis, its representations, modalities, applications, and the need for multimodal emotion analysis. It then surveys the research on multimodal emotion analysis, affective response generation, explainability, and interpretability. Moreover, techniques for deep neural networks’ explainability and hyperparameter tuning are proposed and used in the subsequent chapters. Toward end-to-end emotion recognition, the thesis develops a speech emotion recognition system using deep neural networks, residual learning, and triplet loss; the emotion-related information is learned from a labeled emotional speech dataset as embeddings and used for emotion recognition. Further, it proposes a novel text emotion recognition system and develops a cross-lingual, translation-based method for Sanskrit text sentiment analysis. A deep-learning-based facial emotion recognition system is proposed, which is further adapted to perform image emotion recognition using domain adaptation. 
The main and adapted models are trained simultaneously using a discrepancy loss, which enables the adapted model to learn the distribution of the image emotion recognition datasets along with that of the facial emotion recognition datasets. To combine complementary information from multiple modalities, insights from unimodal emotion recognition are used for multimodal emotion recognition. Further, a novel interpretable multimodal emotion recognition system is proposed to classify an input containing the image, speech, and text modalities into discrete emotion classes. The proposed system reports the importance of each modality and of its features in the classification of a particular emotion class. Four emotion classes, i.e., ‘happy,’ ‘sad,’ ‘hate,’ and ‘anger,’ have been considered for emotion analysis because these are the classes common to various existing methods and datasets for unimodal and multimodal emotion analysis. Besides recognizing the affects portrayed by multimodal data, emotion analysis also aims to generate affect according to the user’s emotional state. In that context, a novel task has been defined to synthesize contextually relevant feedback as a new modality from two given modalities, i.e., image and text. We have proposed a novel affective feedback synthesis system and compiled a new dataset containing images, text, Twitter user comments, and the number of likes (upvotes) for each comment. Finally, the conclusions of the thesis are presented along with the future scope for multimodal emotion analysis and affective content synthesis. Keywords: Affective computing, Emotion analysis, Deep learning, Speech emotion recognition, Text emotion recognition, Text sentiment analysis, Multimodal information fusion, Facial emotion recognition, Image emotion recognition, Explainability, Interpretability, Feedback synthesis, Dataset construction, Hyperparameter tuning. |
| URI: | http://localhost:8081/jspui/handle/123456789/19726 |
| Research Supervisor/ Guide: | Raman, Balasubramanian |
| metadata.dc.type: | Thesis |
| Appears in Collections: | DOCTORAL THESES (CSE) |
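The abstract above describes learning emotion embeddings from labeled emotional speech with a triplet loss. A minimal sketch of that objective follows; the embedding dimension, margin value, and function names are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that pulls same-emotion embeddings together and pushes
    different-emotion embeddings at least `margin` farther apart.
    anchor/positive share an emotion label; negative has a different one."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # distance to same-class sample
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # distance to other-class sample
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

# toy usage: batches of 8 utterance embeddings of dimension 16
rng = np.random.default_rng(42)
anchor, positive, negative = (rng.normal(size=(8, 16)) for _ in range(3))
loss = triplet_loss(anchor, positive, negative)
```

Once trained this way, nearby embeddings tend to share an emotion label, so the embeddings can feed a downstream classifier, as the abstract indicates.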
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| PUNEET KUMAR 18911007.pdf | | 29.89 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
