Show simple item record

dc.contributor.advisorDíaz Cabrera, Gloría Mercedes
dc.contributor.advisorCaicedo Rueda, Juan Carlos
dc.contributor.authorFonnegra Tarazona, Rubén Darío
dc.format.mediumRecurso electrónicospa
dc.publisherInstituto Tecnológico Metropolitanospa
dc.subjectProcesamiento de señalesspa
dc.subjectSistemas hombre-máquinaspa
dc.subjectAnálisis espectralspa
dc.subjectSistemas de reconocimiento de configuracionesspa
dc.subjectRedes neuronales (Computadores)spa
dc.titleAutomatic Emotion Recognition From Multimodal Information Fusion Using Deep Learning Approachesspa
dc.publisher.facultyFacultad de ingenieríasspa
dc.publisher.programMaestría en Automatización y Control Industrialspa
dc.subject.keywordsSpectrum analysiseng
dc.subject.keywordsPattern recognition systemseng
dc.subject.keywordsArtificial intelligenceeng
dc.subject.lembANÁLISIS ESPECTRALspa
dc.description.abstractenglishDuring recent years, advances in computational and information systems have contributed to the growth of research areas, including affective computing, which aims to identify the emotional states of humans in order to develop different interaction and computational systems. To this end, emotions have been characterized through specific kinds of data, including audio, facial expressions, and physiological signals, among others. However, the natural response of data to a single emotional event suggests a correlation across modalities when expression reaches a maximum peak. This fact suggests that processing multiple data modalities (multimodal information fusion) could provide more learning patterns for emotion recognition. On the other hand, Deep Learning strategies have gained interest in the research community since 2012, as they are adaptive models that have shown promising results in the analysis of many kinds of data, such as images, signals, and other temporal data. This work aims to determine whether information fusion using Deep Neural Network architectures improves the recognition of emotions in comparison with unimodal models. Thus, a new information fusion model based on Deep Neural Network architectures is proposed to recognize emotional states from audio-visual information. The proposal takes advantage of the adaptiveness of Deep Learning models to extract deep features according to the input data type. The proposed approach was developed in three stages. In the first stage, characterization and preprocessing algorithms (INTERSPEECH 2010 Paralinguistic Challenge features for audio and Viola-Jones face detection for video) were used for dimensionality reduction and extraction of the main information from raw data. Then, two models based on unimodal analysis were developed for processing audio and video separately.
These models were used to develop two information fusion strategies: a decision fusion model and a characteristic (feature) fusion model, respectively. All models were evaluated on the eNTERFACE database, a well-known public audiovisual emotional dataset, which allows comparing results with state-of-the-art methods. Experimental results showed that Deep Learning approaches fusing the audio and visual information outperform the unimodal strategies.eng
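The abstract contrasts two fusion strategies built on top of the unimodal branches: decision fusion (combine each branch's classifier output) and characteristic/feature fusion (concatenate the branches' deep features and classify once). A minimal sketch of the distinction, with toy random stand-ins for the trained unimodal networks (all variable names and dimensions here are hypothetical, not taken from the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Stable softmax turning scores into a probability vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy "deep feature" vectors, standing in for the embeddings produced by
# the audio branch (fed INTERSPEECH 2010 paralinguistic features) and the
# video branch (fed Viola-Jones face crops). Dimensions are illustrative.
audio_features = rng.normal(size=64)
video_features = rng.normal(size=128)

n_emotions = 6  # eNTERFACE covers six basic emotions

# Characteristic (feature) fusion: concatenate both embeddings, then a
# single shared classifier head maps the joint vector to emotion scores.
W_fused = rng.normal(size=(n_emotions, 64 + 128))
fused = np.concatenate([audio_features, video_features])
p_feature_fusion = softmax(W_fused @ fused)

# Decision fusion: each modality keeps its own classifier head, and the
# per-modality probability vectors are combined afterwards (here averaged).
W_audio = rng.normal(size=(n_emotions, 64))
W_video = rng.normal(size=(n_emotions, 128))
p_decision_fusion = (softmax(W_audio @ audio_features)
                     + softmax(W_video @ video_features)) / 2

print(p_feature_fusion.argmax(), p_decision_fusion.argmax())
```

In a real Deep Learning pipeline the classifier heads would be trained layers rather than random matrices, but the structural difference is the same: feature fusion merges information before classification, decision fusion merges after.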
dc.description.degreenameMagister en Automatización y Controlspa
dc.identifier.instnameinstname:Instituto Tecnológico Metropolitanospa
dc.identifier.reponamereponame:Repositorio Institucional Instituto Tecnológico Metropolitanospa
dc.rights.localAcceso abiertospa
dc.rights.creativecommonsAttribution-NonCommercial-NoDerivatives 4.0 International*
dc.relation.citationissueRevista CEA
dc.type.localTesis/Trabajo de grado - Monografía - Maestríaspa
