Multimodal Recognition of Socio-emotional Signals
Facial expressions, head gestures, and prosodic information are important non-verbal cues for intelligent systems. Interpreting these cues enables such a system to gain information about the mental state of the user and the quality of an interaction.
As a basis for interpreting a human face, we use facial point extraction methods. By tracking these points over time, we can compute relevant features for classifying the human's internal state or facial communicative signals. In contrast to many other groups, we do not aim to recognize the seven basic emotions of Ekman but concentrate on socio-emotional signals such as smiling, agreement, or confusion.
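The idea of turning tracked facial points into classifier features can be sketched as follows. This is a minimal illustration assuming numpy and an external landmark tracker that supplies per-frame point coordinates; the function name and feature choice are hypothetical, not the group's actual pipeline:

```python
import numpy as np

def motion_features(tracks):
    """Simple motion features from tracked facial points.

    tracks: array-like of shape (T, N, 2) -- N 2-D facial points over T frames,
    as delivered by some facial point tracker (assumed, not shown here).
    Returns per-point mean displacement and overall motion energy, which
    could feed a classifier for signals such as smiling or head nods.
    """
    tracks = np.asarray(tracks, dtype=float)
    # Frame-to-frame displacement of every tracked point
    deltas = np.diff(tracks, axis=0)            # shape (T-1, N, 2)
    speeds = np.linalg.norm(deltas, axis=2)     # shape (T-1, N)
    mean_speed = speeds.mean(axis=0)            # per-point average motion
    motion_energy = float(speeds.sum())         # total movement in the clip
    return mean_speed, motion_energy
```

A nodding head, for instance, would show up as large vertical displacements of the tracked points, while a static neutral face yields near-zero motion energy.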
In addition to visual cues, we analyse prosody. We extract features such as pitch, energy, and MFCCs from the speech signal to detect socio-emotional signals and to recognise user states such as hesitation and uncertainty. Further cues about the user state can be gained from filled-pause analysis. We combine these prosodic features with the visual cues to obtain better classification results for the mental state of the user.
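A toy sketch of this two-step idea, assuming numpy: framewise prosodic features (here just energy and zero-crossing rate as crude stand-ins for the pitch/energy/MFCC features named above) and a simple weighted late fusion of per-modality classifier scores. Function names, frame length, and fusion weight are illustrative assumptions:

```python
import numpy as np

def prosodic_features(signal, frame_len=400):
    """Crude per-frame prosodic features from a mono speech signal.

    Energy and zero-crossing rate stand in here for the richer
    pitch / energy / MFCC features used in practice.
    Returns an array of shape (n_frames, 2).
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)                       # loudness proxy
    zcr = (np.abs(np.diff(np.sign(frames), axis=1)) > 0).mean(axis=1)
    return np.column_stack([energy, zcr])

def late_fusion(visual_score, prosody_score, w_visual=0.6):
    """Combine per-modality classifier confidences by weighted averaging.

    One common way to fuse modalities; the weight here is an assumption.
    """
    return w_visual * visual_score + (1.0 - w_visual) * prosody_score
```

Late (decision-level) fusion keeps the per-modality classifiers independent, so the audio branch can still vote when the face is briefly occluded.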
Lang C, Wachsmuth S, Hanheide M, Wersing H (2012). Facial Communicative Signals - Valence Recognition in Task-Oriented Human-Robot Interaction. International Journal of Social Robotics - Special Issue on Measuring Human-Robot Interaction 4(3): 249-262.
Recent Best Paper/Poster Awards
Philippsen A, Reinhart F, Wrede B (2016)
International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob)
Richter V, Carlmeyer B, Lier F, Meyer zu Borgsen S, Kummert F, Wachsmuth S, Wrede B (2016)
International Conference on Human-Agent Interaction (HAI)
Carlmeyer B, Schlangen D, Wrede B (2016)
International Conference on Human-Agent Interaction (HAI)