Area 2: Human Communication and Behavior
In this area, we develop models of how humans perceive and produce non-verbal communication. These models can be used both to understand the mechanisms underlying human communication and behavior and to design systems that rely on such understanding, e.g. for computerized analysis of cognitive decline.
People
Research Engineers
MSc Students
- NN
- Magnus Ruben Tibbe (MSc 2024)
- Fanxuan Liu (MSc 2024)
- Ioannis Athanasiadis (MSc 2022)
- Frans Nordén (MSc 2021)
- Olga Mikheeva (MSc 2017)
PhD Students
- Yifan Lu
- Chen Ling
- Olga Mikheeva (2017-2022, now at King, Sweden)
- Taras Kucherenko (PhD 2021, now at Electronic Arts, Sweden)
- Judith Bütepage (co-supervisor, PhD 2019, now at Electronic Arts, Sweden)
- Kalin Stefanov (co-supervisor, PhD 2018, now at Monash University, Australia)
Post Docs
- Henglin Shi
- Ruibo Tu (2023-2024, now at Qlik, Sweden)
- Yanxia Zhang (2016, now at Toyota Research Institute, USA)
Collaborators
- Jonas Beskow
- Gustav Eje Henter
- Joakim Gustafson
- Johanna Björklund (Umeå University, Sweden)
- Judith Bütepage (Electronic Arts, Sweden)
- Johan Lundström (Karolinska Institutet, Sweden)
- Gustaf Mårtensson (Mycronic and Karolinska Institutet, Sweden)
- Ulrika Ådén (Karolinska Institutet, Sweden)
Current Projects
Evaluation of generative models (WASP 2024-present)
This project is part of the WARA Media and Language and a collaboration with the company Electronic Arts. Generative models will revolutionize many industries and professions, with applications like programming assistants already in use. This raises a need for reliable and automated metrics that measure, for example, method robustness and appropriateness. Understanding quality is particularly crucial in domains less intuitive to the average user than images and text, which might require expert evaluation of each generated sample. Currently, only a few automated metrics exist, and their correlation with human judgment is debatable.
The relation between motion and cognition in infants (SeRC 2023-present)
In this project, which is part of the SeRC Data Science MCP and a collaboration with the Department of Women’s and Children’s Health at Karolinska Institutet, we study the relation between motion patterns and cognition and brain function in infants. The current primary application is detection of motor conditions in neonates, but we will also study more general connections between motion and future development of cognition and language.
UNCOCO: UNCOnscious COmmunication (WASP 2023-present)
This project, which is part of the WARA Media and Language and a collaboration with the Perceptual Neuroscience group at KI, entails two contributions. First, we develop a 3D embodied, integrated representation of head pose, gaze and facial micro-expressions that can be extracted from a regular 60 Hz video camera and a desk-mounted gaze sensor. This representation serves as a preprocessing step for the second contribution, a deep generative model for inferring the latent emotional state of a human from their non-verbal communicative behavior. The model is employed in three different contexts: 1) estimating user affect for a digital avatar, 2) analyzing human non-verbal behavior connected to sensory stimuli, e.g., quantifying the approach/avoidance motor response to smell, and 3) estimating frustration in a driving scenario.
STING: Synthesis and analysis with Transducers and Invertible Neural Generators (WASP 2022-present)
Human communication is multimodal in nature, and occurs through combinations of speech, ... The STING NEST, part of the WARA Media and Language, intends to change this state of affairs by uniting synthesis and analysis with transducers and invertible neural models. This involves connecting concrete, continuous-valued sensory data such as images, sound, and motion with high-level, predominantly discrete representations of meaning. Doing so has the potential to endow synthesis output with human-understandable high-level explanations, while simultaneously improving the ability to attach probabilities to semantic representations. The bidirectionality also allows us to create efficient mechanisms for explainability, and to inspect and enforce fairness in the models.
Past Projects
EACare: Embodied Agent to support elderly mental wellbeing (SSF, 2016-2021)
The main goal of the multidisciplinary project EACare is to develop an embodied agent (a robot head with communicative skills) capable of interacting with people, especially the elderly, at a clinic or in their home. The agent analyzes their mental and psychological status via powerful audiovisual sensing and assesses their mental abilities in order to identify subjects at high risk of, or possibly in the first stages of, cognitive decline, with a special focus on Alzheimer’s disease. The interaction is performed according to the procedures developed for memory evaluation sessions, the key part of the diagnostic process for detecting cognitive decline.
Data-driven modelling of interaction skills for social robots (KTH ICT-TNG 2016-2018)
This project aims to investigate fundamentals of situated and collaborative multi-party interaction and to collect the data and knowledge required to build social robots that are able to handle collaborative attention and co-present interaction. In the project we will employ state-of-the-art motion and gaze tracking on a large scale as the basis for modelling and implementing critical non-verbal behaviours such as joint attention, mutual gaze and backchannels in situated human-robot collaborative interaction, in a fluent, adaptive and context-sensitive way.
HumanAct: Visual and multi-modal learning of Human Activity and interaction with the surrounding scene (VR, EIT ICT Labs 2010-2013)
The overwhelming majority of human activities are interactive in the sense that they relate to the world around the human (in Computer Vision called the "scene"). Despite this, visual analyses of human activity very rarely take scene context into account. The objective in this project is modeling of human activity with object and scene context. The methods developed within the project will be applied to the task of Learning from Demonstration, where a (household) robot learns how to perform a task (e.g. preparing a dish) by watching a human perform the same task.