Internal Projects

Internal Master Degree Projects at Speech, Music and Hearing (TMH)

The Division of Speech, Music and Hearing hosts a number of Master's students each year, who carry out research-related projects over a period of six months. The students get office space at the department. Below, we list topics for open project positions, both at KTH and at other places with which we collaborate. If you do not find a project that suits you, contact your favourite TMH faculty member to tailor a project. We welcome students' own ideas for their Bachelor's and Master's thesis projects, especially within the department's research areas.

Reinforcement learning of speech articulation

When children learn to speak, they do so through imitation, by comparing the sounds they can make through their own articulations to what they hear from their environment, and gradually refining articulatory strategies.

In speech science, articulatory models have been developed that can produce the sounds corresponding to a given articulation (see https://dood.al/pinktrombone/ for a fun example). Controlling such models so that they output intelligible speech has, however, turned out to be a difficult problem.

The goal of the proposed project is to explore novel machine learning techniques for learning to control an articulatory model. Specifically, reinforcement learning and imitation learning will be explored. In the generative adversarial imitation learning setting, the learning agent interacts with an environment in which the true reward function is not given. Instead, the reward comes from the discriminator of a generative adversarial network (GAN), which is trained alongside the policy network. Such techniques have previously been applied with great success to the animation of human characters, where, for example, a system is trained to achieve a goal (e.g. kicking a ball) while exhibiting motion similar to example data from human recordings.
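
As a minimal sketch of the idea described above, the following Python example shows how a discriminator's output can be turned into a reward signal. Everything in it is an illustrative assumption rather than part of the project: the logistic-regression discriminator, the two-dimensional stand-in features, the function names, and the reward form -log(1 - D), which is one common choice. The policy update, which a full system would perform with a reinforcement learning algorithm, is left out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_prob(w, b, x):
    """Probability that the features x come from the expert (recorded) data."""
    return sigmoid(x @ w + b)

def imitation_reward(w, b, x, eps=1e-8):
    """Surrogate reward: high when the discriminator takes the agent for the expert."""
    return -np.log(1.0 - discriminator_prob(w, b, x) + eps)

def discriminator_step(w, b, expert_x, agent_x, lr=0.1):
    """One gradient-descent step on the binary cross-entropy (expert=1, agent=0)."""
    x = np.vstack([expert_x, agent_x])
    y = np.concatenate([np.ones(len(expert_x)), np.zeros(len(agent_x))])
    err = discriminator_prob(w, b, x) - y  # gradient of the loss w.r.t. the logit
    return w - lr * x.T @ err / len(y), b - lr * err.mean()

# Hypothetical stand-in data: in the project these would be features of
# recorded speech (expert) and of the articulatory model's output (agent).
rng = np.random.default_rng(0)
expert_x = rng.normal(loc=1.0, size=(200, 2))
agent_x = rng.normal(loc=-1.0, size=(200, 2))

w, b = np.zeros(2), 0.0
for _ in range(200):
    w, b = discriminator_step(w, b, expert_x, agent_x)

# Output that resembles the expert data earns a higher reward.
reward_expert_like = imitation_reward(w, b, np.array([[1.0, 1.0]]))[0]
reward_agent_like = imitation_reward(w, b, np.array([[-1.0, -1.0]]))[0]
```

In a complete training loop, the discriminator step and a policy update that maximises this reward would alternate, so that the agent's output gradually becomes harder to tell apart from the example data.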

Suitable student background: machine learning and deep learning; experience with speech/audio/signal processing and reinforcement learning is a plus.

Contact: Jonas Beskow (beskow@kth.se) and Anna Deichler (deichler@kth.se).

Conversational Systems and Human-Robot Interaction

In Gabriel Skantze's research group, we have a number of projects related to Conversational Systems and Human-Robot Interaction where we welcome MSc Thesis students:

- Modelling turn-taking in conversational systems
- Social robots as hosts on self-driving buses
- Social robots for language learning
- Visual grounding of language in dialogue
- Social robots as virtual patients
- Understanding engagement in robot-human presentations

For more information about these projects, see this link.

Contact:
Gabriel Skantze (skantze@kth.se)

-------------------------------

Voice Science and Technical Vocology

Measuring, simulating or synthesizing the human voice has many applications in medicine, pedagogy, media and the arts. Many problems in this area remain unsolved, and the voice is something that is close to us all.

Prerequisites:

- Some knowledge of analog and digital audio
- Good programming skills

More info at this link.

-------------------------------------------------------

Toward the next generation of collaborative embodied artificial intelligence

We are working towards creating an autonomous robot that can interact socially with other players and competently play the tabletop game Pandemic. We will explore physical setups with robots, as well as virtual reality and augmented reality environments. This project requires cross-disciplinary collaboration between people with different types of skills. Therefore, we are opening multiple positions on this project in the areas of artificial intelligence, human-robot interaction, and extended reality research.

Project 1: From game action to embodied social dialog
Meta recently released their Diplomacy game-playing AI ‘Cicero’, which can generate social conversation informed by game actions. Similarly, in this project, you will create a conversation model based on the game actions needed to play a different game (Pandemic). However, social dialog acts in humans are accompanied by embodied behaviors such as body gestures and facial expressions. In this project, you will therefore not only go from game actions to social dialog, but also enrich the dialog with embodied information that allows an AI to be more competent in the physical world. Using the game Pandemic as a research tool, you will focus on machine learning techniques for natural language and perform user studies that compare embodied social agents with facial expressions to embodied agents with more neutral facial expressions.

Project 2: XR - Virtual and Augmented Reality Robots
In this project, you will compare a physical environment where people play Pandemic with a physical robot against different eXtended Reality (XR) setups that replace the robot with a VR and/or AR version of it. The challenge in this task will be to translate the physical experience into a digital one while maximizing immersion between humans and the virtual/physical robot. In the end, you will conduct a human-robot interaction experiment that compares enjoyment in playing tabletop games with AR/VR agents and with physical robots.

Project X: Discuss additional projects with us on the same topic
Feel free to contact us if you want to work on a different variation of the projects above or have a different idea on the same or similar topic.

Contact: André Pereira (atap@kth.se) and Jura Miniotaite (jura@kth.se).

--------------------------------------------------------

Social Robotics

Several proposals are currently available; see the page of Ronald Cumbal.

--------------------------------------------------------

Generative Machine Learning

Several projects are available to strong students interested in generative deep learning, especially with applications to audio, 3D animation, images, and VR.

Please see Gustav Eje Henter's thesis project suggestions for more information (only visible if logged in).

--------------------------------------------------------

Human-Robot Interaction

Topic:
In contingent interpersonal interactions, we create a neural sense of grounding when the quality, intensity, and timing of others’ signals clearly reflect the signals that we have sent. In HRI, we operationalise contingency as a correlation between robot behaviour and changes in its environment.
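
To make the notion of contingency as a correlation concrete, here is a minimal sketch in Python. It is only one possible reading of the definition above: the function name, the use of Pearson correlation, the fixed time lag, and the binary example data are all assumptions made for illustration, not the measure used in the project.

```python
import numpy as np

def contingency_score(robot_behaviour, environment_change, lag=1):
    """Pearson correlation between robot behaviour at time t and
    environment change at time t + lag (lag in samples, lag >= 0).
    Both sequences are assumed to have the same length."""
    robot = np.asarray(robot_behaviour, dtype=float)
    change = np.asarray(environment_change, dtype=float)
    n = len(robot) - lag
    return float(np.corrcoef(robot[:n], change[lag:lag + n])[0, 1])

# Hypothetical data: 1 marks a robot action (e.g. a nod) or a detected user reaction.
robot_actions = [0, 1, 0, 0, 1, 0, 1, 0]
user_reactions = [0, 0, 1, 0, 0, 1, 0, 1]  # each reaction follows an action by one step

score_lagged = contingency_score(robot_actions, user_reactions, lag=1)
score_unlagged = contingency_score(robot_actions, user_reactions, lag=0)
```

In this toy example the reactions line up with the robot's actions only once the one-step delay is taken into account, so the lagged score is the higher of the two. With real sensor data, the choice of signals and of the lag would itself be part of the investigation.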

Given a set of social actions, it is important for a robot to know what is appropriate to do while in dialogue with humans. In this Master's thesis project, you will investigate quantitative and qualitative indicators to assess human reactions in human-robot dialogue. You will design the interaction and a task-oriented dialogue, and explore objective and subjective measures from human users. Further, you will experiment with sensor data and build a machine learning classifier to interpret which features from human users contribute to the understanding of robot actions.

You will experiment with open-source platforms such as OpenFace and openSMILE and one of our robotic platforms (Furhat or Nao) to build an application that combines multimodal signals and generates appropriate robot responses.

Required skills:
- Knowledge of human-computer interaction
- Good programming skills in Python
- Knowledge of machine learning is a plus (equivalent to the KTH machine learning course)

Contact:
Dimos Kontogiorgos (diko@kth.se) or Joakim Gustafson (jocke@speech.kth.se)