Week 4 2014
Thu 23 Jan 09:00-10:00 Lecture
Spring term 2014
Lecture. Teacher: Jens Edlund
Location: Fantum

Course overview: Human-computer interaction and speech

Content: The main topics of the course are described. Introduction to human-computer speech interaction: Why use speech? What applications are there? What components are needed in a human-computer speech interface? Why is it so difficult for computers, when it is so easy for humans? Introduction to human speech production: Measurement methods. 
Literature: CST 1.1, 3.2--3.3, 4.1, 6.1--6.3, 7.1--7.2

Thu 23 Jan 12:00-14:00 Lecture
Spring term 2014
Lecture. Teacher: David House
Location: Fantum

Speech production by humans 1: How is it done?

Content: The vocal tract, articulatory and acoustic phonetics, prosody, the phonetic alphabet 
Literature: CST Ch 1 

Thu 23 Jan 14:00-15:00 Exercise
Spring term 2014
Exercise. Teacher: David House
Location: Fantum

Speech production by humans 2: The basics of phonetics
Content: Exercises on phonetic transcription, articulatory phonetics, the vowel space.

 

Week 5 2014
Thu 30 Jan 10:00-12:00 Lecture
Spring term 2014
Lecture. Teacher: David House
Location: Fantum

Speech recognition by humans
Content: The properties of human speech perception are described, e.g., critical bands, masking, categorical perception, perception of vowels and consonants, psycholinguistics. 
The lecture also describes the processing of speech in the brain, theories of language understanding and the structure of language. 
Literature: CST Ch 1.2.4, 1.3, 1.4 
Lieberman P & Blumstein S. "Speech physiology, speech perception, and acoustic phonetics". Cambridge University Press. Cambridge. 1988. pp. 145-149, 152-161

Thu 30 Jan 13:00-15:00 Lecture
Spring term 2014
Lecture. Teacher: Giampiero Salvi
Location: Fantum

Speech recognition by computers 
Content: Issues in speech recognition, e.g. variability and microphones. Feature extraction. Methods: neural networks, template matching, dynamic programming, Hidden Markov Models, Viterbi search. Evaluation: error measures, types of errors, causes of errors, reducing errors.

Slides

Literature: CST Ch 3.1 -- 3.6, 3.7 -- 3.14, 5.1, 8.1

Thu 30 Jan 15:00-17:00 Lecture
Spring term 2014
Lecture. Teacher: Giampiero Salvi
Location: Fantum

Speech recognition by computers (continued)
Content: Issues in speech recognition, e.g. variability and microphones. Feature extraction. Methods: neural networks, template matching, dynamic programming, Hidden Markov Models, Viterbi search. Evaluation: error measures, types of errors, causes of errors, reducing errors.
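
Two of the topics listed for this lecture, dynamic programming and error measures, meet in the word error rate (WER), a commonly used recognition score: the number of substituted, deleted and inserted words divided by the number of words in the reference transcript. The sketch below is only an illustration and is not taken from the course material; the function names `align_counts` and `word_error_rate` and the whitespace tokenization are assumptions made for this example.

```python
from typing import List, Tuple


def align_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int, int]:
    """Return (total_errors, substitutions, deletions, insertions).

    Dynamic programming over a (len(ref)+1) x (len(hyp)+1) table, where each
    cell holds the cheapest way to align ref[:i] with hyp[:j].
    """
    rows, cols = len(ref) + 1, len(hyp) + 1
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]

    # Aligning against an empty hypothesis: every reference word is deleted.
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)
    # Aligning against an empty reference: every hypothesis word is inserted.
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)

    for i in range(1, rows):
        for j in range(1, cols):
            cost, subs, dels, ins = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (cost, subs, dels, ins)          # correct word
            else:
                diagonal = (cost + 1, subs + 1, dels, ins)  # substitution
            cost, subs, dels, ins = table[i - 1][j]
            deletion = (cost + 1, subs, dels + 1, ins)
            cost, subs, dels, ins = table[i][j - 1]
            insertion = (cost + 1, subs, dels, ins + 1)
            # Keep the candidate with the lowest total error count.
            table[i][j] = min(diagonal, deletion, insertion,
                              key=lambda cell: cell[0])
    return table[-1][-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    errors, _, _, _ = align_counts(ref, hyp)
    return errors / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", the alignment contains one substitution and one deletion against six reference words, so `word_error_rate` returns 2/6. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.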

Thu 30 Jan 17:00-18:00 Lecture
Spring term 2014
Lecture. Teacher: Giampiero Salvi
Location: Fantum

Introduction to Lab 1: Speech recognition
Building a speech recognizer in HTK.

Week 6 2014
Thu 6 Feb 10:00-12:00 Lecture
Spring term 2014
Lecture. Teacher: Olov Engwall
Location: Fantum

Speech synthesis
Content: The history of speech synthesis, parametric speech synthesis, formant synthesis. Concatenative speech synthesis: diphone synthesis, unit selection of allophones, HMM synthesis.
Text-to-speech synthesis, rules and dictionaries, morphological analysis, prosody, concept-to-speech, speaker variability: gender, dialects, emotions, evaluation of speech synthesis. 
Literature: CST Ch 4.1 -- 4.8

Lecture notes

Thu 6 Feb 13:00-15:00 Lecture
Spring term 2014
Lecture. Teacher: Jonas Beskow
Location: Fantum

Visual speech synthesis & its applications

Content: Lip synchronization, visemes, talking heads, multimodal speech synthesis, direct mapping of sound to mouth shapes, non-verbal signals, virtual language teachers and lip-reading support. 
Literature: CST Ch 5.4, 6.4.5, 7.5.1--7.5.2, 8.3 and
Granström & House (2007). Modelling and evaluating verbal and non-verbal communication in talking animated interface agents.

Thu 6 Feb 15:00-17:00 Lecture
Spring term 2014
Lecture. Teacher: Jens Edlund
Location: Fantum

Synthesis evaluation

Lecture slides

Thu 6 Feb 17:00-18:00 Lecture
Spring term 2014
Lecture. Teacher: Joakim Gustafsson
Location: Fantum

Introduction to lab 2: Speech synthesis

Lab 2 has two parts: one deals with synthesis creation and the other with evaluation.

In the first part, you will build a very small diphone synthesizer:

Lab2_part1 description
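
As a rough illustration of the idea behind diphone synthesis (not the procedure used in the lab), a phoneme sequence can be broken into overlapping pairs, each naming a recorded unit that runs from the middle of one phone to the middle of the next. The function name `to_diphones`, the `_` silence symbol and the hyphenated unit names are assumptions made for this sketch.

```python
from typing import List


def to_diphones(phonemes: List[str], silence: str = "_") -> List[str]:
    """Map a phoneme sequence to the diphone unit names needed to speak it.

    The utterance is padded with silence on both sides so that the first and
    last phones also get a transition unit.
    """
    padded = [silence, *phonemes, silence]
    # Pair each symbol with its successor: (a, b), (b, c), ...
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]
```

With this naming, `to_diphones(["h", "e", "j"])` yields the four units `_-h`, `h-e`, `e-j` and `j-_`. A synthesizer would then look up the waveform recorded for each unit and concatenate them, which is why an utterance of n phones needs n + 1 diphones.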

In the second part, you will evaluate two types of speech synthesis using two different methods:

(Lab2_part2 description pending) 

Week 7 2014
Thu 13 Feb 10:00-12:00 Lecture
Spring term 2014
Lecture. Teacher: Joakim Gustafsson
Location: Fantum

Spoken dialogue systems

Content: Components and functionality of spoken dialogue systems. Issues such as initiative management, error handling, turn-taking, barge-in.
Evaluating dialogue systems: performance measures, Wizard-of-Oz, influencing user behaviour.
Literature: CST Ch 6.4.3, 6.4.4, 7.4, 7.5.5--7.5.6 + McTear (2002) 

Dialogue lecture notes

Thu 13 Feb 13:00-14:00 Lecture
Spring term 2014
Lecture. Teacher: Gabriel Skantze
Location: Fantum

Introduction to Lab 3: Dialogue systems

Content: Introduction to VoiceXML and lab 3.

Instructions for the lab: http://www.speech.kth.se/~gabriel/voicexml/

Thu 13 Feb 14:00-16:00 Lecture: data collection and experimentation
Spring term 2014
Lecture. Teacher: Jens Edlund
Location: Fantum

Thu 13 Feb 16:00-18:00 Lecture
Spring term 2014
Lecture. Teacher: Jens Edlund
Location: Fantum

Home assignments and projects
Introduction to the project work and all home assignments.