DT2119 Speech and Speaker Recognition 7.5 credits

The course objective is to provide a systematic introduction to speech processing and recognition. Models of speech production and speech analysis form a basis for understanding the problem of speech recognition. Probabilistic machine learning methods are employed for the recognition task, including Hidden Markov Models, Gaussian Mixture Models, Support Vector Machines, and Deep Neural Networks.

Information per course offering

Choose semester and course offering to see current information and more about the course, such as course syllabus, study period, and application information.

Semester

Spring 2026, Spring 2027

Information for Spring 2027, start 15 Mar 2027, programme students

Course location: KTH Campus
Duration: 15 Mar 2027 - 31 May 2027
Periods: Spring 2027: P4 (7.5 hp)
Pace of study: 50%
Application code: 11805
Form of study: Normal Daytime
Language of instruction: English
Course memo: Course memo is not published
Number of places: Min: 1
Target group: Open for application to students from year 3 and to students admitted to a master's programme, provided the course can be included in their programme.
Schedule: Schedule is not published
Part of programme: Master's Programme, Computer Science, year 1, CSDA
Master's Programme, Computer Science, year 1, CSCS
Master's Programme, Systems, Control and Robotics, year 1, RASM
Master's Programme, Industrial Engineering and Management, year 1, MAIG
Master's Programme, Interactive Media Technology, year 1
Master's Programme, Machine Learning, year 1
Master's Programme, Interactive Media Technology, year 2

Contact

Examiner

No information inserted

Course coordinator

No information inserted

Teachers

No information inserted

Course syllabus as PDF

Please note: all information from the Course syllabus is available on this page in an accessible format.

Course syllabus DT2119 (Spring 2020–)

Headings with content from the Course syllabus DT2119 (Spring 2020–) are denoted with an asterisk (*)

Content and learning outcomes

Course contents

The course consists of lectures, three laboratory sessions with hand-in assignments, and an essay on a subject chosen in consultation with the teacher. The essay is also presented orally at a final seminar. In the laboratory sessions, students design different parts of a speech recognition application, train the system, and evaluate its performance.

The following theoretical course components are included:

algorithms for training, recognition, and adaptation to the properties of speakers and transmission channels, including pattern recognition, Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs)
methods to decrease the sensitivity to disturbances and deviations
probability theory
signal processing and parameter extraction
acoustic modelling of the static and dynamic spectral properties of speech sounds
statistical modelling of language in spontaneous and formal speech
search strategies - basic methods and strategies for large vocabularies
specific methods for analysis and decision making in speaker recognition.

Furthermore, some practical insights into building an application are given. This includes the implementation of certain functions based on prototypes, and testing them on real speech data.

Intended learning outcomes

Having passed the course, the student shall be able to

implement methods for training and evaluation of speech recognition systems
train and evaluate a speech recognizer, using software tools
compare different methods for feature extraction and training
document and discuss specific aspects related to recognition of speech and of speakers
review and criticise other students' work in the subject, based on the literature.

Literature and preparations

Specific prerequisites

No information inserted

Recommended prerequisites

Some knowledge of machine learning, for example from DD2421, DD2434 or EN2202

Some programming knowledge, preferably in Python

Some knowledge of signal processing

Literature

You can find information about course literature either in the course memo for the course offering or in the course room in Canvas.

Examination and completion

Grading scale

A, B, C, D, E, FX, F

Examination

PRO1 - Project, 3.0 credits, grading scale: A, B, C, D, E, FX, F
LAB1 - Computer Lab, 4.5 credits, grading scale: P, F

Based on a recommendation from KTH’s coordinator for disabilities, the examiner will decide how to adapt an examination for students with a documented disability.

The examiner may apply another examination format when re-examining individual students.

If the course is discontinued, students may request to be examined during the following two academic years.

Other requirements for final grade

Laboratory exercises
Written assignments.
Academic paper and its presentation at a final review
Assessment of two other course participants' theses, and critical review of their presentations.

Examiner

Jonas Beskow

Ethical approach

All members of a group are responsible for the group's work.
In any assessment, every student shall honestly disclose any help received and sources used.
In an oral assessment, every student shall be able to present and answer questions about the entire assignment and solution.

Further information

Course room in Canvas

Registered students can find further information about the implementation of the course in the course room in Canvas. A link to the course room can be found under the Studies tab in the Personal menu at the start of the course.

Offered by

EECS/Speech, Music and Hearing

Main field of study

Computer Science and Engineering

Education cycle

Second cycle

Supplementary information

The course may be cancelled or given in another form if there are too few regular registrations.

In this course, the EECS code of honor applies, see:
http://www.kth.se/en/eecs/utbildning/hederskodex
