The course consists of six modules, each covering a different research area in ML systems. Each module has two sessions: a lecture session and a discussion session. In the lecture session, the teacher introduces the context and gives an overview of the reading material for the week. The students then have a week to study the topic and go through the reading material. They are also required to submit a detailed review of the selected papers. In the discussion session, they review and discuss the topic and the papers in depth. The goal of this format is to build mastery of the material, to develop a deeper understanding of how to evaluate and review research, and, ideally, to provide insight into how to write better papers and identify open research questions and needs for further research.
FID3024 Systems for Scalable Machine Learning 7.5 credits

In the last few years, advances in hardware and software systems have enabled us to train complex Machine Learning (ML) models on massive datasets. Examples include new generations of GPUs and open-source frameworks such as Apache Spark, TensorFlow, and Ray. Moreover, advances in parallelization, job scheduling, and robustness have made it possible to build complex ML models more efficiently and at scale. This course provides a comprehensive survey of the latest trends in ML system design and presents different techniques for building such systems. The course covers the main components of ML systems, from fundamental ML concepts to more advanced topics such as parallelization and robustness in ML system design. Participants will be required to reflect on how different techniques, rules, and guidelines are combined to build ML systems, and to suggest possible extensions to the technology from their own research domains.
Course syllabus FID3024 (Autumn 2020–)
Content and learning outcomes
Course disposition
Course contents
The course covers the following topics, in this order:
- Fundamental ML, e.g., generalization, back-propagation, etc.
- Parallelization, e.g., data-parallel, model-parallel
- AutoML, e.g., hyperparameter optimization, meta learning, and Neural Architecture Search (NAS)
- Scheduling and optimization, e.g., model compression, gradient compression, etc.
- Robust learning, e.g., Byzantine-resilient learning
- ML platforms, e.g., TensorFlow, Ray, MLlib
Intended learning outcomes
After passing the course, students should be able to:
- Demonstrate a systematic understanding of ML systems and the capacity to analyze and critique their components in a scholarly manner.
- Reflect on the ideas and technologies related to ML systems, with insight into their possibilities and limitations.
- Examine how ML systems are currently used and evaluate how they can be used for new purposes and in different application domains.
- Identify the need for further knowledge in improving ML systems.
Literature and preparations
Specific prerequisites
Enrolled as a doctoral student.
Recommended prerequisites
The course is mainly aimed at PhD students in the computer science, information and communication technology, and electrical engineering doctoral programmes, as well as any other PhD students interested in learning the architecture and fundamentals of modern ML systems. Students should be familiar with the basics of ML and distributed systems, and should have good programming skills, especially in Python or Scala.
Literature
Examination and completion
If the course is discontinued, students may request to be examined during the following two academic years.
Grading scale
Examination
- EXA1 - Examination, 7.5 credits, grading scale: P, F
Based on a recommendation from KTH’s coordinator for disabilities, the examiner will decide how to adapt an examination for students with a documented disability.
The examiner may apply another examination format when re-examining individual students.
Other requirements for final grade
The course is assessed with a Pass/Fail grade, based on active participation in the discussion meetings and a scientifically sound review report each week. In addition, to pass, a student must attend at least 75% of all lectures and 75% of all student presentation sessions.
Examiner
Ethical approach
- All members of a group are responsible for the group's work.
- In any assessment, every student shall honestly disclose any help received and sources used.
- In an oral assessment, every student shall be able to present and answer questions about the entire assignment and solution.