Version created by Anastasiia Varava, 2019-11-27 11:06


Master's thesis proposals - external

H&M

These projects will be performed in collaboration with H&M and aim to provide efficient solutions for some of the problems that the company is facing. The successful applicant will have an opportunity to closely interact with experts from H&M, have access to the real-world data owned by the company, and receive guidance from KTH researchers.

Project 1: Quality prediction

Every day, H&M orders, transports, and sells thousands of items. Quality control is an important part of this process. When boxes with goods are delivered to one of the warehouses, random quality checks are performed. There are multiple factors that can potentially affect the quality of the products in a specific box: they come from different suppliers overseas, are transported in different ways (by ship, train, or plane), and indirect factors such as weather may also play a certain role.

In this project, we aim to develop a probabilistic model that will predict the quality of the products based on the information about the supplier, the means of transportation, etc. To train the model, H&M will provide the relevant data. If implemented successfully, the system could later be used to identify the boxes that are more likely to require quality control.
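
The kind of probabilistic model described above can be prototyped very simply. The sketch below fits a naive Bayes classifier to a handful of invented boxes, each described by a hypothetical supplier and transport mode (none of this is real H&M data), and predicts the probability that a new box passes inspection:

```python
from collections import defaultdict

# Hypothetical training data: (supplier, transport mode, passed quality check?).
# Supplier names and outcomes are invented for illustration only.
data = [
    ("supplier_a", "ship", True), ("supplier_a", "ship", True),
    ("supplier_a", "plane", True), ("supplier_b", "ship", False),
    ("supplier_b", "ship", False), ("supplier_b", "train", True),
    ("supplier_a", "train", True), ("supplier_b", "plane", False),
]

def train_naive_bayes(rows):
    """Count class frequencies and per-class feature frequencies."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(int)  # (feature_index, value, class) -> count
    for supplier, mode, ok in rows:
        class_counts[ok] += 1
        feat_counts[(0, supplier, ok)] += 1
        feat_counts[(1, mode, ok)] += 1
    return class_counts, feat_counts, len(rows)

def predict_ok_probability(model, supplier, mode):
    """Posterior probability that a box passes, via Bayes' rule
    with add-one smoothing (2 pseudo-counts per feature, for simplicity)."""
    class_counts, feat_counts, n = model
    scores = {}
    for ok in (True, False):
        p = class_counts[ok] / n
        for i, v in ((0, supplier), (1, mode)):
            p *= (feat_counts[(i, v, ok)] + 1) / (class_counts[ok] + 2)
        scores[ok] = p
    return scores[True] / (scores[True] + scores[False])

model = train_naive_bayes(data)
print(round(predict_ok_probability(model, "supplier_a", "train"), 3))
print(round(predict_ok_probability(model, "supplier_b", "ship"), 3))
```

A real system would use richer features (weather, route, season) and a proper graphical model, but the training/prediction structure stays the same.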

Required qualifications: proficiency in Machine Learning and Data Science. Applicants are expected to have passed KTH courses such as Advanced Machine Learning, Project Course in Data Science, Probabilistic Graphical Models, or equivalent.

Contact person: Anastasiia Varava (KTH) varava@kth.se

Project 2: Carton fill rate optimization

To transport the products, H&M needs to make decisions on how to package them. Depending on the characteristics of the products being packed (type of product, size, material, fragility, etc.), different types of cartons can be used, and different numbers of items are packed into each of them.

Previously, these decisions were made by humans on a case-by-case basis.

In this project, we aim to automate this process. Based on the data collected earlier at H&M (item descriptions and packaging specifications), we aim to develop a model that will find the best way to package novel products. The model will need to capture specific features of the products that can potentially affect these decisions, and thus can be trained on existing data that will be provided by the company.
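
As a rough illustration of reusing past packaging decisions, the sketch below recommends a carton for a new item by nearest-neighbour lookup among previously packed items; all item features, carton names, and numbers are invented:

```python
# Illustrative only: choose a carton type for a new item by finding the most
# similar item packed before. Features and carton names are hypothetical.
historical = [
    # (volume in litres, fragility 0-1) -> (carton type, items per carton)
    ((2.0, 0.1), ("small_box", 24)),
    ((2.5, 0.2), ("small_box", 20)),
    ((8.0, 0.1), ("medium_box", 8)),
    ((9.0, 0.7), ("medium_padded", 6)),
    ((20.0, 0.3), ("large_box", 3)),
]

def recommend_carton(volume, fragility):
    """Return the packaging decision of the most similar historical item."""
    def dist(feat):
        v, f = feat
        # Scale fragility up so both features contribute comparably.
        return (v - volume) ** 2 + (10 * (f - fragility)) ** 2
    _, decision = min(historical, key=lambda rec: dist(rec[0]))
    return decision

print(recommend_carton(2.2, 0.15))  # close to the small_box examples
print(recommend_carton(8.5, 0.8))   # high fragility -> padded carton
```

A learned model (e.g. a classifier over item descriptions) generalizes better than raw lookup, but this captures the core idea of mapping item features to packaging decisions.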

Required qualifications: proficiency in Machine Learning and Data Science. Applicants are expected to have passed KTH courses such as Advanced Machine Learning, Project Course in Data Science, Probabilistic Graphical Models, or equivalent.

Contact person: Anastasiia Varava (KTH) varava@kth.se

Project 3: Inbound leadtime prediction

H&M orders large amounts of goods from multiple suppliers in various countries and ships them to markets all over the world. Each product is expected to be delivered by a specific date. However, delays in production and transportation are inevitable, especially when the products are transported by ship, due to factors such as weather conditions, scheduling imprecision, and multiple connections between different routes. The goal of this project is to develop a probabilistic model, trained on historical data provided by the company, that predicts the expected time needed for the goods to be manufactured and delivered to markets.
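
A minimal baseline for such a model is to fit a per-route distribution of historical lead times and quote an upper quantile as the promised date. The sketch below does this with a normal model on invented data (supplier names and all numbers are illustrative only):

```python
import statistics

# Hypothetical historical lead times, in days, per (supplier, transport) route.
history = {
    ("supplier_x", "ship"):  [38, 41, 45, 39, 52, 44, 40, 47],
    ("supplier_x", "plane"): [6, 7, 6, 8, 7],
    ("supplier_y", "ship"):  [55, 60, 58, 63, 71, 59],
}

def predict_leadtime(route, quantile_z=1.645):
    """Fit a per-route normal model and return (mean, upper bound).

    The bound mean + z*sigma is a simple one-sided ~95% interval,
    useful for promising a delivery date that is rarely missed.
    """
    samples = history[route]
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    return mu, mu + quantile_z * sigma

mean_days, safe_days = predict_leadtime(("supplier_x", "ship"))
print(f"expected {mean_days:.1f} days, plan for {safe_days:.1f} days")
```

A full solution would condition on more covariates (season, port congestion, number of connections) and use a proper probabilistic model, but the output, a predictive distribution per shipment, has the same shape.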

Required qualifications: proficiency in Machine Learning and Data Science. Applicants are expected to have passed KTH courses such as Advanced Machine Learning, Project Course in Data Science, Probabilistic Graphical Models, or equivalent.

Contact person: Anastasiia Varava (KTH) varava@kth.se

Volumental

Volumental (https://www.volumental.com/) is a spin-off from RPL, founded in 2012 by PhD students Alper Aydemir, Rasmus Göranssson and Miroslav Kobetski. Until now the business has focused on 3D scanners for feet and on using the data from such scans to help customers find shoes that fit them. More than 1500 scanners have been sold worldwide, and they are now a central piece of equipment in stores run by, for example, New Balance, Fleet Feet, Wintersteiger and Bauer. One of the next steps in the (r)evolution that is transforming retail is to extend the help that the customer gets into the fitting process. This can be done in several ways, and we are looking for people who want to be part of that journey.

Some example questions: How can we provide an accurate scan on a mobile phone using deep learning and computer vision techniques? How can we build a machine-learning-based recommendation system that scales to many millions of requests across the globe? How can we design better products with the 3D data? How can we identify special cases that could lead to failure, and thus to less trust in the system?

We are looking for you who are most likely a master's student in systems, control and robotics, machine learning, or computer science, and who are looking for a thesis project in the real world, carried out with people who have a strong academic background and connection, who can provide good supervision, and who understand what a thesis project is about. You have enough programming experience that this is not an obstacle for the work. Depending on what you want to do, you also bring experience in that area: if you want to work with 3D modelling of feet, you have probably studied estimation and worked with sensors; if you want to work with advanced recommendation systems, you know your way around machine learning and the associated tools.

Contact: Alper Aydemir, Co-founder and CTO of Volumental

ABB

Reinforcement Learning for Flexible Alternating Current Transmission Systems

You will be part of the Business Unit Grid Integration, located in Västerås. FACTS (Flexible Alternating Current Transmission Systems) technologies provide more power and control in existing AC as well as green-field networks and have minimal environmental impact. With a complete portfolio and in-house manufacturing of key components, ABB is a reliable partner in shaping the grid of the future. Please find out more about our world leading technology at www.abb.com/facts.

Problem description

At FACTS you will help create a more sustainable future. The increase in available measurements and the growth of real-time processing and communication capacity are changing the opportunities in the power system landscape. There is potential to utilize more measurement data, Reinforcement Learning, and a FACTS device for optimization purposes.

The task would include: 
• Create suitable test networks
• Investigate and develop a prototype algorithm based on the latest advancements in deep Reinforcement Learning
• Train, tune and test the algorithms on the networks and demonstrate policy optimality
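
As a toy illustration of the pipeline above (network, algorithm, training), the sketch below runs tabular Q-learning on an invented five-state voltage-regulation task; a real project would use proper power-system test networks and deep RL instead:

```python
import random

random.seed(0)

# Toy stand-in for a FACTS control problem: keep a discretized bus voltage
# level (states 0..4) at the nominal level 2 by injecting -1, 0, or +1 steps.
STATES, ACTIONS, TARGET = range(5), (-1, 0, 1), 2

def step(state, action):
    nxt = min(max(state + action, 0), 4)
    return nxt, -abs(nxt - TARGET)          # reward: penalize deviation

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2           # learning rate, discount, exploration

for episode in range(500):
    s = random.choice(STATES)
    for _ in range(10):
        # Epsilon-greedy action selection.
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda a_: Q[(s, a_)])
        s2, r = step(s, a)
        # Standard Q-learning update toward the bootstrapped target.
        best_next = max(Q[(s2, a_)] for a_ in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = {s: max(ACTIONS, key=lambda a_: Q[(s, a_)]) for s in STATES}
print(policy)
```

The learned greedy policy pushes every state toward the nominal level, demonstrating the train/tune/test loop in miniature.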

Requirements

We are looking for you who are studying a university master's programme in a relevant technical area and have an interest in artificial intelligence or Reinforcement Learning. Experience in Reinforcement Learning or programming is a plus. We aim to start the thesis in early September.

Model-based Model-free Reinforcement Learning for Robotics Manipulation

Background
Recent advances in artificial intelligence have enabled machines to compete with humans even in the most difficult of domains. Google Deepmind's AlphaGo is a case in point. Similar reinforcement learning (RL) approaches have been tried in the robotics community on problems of skill learning. By skill we mean a sensorimotor policy (control policy) that can perform a single continuous-time task. Numerous successes in skill learning have been reported for a variety of manipulation tasks that are otherwise difficult to program; examples include batting, pancake flipping, pouring, pole balancing, etc. One of the most challenging classes of manipulation tasks is the assembly of mating parts. Not surprisingly, the capability to learn assembly skills is highly sought after.

Problem description
RL can be divided into model-based and model-free methods. In model-based methods, the algorithm learns a dynamics model of the manipulation task and utilizes it to optimize the policy. In contrast, in model-free RL (policy search), the policy is directly optimized without the intermediate step of model learning. The trade-off here is between the number of trials (sample efficiency) and model bias: while model-based methods are sample efficient, model-free methods do not suffer from model bias. We propose a hybrid approach that combines the benefits of both methods. It employs a global black-box optimization method called Bayesian optimization (BO) to learn the policy in a fundamentally model-free way, but at the same time uses a learned model to guide the process. We will exploit the fact that BO treats the cost function as a black box during the learning process. Our application will be an assembly task in which an ABB YuMi robot will insert one part into another part.
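
For intuition, the sketch below shows the model-free policy-search half of this picture in its simplest form: a cross-entropy-method search over a two-parameter "policy" (an insertion offset) with an invented cost function. The proposal's actual BO-plus-learned-model approach is more sample-efficient; this only illustrates the black-box optimization loop:

```python
import random
import statistics

random.seed(1)

# Toy stand-in for model-free policy search: find a 2-D insertion offset
# (dx, dy) minimizing a misalignment cost, via the cross-entropy method.
TRUE_OFFSET = (0.3, -0.2)            # unknown optimum the robot must find

def insertion_cost(params):
    """Pretend cost of one insertion attempt: distance from the true offset.
    On a real robot each evaluation would be a physical trial."""
    return sum((p - t) ** 2 for p, t in zip(params, TRUE_OFFSET))

mean, std = [0.0, 0.0], [1.0, 1.0]   # search distribution over policies
for iteration in range(30):
    samples = [[random.gauss(m, s) for m, s in zip(mean, std)]
               for _ in range(50)]
    elite = sorted(samples, key=insertion_cost)[:10]   # keep best 20%
    # Refit the search distribution to the elite samples.
    mean = [statistics.mean(e[i] for e in elite) for i in range(2)]
    std = [statistics.stdev(e[i] for e in elite) + 1e-3 for i in range(2)]

print([round(m, 2) for m in mean], round(insertion_cost(mean), 5))
```

The point of the hybrid method is precisely to avoid the 1500 trials this loop spends: BO's surrogate, guided by a learned dynamics model, would reach a comparable optimum in far fewer physical attempts.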

Purpose and aims
The objective of this thesis is to develop a skill learning method under the framework of RL. The robot should be able to demonstrate the learning process by continuously trying to do the insertion while making incremental progress and finally achieve convergence by being able to complete the task successfully in a few consecutive trials.

The work will include the following tasks:

  1. Conduct literature review on RL based skill learning and BO.
  2. Formulate a strategy for utilizing a learned dynamics model for guiding the BO. The model learning algorithm can be assumed to be given.
  3. Set up either MuJoCo or Bullet simulation environment. Implement a simpler task of inverted pendulum and then the main insertion task.
  4. Develop a parameterized policy (not necessarily deep network) and implement the BO based RL algorithm including the results of Step 2.
  5. Evaluate the method on a real robot and draw conclusions about the hybrid method. 

We are searching for a highly motivated student from master's programmes such as Systems, Control and Robotics, or Machine Learning, or a student with a similar background. Knowledge in modeling and control of robotic manipulators is highly advantageous. Any prior exposure to Gaussian process regression, RL, or BO will be valued. A medium-to-high level of competency in either Python or Matlab is necessary. Master's-level knowledge of linear algebra and probability theory is expected, and general competence in machine learning will be highly appreciated.

The master's student will gain competences within Robotics, Robot Control, Reinforcement Learning, Bayesian optimization, Gaussian processes, etc. Note that the student will work at ABB Corporate Research in Västerås, and compensation plus accommodation will be provided by the company. This project is defined within the context of an ongoing PhD project; the student can therefore expect a high-quality research environment and strong support, including software and systems. Prospective PhD students will be given preference. It may also be possible to do this project at RPL, but the decision will be taken on a case-by-case basis.

Contact: Shahbaz Khader, +46725305968, shahbaz.khader@se.abb.com, ABB Corporate Research

Online Planning Based Reinforcement Learning for Robotics Manipulation 

Background
Recent advances in artificial intelligence have enabled machines to compete with humans even in the most difficult of domains. Google Deepmind's AlphaGo is a case in point. Similar reinforcement learning (RL) approaches have been tried in the robotics community on problems of skill learning. By skill we mean a sensorimotor policy (control policy) that can perform a single continuous-time task. Numerous successes in skill learning have been reported for a variety of manipulation tasks that are otherwise difficult to program; examples include batting, pancake flipping, pouring, pole balancing, etc. One of the most challenging classes of manipulation tasks is the assembly of mating parts. Not surprisingly, the capability to learn assembly skills is highly sought after.

Problem description
Most skill learning RL methods are of the policy search type. In policy search methods, the optimal parameters of a parameterized policy are obtained from an optimization process. Computing a general policy that takes the best action in any possible state is a much harder problem than planning a sequence of actions from a single state. On the other hand, while a policy provides robustness to uncertainties, a plan cannot cope with any deviations from it. Online planning, or model predictive control (MPC), is a method in which the best of both worlds come together: instead of computing a policy offline, a plan is computed online at every execution step. Only the first action is applied and the rest is discarded; the process is repeated at every time step. The drawback of online planning is the high computational cost of planning at every time step. When combined with dynamics model learning, the overall method becomes a reinforcement learning approach. Some of the challenges that we aim to tackle in this thesis are: trading off planning horizon versus computational cost, planning under an uncertain dynamics model, and incorporating prior information about the task instead of relying completely on learning the dynamics. Our application will be an assembly task in which an ABB YuMi robot will insert one part into another part.
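
The online planning loop can be sketched in a few lines, assuming a toy 1-D double-integrator task and a known dynamics model (random shooting stands in for a real planner; everything here is invented for illustration). At each step the controller replans a short action sequence, applies only the first action, and repeats:

```python
import random

random.seed(0)

# Minimal MPC-by-random-shooting sketch on a 1-D double integrator:
# drive the position to 0 by choosing accelerations in [-1, 1].
DT, HORIZON, CANDIDATES = 0.1, 10, 200

def simulate(state, actions):
    """Roll out the (here: known) dynamics model; return total cost."""
    pos, vel = state
    cost = 0.0
    for a in actions:
        vel += a * DT
        pos += vel * DT
        cost += pos ** 2 + 0.1 * vel ** 2 + 0.01 * a ** 2
    return cost

def mpc_step(state):
    """Plan HORIZON steps ahead, but return only the first action."""
    best = min(
        ([random.uniform(-1, 1) for _ in range(HORIZON)]
         for _ in range(CANDIDATES)),
        key=lambda seq: simulate(state, seq))
    return best[0]

state = (1.0, 0.0)                     # start 1 unit from the goal, at rest
for t in range(100):
    a = mpc_step(state)                # replan at every execution step
    vel = state[1] + a * DT
    state = (state[0] + vel * DT, vel)

print(round(state[0], 3), round(state[1], 3))
```

The thesis challenges map directly onto this loop: HORIZON trades plan quality against the per-step planning cost, and replacing the exact `simulate` with an uncertain learned model is exactly the planning-under-uncertainty problem described above.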

Purpose and aims
The objective of this thesis is to develop a skill learning method under the framework of RL. The robot should be able to demonstrate the learning process by continuously trying to do the insertion while making incremental progress and finally achieve convergence by being able to complete the task successfully in a few consecutive trials.

The work will include the following tasks:

  1. Conduct literature review on RL based skill learning and MPC.
  2. Formulate a method for online planning that utilizes the uncertainties of the learned dynamics model. The model learning algorithm can be assumed to be given.
  3. Develop a strategy for combining offline learning from simulation and online planning.
  4. Evaluate the method on simulated tasks and also a real robot.


We are searching for a highly motivated student from master's programmes such as Systems, Control and Robotics, or Machine Learning, or a student with a similar background. Knowledge in modeling and control of robotic manipulators is highly advantageous. Any prior exposure to optimal control, MPC, or RL will be valued. A medium-to-high level of competency in either Python or Matlab is necessary. Master's-level knowledge of linear algebra and probability theory is expected, and general competence in machine learning will be highly appreciated.

The master's student will gain competences within Robotics, Robot Control, Reinforcement Learning, Optimization, Optimal Control, etc. Note that the student will work at ABB Corporate Research in Västerås, and compensation plus accommodation will be provided by the company. This project is defined within the context of an ongoing PhD project; the student can therefore expect a high-quality research environment and strong support, including software and systems. Prospective PhD students will be given preference. It may also be possible to do this project at RPL, but the decision will be taken on a case-by-case basis.

Contact:  Shahbaz Khader, +46725305968, shahbaz.khader@se.abb.com, ABB Corporate Research