
Changes between two versions

Shown here are the changes in "Master thesis proposals - external" between 2019-10-31 09:17 by Patric Jensfelt and 2019-11-01 15:41 by Patric Jensfelt.


Master thesis proposals - external

Volumental Volumental (https://www.volumental.com/) is a spin-off from RPL, founded in 2012 by PhD students Alper Aydemir, Rasmus Göranssson and Miroslav Kobetski. Until now the business has focused on 3D scanners for feet and on using the data from such scans to help customers find shoes that fit them. More than 1500 scanners have been sold worldwide, and they are now a central piece of equipment in stores run by, for example, New Balance, Fleet Feet, Wintersteiger and Bauer. One of the next steps in the (r)evolution that is transforming retail is to extend the help the customer gets into the fitting process. This can be done in several ways, and we are looking for people who want to be part of that journey.

Some example questions: What shoes to recommend given the 3D model of a foot? How to improve the recommendation if you also know what other shoes the person owns? How to improve the 3D modelling process? How to provide an accurate scan on a mobile phone, using deep learning and computer vision techniques? How to build a machine-learning-based recommendation system that scales to many millions of requests across the globe? How to design better products with the 3D data? How to identify special cases that could lead to failure and thus less trust in the system?
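As a hedged illustration of the first question (not Volumental's actual method), a fit recommendation can start as simple nearest-neighbour matching between scanned foot measurements and shoe dimensions. The catalogue, feature choice and every number below are made up for the sketch:

```python
import numpy as np

# Hypothetical catalogue: each shoe described by (length_mm, width_mm, instep_mm).
# All names and numbers are illustrative, not Volumental data.
catalogue = {
    "runner-a": np.array([265.0, 98.0, 62.0]),
    "runner-b": np.array([270.0, 102.0, 64.0]),
    "trail-c":  np.array([268.0, 95.0, 60.0]),
}

def recommend(foot_measurements, catalogue, k=2):
    """Return the k shoes whose internal dimensions best match the scanned foot."""
    names = list(catalogue)
    dists = [np.linalg.norm(catalogue[n] - foot_measurements) for n in names]
    order = np.argsort(dists)
    return [names[i] for i in order[:k]]

print(recommend(np.array([266.0, 97.0, 61.0]), catalogue))  # → ['runner-a', 'trail-c']
```

A production system would of course replace the Euclidean distance with a learned fit model and per-style last data, but the retrieval structure stays the same.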

We are looking for you who are most likely a master's student in systems, control and robotics, machine learning or computer science, and who want a thesis project in the real world, but with people who have a strong academic background and connections, can give good supervision for the work, and understand well what a thesis project is about. You have enough programming experience that it is not an obstacle for the work. Depending on what you want to do, you also bring experience in that area. If you want to work with 3D modelling of feet, you have probably studied estimation and worked with sensors. If you want to work with advanced recommendation systems, you know your way around machine learning and the associated tools.

Contact: Alper Aydemir, Co-founder and CTO of Volumental

Scania
* https://www.scania.com/group/en/available-positions/?job_id=15399
* Trajectory Prediction of Surrounding Vehicles and the Effect of Varying Host Vehicle Maneuvers

* https://www.scania.com/group/en/available-positions/?job_id=15099&kw=
* https://www.scania.com/group/en/available-positions/?job_id=15100&kw=
* https://www.scania.com/group/en/available-positions/?job_id=15101&kw=
* https://www.scania.com/group/en/available-positions/?job_id=15224&kw=
* Collision Risk Prediction for Autonomous Driving Systems

* https://www.scania.com/group/en/available-positions/?job_id=15220&kw=
* Behavior Prediction of Surrounding Pedestrians for Autonomous Driving

* https://www.scania.com/group/en/available-positions/?job_id=15223&kw=
* Iterative Learning Control for Repetitive Tasks in Autonomous Heavy-Duty Vehicles

* https://www.scania.com/group/en/available-positions/?job_id=15222&kw=
* Motion Planning for Self-Driving Vehicles

* https://www.scania.com/group/en/available-positions/?job_id=15221&kw=
* Formation Control Strategy for Autonomous Snow Clearance


Reinforcement Learning at Flexible Alternating Current Transmission Systems

You will be part of the Business Unit Grid Integration, located in Västerås. FACTS (Flexible Alternating Current Transmission Systems) technologies provide more power and control in existing AC as well as green-field networks and have minimal environmental impact. With a complete portfolio and in-house manufacturing of key components, ABB is a reliable partner in shaping the grid of the future. Please find out more about our world-leading technology at www.abb.com/facts.

Problem description

At FACTS you will help create a more sustainable future. The increase in available measurements and the growth of real-time processing capacity and communication abilities are changing the opportunities in the power-system landscape. There is potential to use more measurement data, Reinforcement Learning and a FACTS device for optimization purposes.

The task would include:

* Create suitable test networks
* Investigate and develop a prototype algorithm based on the latest advancements in deep Reinforcement Learning
* Train, tune and test the algorithms on the networks and demonstrate policy optimality
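As a rough sketch of the train-and-demonstrate-optimality loop on a toy stand-in for a test network (the real project would use power-system simulations, not this made-up one-dimensional voltage model), tabular Q-learning is enough to show the shape of the task:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3          # discretised voltage level; actions mean -1, 0, +1
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1  # learning rate, discount, exploration

def step(s, a):
    """Toy surrogate for the network: the action shifts the voltage level."""
    s2 = int(np.clip(s + (a - 1), 0, n_states - 1))
    reward = -abs(s2 - 2)            # level 2 is the nominal operating point
    return s2, reward

for episode in range(500):
    s = int(rng.integers(n_states))
    for t in range(20):
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r = step(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

# The greedy policy should push low levels up, hold at nominal, and push high levels down.
policy = [int(np.argmax(Q[s])) for s in range(n_states)]
print(policy)
```

Deep RL replaces the table with a network, but the interaction loop, the reward shaping, and the final check of policy optimality look the same.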

Requirements

We are looking for you who are studying a university master's programme in a relevant technical area and have an interest in artificial intelligence or Reinforcement Learning. Experience in Reinforcement Learning or programming is a plus. We aim to start the thesis in early September.

Model-based Model-free Reinforcement Learning for Robotics Manipulation

Background
Recent advances in artificial intelligence have enabled machines to compete with humans even in the most difficult of domains. Google Deepmind's AlphaGo is a case in point. Similar reinforcement learning (RL) approaches have been tried in the robotics community on problems of skill learning. By skill we mean a sensorimotor policy (control policy) that can perform a single continuous-time task. Numerous successes in skill learning have been reported for a variety of manipulation tasks that are otherwise difficult to program. Examples include batting, pancake flipping, pouring, pole balancing, etc. One of the most challenging classes of manipulation tasks is the assembly of mating parts. Not surprisingly, the capability to learn assembly skills is highly sought after.

Problem description
RL methods can be divided into model-based and model-free. In model-based methods, the algorithm learns a dynamics model of the manipulation task and utilizes it to optimize the policy. In contrast, in model-free RL (policy search), the policy is directly optimized without the intermediate step of model learning. The trade-off here is between the number of trials (sample efficiency) and model bias: while model-based methods are sample efficient, model-free methods do not suffer from model bias. We propose a hybrid approach that combines the benefits of both. It employs a global black-box optimization method called Bayesian optimization (BO) to learn the policy in a fundamentally model-free way, but at the same time uses a learned model to guide the process. We will exploit the fact that BO does not require gradients of the cost function. Our application will be an assembly task in which an ABB YuMi robot inserts one part into another.
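The model-free side of the proposed hybrid can be sketched as a minimal BO loop: a zero-mean Gaussian-process surrogate plus an expected-improvement acquisition, here minimizing a made-up one-dimensional "insertion cost" over a single policy parameter. The model-based guidance is omitted, and every number below is an illustrative assumption, not the project's actual setup:

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def cost(theta):
    """Toy stand-in for the cost of one insertion trial with policy parameter theta."""
    return (theta - 0.6) ** 2 + 0.01 * rng.normal()

def rbf(a, b, ls=0.15):
    """Squared-exponential kernel between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xq, noise=1e-4):
    """Zero-mean GP regression: posterior mean and std at query points Xq."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xq)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.clip(1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sd, best):
    """EI for minimization: expected amount by which a query beats the incumbent."""
    z = (best - mu) / sd
    cdf = 0.5 * (1 + np.array([math.erf(v / math.sqrt(2)) for v in z]))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    return (best - mu) * cdf + sd * pdf

Xq = np.linspace(0, 1, 200)                  # candidate policy parameters
X = np.array([0.1, 0.9])                     # two initial trials
y = np.array([cost(t) for t in X])
for _ in range(15):                          # one BO iteration per (simulated) robot trial
    mu, sd = gp_posterior(X, y, Xq)
    theta = Xq[np.argmax(expected_improvement(mu, sd, y.min()))]
    X, y = np.append(X, theta), np.append(y, cost(theta))
print(round(float(X[np.argmin(y)]), 2))      # best parameter found; should be near 0.6
```

In the thesis, the guiding dynamics model would reshape where the acquisition samples, and the policy would be multi-dimensional, but the trial-by-trial structure is the same.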

Purpose and aims
The objective of this thesis is to develop a skill learning method within the framework of RL. The robot should be able to demonstrate the learning process by continuously attempting the insertion while making incremental progress, and finally achieve convergence by completing the task successfully in a few consecutive trials.

The work will include the following tasks:


* Conduct literature review on RL based skill learning and BO.
* Formulate a strategy for utilizing a learned dynamics model to guide the BO. The model learning algorithm can be assumed to be given.
* Set up a MuJoCo or Bullet simulation environment. Implement the simpler inverted-pendulum task first, then the main insertion task.
* Develop a parameterized policy (not necessarily a deep network) and implement the BO-based RL algorithm, incorporating the results of Step 2.
* Evaluate the method on a real robot and draw conclusions about the hybrid method.
We are searching for a highly motivated student from master's programmes such as Systems, Control and Robotics or Machine Learning, or a student with a similar background. Knowledge of modeling and control of robotic manipulators is highly advantageous. Any prior exposure to Gaussian process regression, RL or BO will be valued. A medium to high level of competency in either Python or Matlab is necessary. Master's-level knowledge of linear algebra and probability theory is expected, and general competence in machine learning will be highly appreciated.

The master's student will gain competences in Robotics, Robot Control, Reinforcement Learning, Bayesian optimization, Gaussian processes, etc. Note that the student will work at ABB Corporate Research in Västerås, and compensation plus accommodation will be provided by the company. This project is defined within the context of an ongoing PhD project; the student can therefore expect a high-quality research environment and support, including software and systems. Prospective PhD students will be given preference. It may also be possible to do this project at RPL, but that decision will be taken on a case-by-case basis.

Contact: Shahbaz Khader, +46725305968, shahbaz.khader@se.abb.com, ABB

Online Planning Based Reinforcement Learning for Robotics Manipulation

Background
Recent advances in artificial intelligence have enabled machines to compete with humans even in the most difficult of domains. Google Deepmind's AlphaGo is a case in point. Similar reinforcement learning (RL) approaches have been tried in the robotics community on problems of skill learning. By skill we mean a sensorimotor policy (control policy) that can perform a single continuous-time task. Numerous successes in skill learning have been reported for a variety of manipulation tasks that are otherwise difficult to program. Examples include batting, pancake flipping, pouring, pole balancing, etc. One of the most challenging classes of manipulation tasks is the assembly of mating parts. Not surprisingly, the capability to learn assembly skills is highly sought after.

Problem description
Most skill learning RL methods are of the policy search type. In policy search methods, the optimal parameters of a parameterized policy are obtained from an optimization process. Computing a general policy that takes the best action in any possible state is a much harder problem than planning a sequence of actions from a single state. On the other hand, while a policy provides robustness to uncertainties, planning cannot cope with any deviations from the plan. Online planning, or model predictive control (MPC), is a method in which the best of both worlds come together. Instead of computing a policy offline, a plan is computed online at every execution step. Only the first action is applied and the rest is discarded; the process is repeated at every time step. The drawback of online planning is the high computational cost of planning at every time step. When combined with dynamics model learning, the overall method becomes a reinforcement learning approach. Some of the challenges we aim to tackle in this thesis are: trading off planning horizon against computational cost, planning under an uncertain dynamics model, and incorporating prior information about the task instead of relying completely on learning the dynamics. Our application will be an assembly task in which an ABB YuMi robot inserts one part into another.
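A minimal sketch of the online planning loop, assuming a toy one-dimensional "insertion depth" plant and a hypothetical learned linear model with a small bias (none of this is the project's actual setup): sample random action sequences, roll each one out through the model, apply only the first action of the cheapest sequence, and replan:

```python
import numpy as np

rng = np.random.default_rng(2)
H, n_candidates = 10, 100          # planning horizon, sampled action sequences

def true_dynamics(x, u):
    """Toy plant: insertion depth moves proportionally to the action."""
    return x + 0.1 * u

def model(x, u):
    """Hypothetical learned model with a small bias, standing in for model learning."""
    return x + 0.1 * u + 0.005

def plan_first_action(x0, target):
    """Random-shooting MPC: roll out sampled action sequences through the model,
    return the first action of the cheapest sequence."""
    U = rng.uniform(-1, 1, size=(n_candidates, H))
    costs = np.zeros(n_candidates)
    for i in range(n_candidates):
        x = x0
        for u in U[i]:
            x = model(x, u)
            costs[i] += (x - target) ** 2
    return U[np.argmin(costs), 0]

x, target = 0.0, 1.0
for t in range(40):                # replan at every step, apply only the first action
    u = plan_first_action(x, target)
    x = true_dynamics(x, u)
print(round(x, 2))                 # settles close to the target despite the model bias
```

The thesis challenges show up directly here: a longer `H` or more candidates raises the per-step cost, and the model bias is exactly the kind of error that uncertainty-aware planning and prior task knowledge are meant to absorb.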

Purpose and aims
The objective of this thesis is to develop a skill learning method within the framework of RL. The robot should be able to demonstrate the learning process by continuously attempting the insertion while making incremental progress, and finally achieve convergence by completing the task successfully in a few consecutive trials.

The work will include the following tasks:


* Conduct literature review on RL based skill learning and MPC.
* Formulate a method for online planning that utilizes the uncertainties of the learned dynamics model. The model learning algorithm can be assumed to be given.
* Develop a strategy for combining offline learning from simulation and online planning.
* Evaluate the method on simulated tasks and also a real robot.

We are searching for a highly motivated student from master's programmes such as Systems, Control and Robotics or Machine Learning, or a student with a similar background. Knowledge of modeling and control of robotic manipulators is highly advantageous. Any prior exposure to optimal control, MPC or RL will be valued. A medium to high level of competency in either Python or Matlab is necessary. Master's-level knowledge of linear algebra and probability theory is expected, and general competence in machine learning will be highly appreciated.

The master's student will gain competences in Robotics, Robot Control, Reinforcement Learning, Optimization, Optimal Control, etc. Note that the student will work at ABB Corporate Research in Västerås, and compensation plus accommodation will be provided by the company. This project is defined within the context of an ongoing PhD project; the student can therefore expect a high-quality research environment and support, including software and systems. Prospective PhD students will be given preference. It may also be possible to do this project at RPL, but that decision will be taken on a case-by-case basis.

Contact: Shahbaz Khader, +46725305968, shahbaz.khader@se.abb.com, ABB Corporate Research