MIT researchers used reinforcement learning to improve brain cancer treatment. The system is trained on data from established patient treatment regimens and then ‘learns’ the most effective strategy for administering cancer drugs. The important point is that it can help find the right balance between administering and withholding those drugs.
In 2018, UK self-driving car startup Wayve trained a car to drive using its ‘imagination’. Real-world data was collected offline to train the model, which was then used to observe and predict the ‘motion’ of items in a scene and drive on the road. Even though the data was collected in sunny conditions, the system can also drive in the rain, adjusting to reflections from puddles and similar effects. Because the training data comes from the real world rather than a simulator, there is no major gap between training conditions and real-world use.
UC Berkeley researchers also developed a deep reinforcement learning method to optimize SQL joins. The join ordering problem is formulated as a Markov Decision Process (MDP), and a method called Q-learning is applied to solve it. The resulting deep reinforcement learning optimizer, called DQ, produces plans that are close to optimal across all cost models. It does so without any prior knowledge of the index structures.
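To make the idea of join ordering as an MDP more concrete, here is a minimal sketch in Python. It is not DQ itself: DQ uses a deep network, whereas this toy uses a plain Q-table, and the table names, cardinalities, selectivities and cost model are all made-up assumptions for illustration. The state is the set of tables joined so far, an action picks the next table, and the reward is the negative size of the intermediate result.

```python
import random
from itertools import permutations

# Hypothetical table sizes and pairwise join selectivities (toy numbers).
CARD = {"A": 1000, "B": 100, "C": 10, "D": 500}
SEL = {frozenset("AB"): 0.01, frozenset("BC"): 0.1, frozenset("CD"): 0.05}


def result_size(tables):
    """Estimated rows after joining `tables` (assumed toy cost model)."""
    size = 1.0
    for t in sorted(tables):
        size *= CARD[t]
    for pair, selectivity in SEL.items():
        if pair <= tables:
            size *= selectivity
    return size


def step_cost(state, table):
    """Cost of adding `table` to the already joined set `state`."""
    new_state = state | {table}
    # Scanning the first table is treated as free in this sketch.
    return result_size(new_state) if len(new_state) > 1 else 0.0


def plan_cost(order):
    """Total cost of a left-deep plan that joins tables in `order`."""
    state, total = frozenset(), 0.0
    for table in order:
        total += step_cost(state, table)
        state = state | {table}
    return total


def q_learn(episodes=5000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning over (joined set, next table) pairs."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = frozenset()
        while len(state) < len(CARD):
            actions = sorted(set(CARD) - state)
            if rng.random() < epsilon:  # explore
                action = rng.choice(actions)
            else:  # exploit the current estimates
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            reward = -step_cost(state, action)
            next_state = state | {action}
            remaining = set(CARD) - next_state
            best_next = max(
                (q.get((next_state, a), 0.0) for a in remaining), default=0.0
            )
            old = q.get((state, action), 0.0)
            # Standard Q-learning update.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def greedy_plan(q):
    """Read a join order off the learned Q-table."""
    state, order = frozenset(), []
    while len(state) < len(CARD):
        actions = sorted(set(CARD) - state)
        action = max(actions, key=lambda a: q.get((state, a), 0.0))
        order.append(action)
        state = state | {action}
    return order


def brute_force_best():
    """Cheapest left-deep plan cost by exhaustive search, for comparison."""
    return min(plan_cost(p) for p in permutations(CARD))


if __name__ == "__main__":
    plan = greedy_plan(q_learn())
    print("learned order:", plan)
    print("learned cost:", plan_cost(plan), "best cost:", brute_force_best())
```

With only four tables, exhaustive search is trivial, so the brute-force check is there purely to compare against the learned plan. The appeal of a learned approach is for queries where the number of possible orderings is too large to enumerate.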
OpenAI researchers created a robot hand called Dactyl in 2018. Dactyl has human-like dexterity for performing complex in-hand manipulations, achieved through the use of reinforcement learning.
Finally, it’s back to Go. Well, not just Go - chess, and a game called shogi too. This time, DeepMind’s AlphaZero was the star. Whereas AlphaGo managed to master Go, AlphaZero mastered all three.
This was significant because it suggests that reinforcement learning could help develop a more general form of intelligence than today’s artificial intelligence systems offer. This is an intelligence that is able to adapt to new contexts and situations - to almost literally understand the rules of very different games.
But there was something else impressive about AlphaZero - it was given only the basic rules of each game. Without any domain knowledge or examples, the newer program outperformed the state-of-the-art programs of the time in all three games with only a few hours of self-training.
These were just some of the applications of reinforcement learning to real-world situations to come out of 2018. We’re sure we’ll see more as 2019 develops - the only real question is just how extensive its impact will be.