Characteristics of reinforcement learning
- Reward feedback is not instantaneous; it may be delayed by many timesteps
- Sequential decision making is needed to reach a goal, so time plays an important role in reinforcement learning problems (the IID assumption on the data does not hold here)
- The agent's action affects the subsequent data it receives
Reinforcement learning still requires some supervision, but far less than supervised learning does.
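The characteristics listed above can be illustrated with a minimal sketch of an agent-environment loop. The `Corridor` environment, its `reset`/`step` methods, and the two fixed policies are hypothetical names made up for this illustration; they do not come from any particular library. The agent walks along a short corridor and is rewarded only on the step that reaches the last cell, so the reward is delayed, and each observation depends on the actions taken before it.

```python
class Corridor:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode at the leftmost cell.
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the next state depends on it.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        # The reward is zero on every step except the one that reaches the goal.
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return a list of (state, action, reward) tuples."""
    state = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def always_right(state):
    return +1


def always_left(state):
    return -1


right_run = run_episode(Corridor(length=5), always_right)
left_run = run_episode(Corridor(length=5), always_left, max_steps=4)

print([r for _, _, r in right_run])  # [0.0, 0.0, 0.0, 1.0]
print([s for s, _, _ in right_run])  # [0, 1, 2, 3]
print([s for s, _, _ in left_run])   # [0, 0, 0, 0]
```

With the `always_right` policy, the only non-zero reward arrives on the fourth step, although every earlier step contributed to it. With `always_left`, the agent never leaves cell 0 and never sees a reward, so the data each agent collects is shaped by its own actions rather than drawn independently from a fixed distribution.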
The following are a few real-world examples of reinforcement learning problems:
- Autonomous helicopter: The objective of an autonomous helicopter is to control its position by adjusting its roll, pitch, and yaw through the joystick, pedals, and so on. Sensors send inputs 10 times a second, providing an accurate estimate of the position and orientation of the helicopter. The helicopter's job is to receive this input and move the controls accordingly. It is very hard to provide information on what the helicopter needs to...