- What is the RL equivalent of a labeled training dataset in supervised learning?
- What is one difficulty of not having a standardized set of environments for developing RL algorithms? How does Gym attempt to solve this problem?
- What is the difference between an actuated joint and an unactuated joint?
- What is the benefit of being able to use a single algorithm to solve more than one environment? Explain in two to three sentences.
- Why is it important to be able to solve generalized control problems in robotic motion?
- Briefly describe the relationship between a probability distribution and our current estimate of the likelihood of an event.
- Explain what difference having a state space available makes in a contextual bandit problem.
- Describe the differences in the results of A/B testing versus multi-armed bandit testing in two to three sentences. ...