Rock-Paper-Scissors (RPS) is a classic game for testing AI techniques, which is why we'll use it as the scenario for this recipe and the ones that follow. We will implement what are called bandit algorithms, which are based on the problem of exploring n-armed bandits. The problem is usually modeled on a slot machine with several arms, but we will study it as an RPS player, treating each of the three moves as an arm. The main idea is to identify the option that yields the best payoff.
In this recipe, we will learn about the UCB1 algorithm and how it works.
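To make the idea concrete before the recipe itself, here is a minimal Python sketch of the standard UCB1 selection rule: play each option once, then always pick the option with the highest average reward plus an exploration bonus of sqrt(2 ln N / n), where N is the total number of plays and n is the number of times that option was played. It is an illustration under stated assumptions, not the recipe's own implementation: the class name `UCB1`, the helpers `payoff` and `play`, the reward values (1 for a win, 0.5 for a draw, 0 for a loss), and the biased opponent are all hypothetical choices made for this example.

```python
import math
import random

ACTIONS = ["rock", "paper", "scissors"]
# Each key beats the action stored as its value.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Assumed reward scale in [0, 1]: win = 1, draw = 0.5, loss = 0."""
    if mine == theirs:
        return 0.5
    return 1.0 if BEATS[mine] == theirs else 0.0


class UCB1:
    """Hypothetical helper that tracks plays and rewards per arm."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # times each arm was played
        self.totals = [0.0] * n_arms  # summed reward per arm

    def select(self):
        # Play every arm once so each average is defined.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total_plays = sum(self.counts)
        scores = []
        for arm, count in enumerate(self.counts):
            mean = self.totals[arm] / count
            bonus = math.sqrt(2.0 * math.log(total_plays) / count)
            scores.append(mean + bonus)
        # Highest upper confidence bound wins; ties go to the lowest index.
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


def play(rounds=2000, seed=0):
    """Run UCB1 against an assumed opponent that favors rock."""
    rng = random.Random(seed)
    bandit = UCB1(len(ACTIONS))
    for _ in range(rounds):
        arm = bandit.select()
        opponent = rng.choices(ACTIONS, weights=[0.7, 0.2, 0.1])[0]
        bandit.update(arm, payoff(ACTIONS[arm], opponent))
    return bandit


if __name__ == "__main__":
    result = play()
    for name, count in zip(ACTIONS, result.counts):
        print(name, count)
```

Against an opponent that mostly throws rock, the sketch should end up playing paper far more often than the other two moves, while still trying rock and scissors occasionally because their exploration bonus grows as they go unplayed.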