Let's get started with a simple project: estimating the value of π using the Monte Carlo method, a technique at the core of many model-free reinforcement learning algorithms.
A Monte Carlo (MC) method is any method that uses randomness to solve problems. It repeatedly draws random samples and uses the fraction of samples that satisfy a particular property to make a numerical estimate.
Let's do a fun exercise where we approximate the value of π using the MC method. We'll place a large number of random points in a square of width 2 (-1 < x < 1, -1 < y < 1) and count how many of them fall within the circle of unit radius centered at the origin. We all know that the area of the square is:

$$S = 2^2 = 4$$

And the area of the circle is:

$$C = \pi \times 1^2 = \pi$$

If we divide the area of the circle by the area of the square, we have the following:

$$\frac{C}{S} = \frac{\pi}{4}$$

C/S can be...
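
The sampling procedure described above can be sketched in a few lines of plain Python. This is only a minimal illustration using the standard library: the function name `estimate_pi` and its `n_points` and `seed` parameters are hypothetical names chosen for this example. It relies on the fact that the fraction of uniformly sampled points landing inside the unit circle approximates the circle-to-square area ratio, π/4.

```python
import random


def estimate_pi(n_points, seed=None):
    """Estimate pi by sampling points uniformly in the square [-1, 1] x [-1, 1].

    `estimate_pi`, `n_points`, and `seed` are illustrative names, not part
    of any library.
    """
    rng = random.Random(seed)  # private generator so runs can be reproduced
    n_inside = 0
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # A point lies within the unit circle if its squared distance
        # from the origin is at most 1.
        if x * x + y * y <= 1.0:
            n_inside += 1
    # The fraction of points inside approximates the area ratio pi / 4,
    # so multiplying it by 4 gives an estimate of pi.
    return 4.0 * n_inside / n_points


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, estimate_pi(n, seed=0))
```

The estimate is itself random, so it varies from run to run unless a seed is fixed, and it generally gets closer to π as the number of points grows.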