The emulator and the model
This section covers how we obtain the policy that we will deploy on the hardware. As mentioned, we will simulate our robot with a physics emulator, PyBullet in our case. I won't describe the PyBullet setup in detail, as it was covered in the previous chapter, so let's jump straight into the code and the model definition.
In the previous chapter, we used robot models that were already prepared for us, such as Minitaur and HalfCheetah, which exposed the familiar and simple Gym interface with rewards, observations, and actions. Now we have custom hardware and have formulated our own reward objective, so we need to build everything ourselves. In my own experiments, implementing a low-level robot model and wrapping it in a Gym environment turned out to be surprisingly complex. There were several reasons for that:
- PyBullet classes are quite complicated and poorly designed from a software engineering point of view. They contain a lot...