Swinging up a pendulum using DDPG
Let's learn how to implement DDPG for the inverted pendulum swing-up task using Stable Baselines. First, let's import the necessary libraries:
import gym
import numpy as np
from stable_baselines.ddpg.policies import MlpPolicy
from stable_baselines.common.noise import NormalActionNoise, OrnsteinUhlenbeckActionNoise, AdaptiveParamNoiseSpec
from stable_baselines import DDPG
Create the pendulum environment using Gym:
env = gym.make('Pendulum-v0')
Get the number of actions from the environment's action space (for Pendulum-v0, the action is a single continuous torque value, so this will be 1):
n_actions = env.action_space.shape[-1]
We know that in DDPG, the actor outputs a deterministic action rather than sampling one, so we add noise generated by an Ornstein-Uhlenbeck process to the selected action to ensure exploration. We create the action noise as follows:
action_noise = OrnsteinUhlenbeckActionNoise(mean=np.zeros(n_actions), sigma=float(0.5) * np.ones(n_actions))
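To get a feel for what this object produces, the noise instance can be called directly. The snippet below is only an illustration and not part of the training loop; the zero vector stands in for the actor's deterministic output, which DDPG supplies internally during exploration:
noise_sample = action_noise()   # correlated noise sample, shape (n_actions,)
illustrative_action = np.zeros(n_actions)   # placeholder for the actor's output
# DDPG adds the noise to the action and keeps it within the valid action bounds
noisy_action = np.clip(illustrative_action + noise_sample,
                       env.action_space.low, env.action_space.high)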
Instantiate the agent:
agent = DDPG(MlpPolicy, env, verbose=1, param_noise=None, action_noise=action_noise)
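From here, a minimal sketch of training and evaluating the agent, assuming the standard Stable Baselines API (learn and predict); the timestep budget below is an arbitrary example value:
# Train the agent
agent.learn(total_timesteps=100000)

# Roll out the trained policy for one episode to inspect its behavior
state = env.reset()
done = False
while not done:
    action, _ = agent.predict(state)
    state, reward, done, info = env.step(action)
    env.render()
env.close()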