Warm starts with demonstrations
A popular technique for showing the agent a path to success is to train it on data coming from a reasonably successful controller, such as a human. In RLlib, this can be done by saving human play data from the mountain car environment:
Chapter10/mcar_demo.py
        ...
        new_obs, r, done, info = env.step(a)
        # Build the batch
        batch_builder.add_values(
            t=t,
            eps_id=eps_id,
            agent_index=0,
            obs=prep.transform(obs),
          ...
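To make the data-collection loop concrete, here is a minimal, self-contained sketch of the same idea. It substitutes a scripted energy-pumping heuristic for the human player, implements the classic mountain car dynamics inline instead of calling the Gym environment, and records each transition as a plain dictionary mirroring the fields passed to batch_builder.add_values above. The mcar_step and collect_demo_episode helpers are illustrative assumptions, not part of the RLlib example.

```python
import math
import random

def mcar_step(pos, vel, action):
    # Classic mountain car dynamics: action 0 = push left, 1 = no-op, 2 = push right
    vel += (action - 1) * 0.001 + math.cos(3 * pos) * (-0.0025)
    vel = max(-0.07, min(0.07, vel))
    pos = max(-1.2, min(0.6, pos + vel))
    if pos == -1.2 and vel < 0:
        vel = 0.0  # inelastic collision with the left wall
    done = pos >= 0.5  # goal flag on the right hill
    return pos, vel, -1.0, done

def collect_demo_episode(eps_id, max_steps=1000, seed=0):
    rng = random.Random(seed)
    pos, vel = rng.uniform(-0.6, -0.4), 0.0
    batch = []
    for t in range(max_steps):
        # Heuristic "demonstrator": always push in the direction of motion,
        # which pumps energy into the car until it clears the hill
        a = 2 if vel >= 0 else 0
        new_pos, new_vel, r, done = mcar_step(pos, vel, a)
        # Record the per-timestep fields the RLlib batch builder stores
        batch.append(dict(t=t, eps_id=eps_id, agent_index=0,
                          obs=(pos, vel), actions=a, rewards=r,
                          new_obs=(new_pos, new_vel), dones=done))
        pos, vel = new_pos, new_vel
        if done:
            break
    return batch

episode = collect_demo_episode(eps_id=0)
print(len(episode), episode[-1]["dones"])
```

In the real script, each completed episode is written to disk (RLlib uses a JSON writer for this) so the offline data can later warm-start training; the dictionaries here stand in for that serialized batch.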