Index
𝜖-greedy, 917–921
2 × 2 cube model, 1121
3 × 3 cube model, 1124
A2C baseline, 813
    implementation, 814, 815
    results, 816–819
    video recording, 820
A2C method, 770, 771
    implementing, 772, 773, 775–777
    models, using, 780, 782
    results, 778, 779
    videos, recording, 781, 783
A2C on Pong, 605–611
A3C, with data parallelism, 622
    results, 623
A3C, with gradient parallelism, 624
    implementation, 625–630
    results, 631
ACKTR, 837
    implementation, 838
    results, 839, 840
action selector, 330, 332...