Index
Symbols
2×2 cube model 763, 764, 765
3×3 cube model
A
A2C
agent, adding 333, 334, 335
using, on Pong 318, 319, 320, 321, 322, 323, 324
using, on Pong results 324, 325, 326, 327
with data parallelism 334
with gradients parallelism 334
A2C method
about 505
implementation 506, 508, 510
models, used for video recording 512
results 510, 511, 512
A3C, with data parallelism
about 336
implementation 336, 338, 339, 340, 341, 342, 343, 344
results 344
A3C, with gradients parallelism
about 346, 347
implementation 347, 348, 349, 350, 351, 352
results 352
ACKTR
about 616
implementation 617
results 617, 618
actions 10
action selectors 166, 167
action selectors, cases
argmax 166
policy-based 166
action space 22
actor-critic method 638
about 316, 317
advantage 316
considerations 317, 318
Adam algorithm 322
advantage actor-critic (A2C...