Summary
In this chapter, we learned how to distribute the training of our models across multiple machines and devices using TensorFlow clusters. We also learned about model-parallel and data-parallel strategies for the distributed execution of TensorFlow code.
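As a brief refresher, the following is a minimal sketch of a TF 1.x-style data-parallel setup, not the chapter's exact code: the cluster hostnames, ports, and variable shapes are placeholders chosen for illustration.

```python
import tensorflow as tf  # TF 1.x-style distributed APIs

# Define the cluster: one parameter server and two workers.
# The hostnames and ports below are placeholders.
cluster = tf.train.ClusterSpec({
    'ps': ['ps0.example.com:2222'],
    'worker': ['worker0.example.com:2222', 'worker1.example.com:2222']
})

# Each process starts a server for its own job name and task index.
server = tf.train.Server(cluster, job_name='worker', task_index=0)

# In a data-parallel setup, replica_device_setter places the variables on
# the parameter server and the compute operations on the local worker.
with tf.device(tf.train.replica_device_setter(
        worker_device='/job:worker/task:0', cluster=cluster)):
    w = tf.get_variable('w', shape=[784, 10])
    b = tf.get_variable('b', shape=[10])
```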
Parameter updates can be applied to the parameter servers either synchronously or asynchronously, and we learned how to implement both styles of update. With the skills learned in this chapter, you will be able to build and train very large models on very large datasets.
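The sketch below contrasts the two update styles under the same assumptions as the previous snippet; `loss` and `global_step` are assumed to be defined elsewhere, and the replica counts are illustrative.

```python
# Asynchronous updates: each worker computes and applies its own gradients
# independently, so updates from different workers may interleave.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = optimizer.minimize(loss)  # loss is assumed to be defined

# Synchronous updates: wrap the optimizer so gradients from all replicas
# are aggregated before a single update is applied to the parameters.
sync_optimizer = tf.train.SyncReplicasOptimizer(
    optimizer,
    replicas_to_aggregate=2,   # number of worker gradients to average
    total_num_replicas=2)
train_op = sync_optimizer.minimize(loss, global_step=global_step)

# The synchronous wrapper also provides a session hook, e.g. for use with
# tf.train.MonitoredTrainingSession:
# hooks = [sync_optimizer.make_session_run_hook(is_chief=True)]
```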