Summary
In this chapter, we discussed two ways of accelerating LightGBM training. The first is large-scale distributed training across many machines using the Python library Dask. We showed how to set up a Dask cluster, how to distribute data to the cluster using Dask DataFrames, and how to train LightGBM models on the cluster.
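As a refresher, the following is a minimal sketch of that workflow. It assumes a LocalCluster standing in for a real multi-machine deployment, and uses a synthetic dataset purely for illustration:

```python
# Minimal sketch: distributed LightGBM training with Dask.
# LocalCluster is used here for illustration; in practice the Client
# would connect to a scheduler running across many machines.
import dask.dataframe as dd
import lightgbm as lgb
import pandas as pd
from dask.distributed import Client, LocalCluster
from sklearn.datasets import make_classification

cluster = LocalCluster(n_workers=2)
client = Client(cluster)

# Build a toy dataset and partition it across the workers
# as a Dask DataFrame.
X, y = make_classification(n_samples=10_000, n_features=20, random_state=42)
dX = dd.from_pandas(pd.DataFrame(X), npartitions=4)
dy = dd.from_pandas(pd.Series(y), npartitions=4)

# DaskLGBMClassifier coordinates one LightGBM training process
# per Dask worker.
model = lgb.DaskLGBMClassifier(client=client)
model.fit(dX, dy)
```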
Second, we looked at how to leverage a GPU with LightGBM. While the GPU setup is more involved, it can deliver significant speed-ups when a GPU is available. We also discussed some best practices for training LightGBM models on the GPU.
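The sketch below recaps the core of GPU training. It assumes a GPU-enabled build of LightGBM is installed; the parameter values shown are illustrative starting points rather than universal tuning advice:

```python
# Minimal sketch: training on the GPU, assuming LightGBM was built
# with GPU support.
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=10_000, n_features=50, random_state=42)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",
    "device_type": "gpu",   # requires a GPU-enabled LightGBM build
    "max_bin": 63,          # smaller bin counts tend to run faster on GPU
    "gpu_use_dp": False,    # single precision is usually faster
}
booster = lgb.train(params, train_set, num_boost_round=100)
```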