Training machine learning models with TPOT and Dask
Optimizing machine learning pipelines is, above all, a time-consuming process. Running tasks in parallel can shorten it, sometimes significantly. Dask and TPOT work well together, and this section will teach you how to train TPOT models on a Dask cluster. Don't let the word "cluster" scare you: your laptop or PC will be enough.
You'll have to install one more library to continue, and it is called dask-ml. As its name suggests, it's used to perform machine learning with Dask. Execute the following from the Terminal to install it:
pipenv install dask-ml
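If you'd like to confirm that the installation succeeded before moving on, a small check along the following lines can help. It is only a sketch: the helper name check_modules and the list of modules are illustrative choices, and it assumes that dask, distributed, and tpot were already installed earlier. The dask-ml package is imported as dask_ml. The check uses only the standard library and looks the modules up without importing them:

```python
from importlib.util import find_spec

# Modules this section relies on. Assumption: everything except dask_ml
# was installed earlier; adjust the tuple to match your environment.
REQUIRED_MODULES = ("dask", "distributed", "dask_ml", "tpot")


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be located."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it inside the project's virtual environment (for example, through pipenv run python), since a system-wide interpreter won't see packages installed by pipenv.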
Once that's done, you can open up JupyterLab or your favorite Python code editor and start coding. Let's get started:
- We'll begin with the library imports and choose a dataset. This time, we won't spend any time on data cleaning, preparation, or examination. The goal is to have...