Using pandas profiling
Pandas profiling is a widely used library for profiling data. It goes beyond the techniques covered so far by producing an HTML report with valuable information for both the data scientist and the data modeler. This recipe shows you how to install and use pandas profiling. We will take the DataFrame initialized in the first recipe and use the library to generate rich visual data profiling reports.
Getting ready
This recipe uses Azure Databricks. If you are using a trial Azure subscription, you will need to upgrade it to a Pay-As-You-Go subscription, because Azure Databricks requires eight cores of computing resources and the trial subscription provides only four. If you are using an Enterprise or MSDN Azure subscription, it should contain enough resources for Azure Databricks. You also need to be the administrator of the cluster, as we'll install some libraries on...