EDA and visualization libraries in Python
Python has several EDA and visualization packages, and we will cover some of the most widely used ones here. We have already seen some of what we can do with pandas. For further EDA, we will look at the pandas-profiling Python library, which can automate EDA plots and statistics for us with a few lines of code. However, for visualizations that we might use in reports or presentations, we should make things more polished and precise with custom plots.
For more polished plots, we can use one of several plotting packages in Python, depending on our use case. The original base-level plotting package in Python is matplotlib. It is essentially the most basic way to make plots and visualizations in Python, and although it's simple to use for small tasks, it becomes difficult for complex plots. For example, plotting time series, adding text annotations, and combining multiple subplots can all make matplotlib a pain...
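To give a feel for the three tasks just mentioned, here is a minimal sketch that plots a time series, adds a text annotation, and stacks two subplots with matplotlib. The data is synthetic (a seeded random walk made up for illustration), and the variable names such as `ax_top` and `ax_bottom` are our own choices, not anything prescribed by the library.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic daily series (made-up data, for illustration only)
rng = np.random.default_rng(42)
dates = pd.date_range("2021-01-01", periods=90, freq="D")
values = pd.Series(rng.normal(0, 1, size=90).cumsum(), index=dates)

# Two stacked subplots that share the same date axis
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)

# Top panel: the raw time series
ax_top.plot(values.index, values.values)
ax_top.set_ylabel("value")

# Text annotation with an arrow pointing at the maximum of the series
peak_date = values.idxmax()
ax_top.annotate(
    "peak",
    xy=(peak_date, values.max()),
    xytext=(0, 20),
    textcoords="offset points",
    ha="center",
    arrowprops={"arrowstyle": "->"},
)

# Bottom panel: a 7-day rolling mean of the same series
rolling = values.rolling(7).mean()
ax_bottom.plot(rolling.index, rolling.values)
ax_bottom.set_ylabel("7-day mean")

# Rotate the date tick labels and tidy the spacing between panels
fig.autofmt_xdate()
fig.tight_layout()
```

To view the result, we could call `fig.savefig("timeseries_example.png")` or, in an interactive session with a display, `plt.show()`. Even in this small example, a fair share of the code goes to layout, date formatting, and annotation placement rather than to the data itself.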