Introduction
Data visualization is the art and science of telling captivating stories with data. Today's developers and data scientists, irrespective of their operational domain, agree that communicating insights effectively using data visualization is very important.
Data scientists are always looking for better ways to communicate their findings through captivating visualizations. The type of visualization varies with the domain, which often means employing the specific libraries and tools that best suit the visualization needs. Developers and data scientists are therefore looking for a comprehensive resource containing quick, actionable information on this topic. Resources for learning interactive data visualization are scarce, however, and the available materials either deal with tools other than Python (for example, Tableau) or focus on a single Python library for visualization. This book is designed to be accessible to anyone who is well-versed in Python.
Why Python? While most languages have associated packages and libraries built specifically for visualization tasks, Python is uniquely empowered to be a convenient tool for data visualization. Python performs advanced numerical and scientific computations with libraries such as numpy and scipy, hosts a wide array of machine learning methods owing to the availability of the scikit-learn package, provides a great interface for big data manipulation due to the availability of the pandas package and its compatibility with Apache Spark, and generates aesthetically pleasing plots and figures with libraries such as seaborn, plotly, and more.
The book will demonstrate the principles and techniques of effective interactive visualization through relatable case studies and aims to enable you to become confident in creating your own context-appropriate interactive data visualizations using Python. Before diving into the different visualization types and introducing interactivity features (which, as we will see in this book, will play a very useful role in certain scenarios), it is essential to go through the basics, especially with the pandas and seaborn libraries, which are popularly used in Python for data handling and visualization.
This chapter serves as a refresher and one-stop resource for reviewing these basics. Specifically, it illustrates creating and handling pandas DataFrames, the basics of plotting with pandas and seaborn, and tools for manipulating plotting style and enhancing the visual appeal of your plots.
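As a small taste of the DataFrame handling this chapter reviews, the following is a minimal sketch rather than an example taken from the chapter itself. The column names and values are made up purely for illustration, and the only assumption is that pandas is installed.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Pune", "Oslo"],
        "sales": [120, 80, 95, 60],
    }
)

# Add a derived column: each row's share of the total sales.
df["share"] = df["sales"] / df["sales"].sum()

# Aggregate: total sales per city, largest first.
totals = df.groupby("city")["sales"].sum().sort_values(ascending=False)

print(df.head())
print(totals)
```

If matplotlib is also available, a call such as totals.plot(kind="bar") should draw a basic bar chart of the aggregated values, which is the kind of pandas plotting the chapter goes on to cover.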
Note
Some of the images in this chapter have colored notations. You can find high-quality color versions of the images used in this chapter at: https://github.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/tree/master/Graphics/Lesson1.