Summary
In this chapter, we explored how to use DuckDB for hands-on data analysis. Working with the Melbourne Pedestrian Counting System dataset, we added two more tools to our toolchain, JupySQL and Plotly. Combined with DuckDB, they enabled us to perform exploratory data analysis that uncovered a range of insights into pedestrian traffic through central Melbourne.
We started by preparing the Melbourne Pedestrian Counting System dataset for analysis and loading it into a persistent DuckDB database. Then, we looked at two open-source tools that support effective data analysis within Jupyter Notebooks: JupySQL, which allows us to conveniently run SQL queries in Jupyter Notebooks, and Plotly, a library for producing interactive visualizations with strong Jupyter Notebook support. With our dataset loaded into DuckDB and some handy tooling in place, we jumped into exploratory data analysis of the Melbourne Pedestrian...