Congratulations, we have covered the fundamentals of joining and merging data in both SQL and Python using pandas DataFrames. Along the way, we worked through practical examples against user hits data, looking at which joins to use and why you should use them. Enriching our data by blending multiple data tables allows deeper analysis and lets us answer many more questions than the original single data source could. After learning about joins and the merge() function, we uncovered the advantages and disadvantages of data aggregation and walked through practical examples of using the groupby feature in both SQL and DataFrames. Finally, we looked at the differences between the statistical functions mean, median, and mode, along with tips for finding outliers in your data by comparing results to a normal distribution bell curve.
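As a quick recap, the join, aggregation, and statistics steps can be sketched together in a few lines of pandas. This is only a minimal illustration: the hits and users tables, their column names, and their values are made up here and are not the chapter's dataset.

```python
import pandas as pd

# Hypothetical user hits data; table names, columns, and values are illustrative only.
hits = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "hits": [10, 12, 7, 3, 5, 3],
})
users = pd.DataFrame({
    "user_id": [1, 2, 3],
    "region": ["east", "west", "east"],
})

# Left join: keep every row from hits and attach the matching region.
enriched = hits.merge(users, on="user_id", how="left")

# Aggregate the joined data: mean and median hits per region.
summary = enriched.groupby("region")["hits"].agg(["mean", "median"])

# mode() returns a Series because more than one value can tie for most frequent.
most_common = enriched["hits"].mode().tolist()

print(summary)
print(most_common)
```

A left join is used in this sketch so that no hit rows are dropped when a user has no match in the second table; an inner join would keep only the rows that match in both tables.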
In our next chapter, we will return to plotting libraries and visualizing data.