Selecting the lowest-budget movies from the top 100
Now that we have covered many of the core pandas algorithms at a theoretical level, we can start looking at more “real-world” datasets and touch on common ways to explore them.
Top N analysis is a common technique in which you filter your data based on how it performs when measured by a single variable. Most analytics tools can help you filter your data to answer questions like “What are the top 10 customers by sales?” or “What are the 10 products with the lowest inventory?” When chained together, these filters can even produce catchy news headlines such as “Out of the Top 100 Universities, These 5 Have the Lowest Tuition Fees” or “From the Top 50 Cities to Live, These 10 Are the Most Affordable.”
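As a minimal sketch of the chained idea, the snippet below takes the top 4 rows by one column and then the 2 lowest of those by another, using pandas' pd.DataFrame.nlargest and pd.DataFrame.nsmallest. The universities table, its column names, and its values are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
universities = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F"],
    "score": [95, 90, 88, 70, 85, 60],
    "tuition": [50000, 20000, 45000, 10000, 30000, 5000],
})

# Step 1: keep the 4 highest-scoring universities
top4 = universities.nlargest(4, "score")

# Step 2: from those 4, keep the 2 with the lowest tuition
cheapest2 = top4.nsmallest(2, "tuition")

print(cheapest2)
```

Note that universities D and F have the lowest tuition overall, but they are excluded because they never make it into the top 4 by score. The order of the two filters matters.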
Given how common these types of analyses are, pandas offers built-in functionality to help you easily perform them. In this recipe, we will take a look at pd.DataFrame.nlargest and pd.DataFrame.nsmallest and...