Calculating boolean statistics
When first working with boolean Series, it can be informative to calculate basic summary statistics on them. Each value of a boolean Series evaluates to 0 (False) or 1 (True), so the Series methods that work with numerical values also work with booleans.
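As a minimal sketch of this idea, the snippet below builds a small boolean Series by hand and applies two numerical methods to it. The values and the names flags, n_true, and share_true are made up for illustration and do not come from the movie dataset.

```python
import pandas as pd

# Hypothetical values, for illustration only
flags = pd.Series([True, False, True, True])

# True counts as 1 and False as 0, so sum() counts the True values
n_true = flags.sum()

# mean() is the sum divided by the length, i.e. the share of True values
share_true = flags.mean()

print(n_true)      # 3
print(share_true)  # 0.75
```

Because the mean of 0/1 values is a proportion, it is often the quickest way to see how frequently a condition holds.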
Getting ready
In this recipe, we create a boolean Series by applying a condition to a column of data and then calculate summary statistics from it.
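The general pattern can be sketched on a tiny, invented DataFrame before turning to the real data. The column values, the index labels, and the names df, is_long, counts, and summary are assumptions chosen for this example.

```python
import pandas as pd

# Hypothetical data standing in for a real column
df = pd.DataFrame({'duration': [95, 130, 121, 88, 150]},
                  index=['A', 'B', 'C', 'D', 'E'])

# Applying a condition to a column returns a boolean Series
# that shares the DataFrame's index
is_long = df['duration'] > 120

# Count how many values are True and how many are False
counts = is_long.value_counts()

# For boolean data, describe() reports count, unique, top, and freq
summary = is_long.describe()

print(counts)
print(summary)
```

Here three of the five invented durations exceed 120, so True is the most frequent value in the resulting Series.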
How to do it...
- Read in the movie dataset, set the index to the movie title, and inspect the first few rows:
>>> import pandas as pd
>>> movie = pd.read_csv('data/movie.csv', index_col='movie_title')
>>> movie.head()
- Determine whether the duration of each movie is longer than two hours by using the greater than comparison operator with the duration Series:
>>> movie_2_hours = movie['duration'] > 120
>>> movie_2_hours.head(10)
movie_title
Avatar                                      True
Pirates of the Caribbean: At World's End    True
Spectre ...