Looping through data with itertuples (an anti-pattern)
In this recipe, we will iterate over the rows of a DataFrame and generate our own totals for a variable. In subsequent recipes in this chapter, we will use NumPy arrays, and then some pandas-specific techniques, to accomplish the same tasks.
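As a rough sketch of what iterating over rows to build totals can look like, the following loops over a small DataFrame with `itertuples` and accumulates a total per group. The data, the column names (`location`, `new_cases`), and the variable names are hypothetical and chosen only for illustration; the rows are assumed to be already sorted by group.

```python
import pandas as pd

# Hypothetical data, already sorted by location (an assumption of this sketch)
df = pd.DataFrame({
    "location": ["A", "A", "A", "B", "B"],
    "new_cases": [1, 2, 3, 10, 20],
})

totals = []            # one dict per group, collected as we go
prev_location = None   # group of the previous row
running = 0            # running total for the current group

for row in df.itertuples(index=False):
    # When the group changes, store the finished total and start over
    if row.location != prev_location:
        if prev_location is not None:
            totals.append({"location": prev_location, "total_cases": running})
        running = 0
        prev_location = row.location
    running += row.new_cases

# Store the total for the final group
if prev_location is not None:
    totals.append({"location": prev_location, "total_cases": running})

totals_df = pd.DataFrame(totals)
print(totals_df)
```

The same totals could be produced with `df.groupby("location")["new_cases"].sum()`; the loop is shown only to make the row-by-row logic explicit.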
It may seem odd to begin this chapter with a technique that we are often cautioned against using. But I used to do the equivalent of looping every day 35 years ago in SAS, and on select occasions as recently as 10 years ago in R. That is why I still find myself thinking conceptually about iterating over rows of data, sometimes sorted by groups, even though I rarely implement my code in this manner. I think it is good to hold onto that conceptualization, even when using other pandas methods that work more efficiently for us.
I do not want to leave the impression that pandas-specific techniques are always markedly more efficient either. pandas users probably find themselves using apply...