Selecting and organizing columns
In this recipe, we explore several ways to select one or more columns from a DataFrame. We can select columns by passing a list of column names to the [] bracket operator, or by using the pandas-specific loc and iloc data accessors.
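As a quick orientation, the three approaches can be sketched on a small made-up DataFrame. The column names and values below are illustrative assumptions only; they are not taken from the NLS file used in the recipe.

```python
import pandas as pd

# Hypothetical toy data; names and values are assumptions for illustration.
df = pd.DataFrame(
    {
        "gender": ["Female", "Male", "Female"],
        "birthyear": [1980, 1982, 1984],
        "gpaoverall": [3.1, 2.8, 3.6],
        "weeksworked": [48, 52, 30],
    },
    index=[101, 102, 103],
)

# 1. Bracket operator with a list of column names returns a DataFrame.
by_brackets = df[["gender", "gpaoverall"]]

# A single name (no list) returns a Series instead.
gender_series = df["gender"]

# 2. loc selects by label; a label slice includes both endpoints.
by_loc = df.loc[:, "birthyear":"weeksworked"]

# 3. iloc selects by integer position; here, the first and third columns.
by_iloc = df.iloc[:, [0, 2]]

print(by_brackets.columns.tolist())
print(by_loc.columns.tolist())
print(by_iloc.columns.tolist())
```

Note that a label slice with loc includes the stop label, whereas a positional slice with iloc would exclude the stop position, as ordinary Python slicing does.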
When cleaning data or doing exploratory or statistical analysis, it helps to focus on the variables that are relevant to the question at hand. This makes it important to group columns according to their substantive or statistical relationships with each other, or to limit the number of columns we investigate at any one time. How many times have we asked ourselves something like, “Why does variable A have a value of x when variable B has a value of y?” We can only answer that kind of question when the amount of data we are viewing at a given moment does not exceed what we can take in at once.
Getting ready…
We will continue working with the National Longitudinal Survey (NLS) data in...