Slicing lexicographically
The .loc attribute typically selects data based on the exact string label of the index. However, it can also select data based on the lexicographic order of the index values. Specifically, slice notation inside .loc selects all rows whose index labels fall lexicographically between the two endpoints of the slice. This only works if the index is sorted.
In this recipe, you will first sort the index and then use slice notation inside the .loc indexer to select all rows between two strings.
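As a rough illustration of the idea before working with the real dataset, the sketch below uses a small, made-up DataFrame. The institution names and the `STABBR` values are hypothetical stand-ins, not rows from the college data. It sorts the index and then slices between two strings that are not themselves labels in the index.

```python
import pandas as pd

# Hypothetical toy data; these names are invented for illustration only
toy = pd.DataFrame(
    {"STABBR": ["WA", "CA", "MA", "NY", "SC", "AL"]},
    index=pd.Index(
        [
            "Spokane College",
            "Stanford Institute",
            "Springfield College",
            "Syracuse Academy",
            "Spartan Institute",
            "Selma Academy",
        ],
        name="INSTNM",
    ),
)

# Sort the index so that label-based slicing can use lexicographic order
toy_sorted = toy.sort_index()

# Select every row whose label falls between "Sp" and "Su" (inclusive bounds)
subset = toy_sorted.loc["Sp":"Su"]
print(subset.index.tolist())
```

Under these assumptions, the slice keeps the labels starting with "Spa", "Spo", "Spr", and "Sta", and drops "Selma Academy" (before "Sp") and "Syracuse Academy" (after "Su").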
How to do it…
- Read in the college dataset, and set the institution name as the index:
>>> college = pd.read_csv(
...     "data/college.csv", index_col="INSTNM"
... )
- Attempt to select all colleges with names lexicographically between Sp and Su:
>>> college.loc["Sp":"Su"]
Traceback (most recent call last):
...
ValueError: index must be monotonic increasing or decreasing
During handling...
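The failure can be reproduced on a small scale without the dataset. The following is a minimal sketch using invented labels in a deliberately non-monotonic order. Which exception type is finally surfaced may depend on the pandas version, so the sketch catches both `KeyError` and `ValueError` rather than assuming one.

```python
import pandas as pd

# Made-up labels, ordered so the index is neither increasing nor decreasing
toy = pd.DataFrame(
    {"STABBR": ["WA", "NY", "AL"]},
    index=pd.Index(
        ["Spokane College", "Syracuse Academy", "Selma Academy"],
        name="INSTNM",
    ),
)

# Check whether the index is sorted before attempting a lexicographic slice
print(toy.index.is_monotonic_increasing)  # False

try:
    toy.loc["Sp":"Su"]
    failed = False
except (KeyError, ValueError) as err:
    # The slice endpoints are not labels, and the unsorted index
    # gives pandas no way to locate where they would fall
    failed = True
    print(type(err).__name__)
```

Checking `is_monotonic_increasing` first is a cheap way to find out whether a slice like this can succeed before running it.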