Pandas indexes allow efficient lookup of values. If indexes did not exist, a linear search across all of our data would be required. Indexes create optimized shortcuts to specific data items using a direct lookup instead of a search process.
To begin examining the value of indexes, we will use the following DataFrame of 10,000 random numbers:
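The original listing is not reproduced here, so the following is only a minimal sketch of such a DataFrame. The column name `foo`, the seed value, and the assumption that `key` holds consecutive integers starting at 100 (so that 10099 lands in the last row) are illustrative choices, not confirmed by the text.

```python
import numpy as np
import pandas as pd

# Seed the generator so the random numbers are reproducible
# (the seed value itself is an arbitrary assumption).
np.random.seed(123456)

# 10,000 rows: 'foo' (hypothetical name) holds the random numbers,
# 'key' is assumed to run consecutively from 100 to 10099.
df = pd.DataFrame({'foo': np.random.random(10000),
                   'key': range(100, 10100)})

# Show the last few rows; the final row carries key == 10099.
print(df.tail())
```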
Suppose we want to look up the random number in the row where key == 10099 (I picked this value deliberately because it is the last row in the DataFrame). We can do this using a Boolean selection.
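A Boolean selection of this kind might look like the sketch below. It rebuilds the DataFrame under the same assumptions as before (a hypothetical `foo` column and consecutive `key` values from 100 to 10099) so that it runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the assumed DataFrame (column names and seed are illustrative).
np.random.seed(123456)
df = pd.DataFrame({'foo': np.random.random(10000),
                   'key': range(100, 10100)})

# Compare every value in 'key' against 10099, producing one
# True/False entry per row.
mask = df['key'] == 10099

# Keep only the rows where the mask is True.
result = df[mask]
print(result)
```

Note that the comparison touches every row to build the mask, which is why this approach behaves like a search rather than a direct lookup.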
Conceptually, this is simple. But what if we want to do this repeatedly? We can measure the cost of repeated lookups using %timeit, which is an IPython magic command rather than a Python statement. The following code performs the lookup repeatedly and reports on the performance.
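In an IPython session, the measurement would typically be a one-liner such as `%timeit df[df.key == 10099]`. Because `%timeit` only works inside IPython, the sketch below uses the standard library's `timeit` module as a stand-in to approximate the same idea in plain Python: three runs of 1,000 lookups each, reporting the fastest run. The DataFrame layout and the helper name `lookup` are assumptions made for illustration.

```python
import timeit

import numpy as np
import pandas as pd

# Rebuild the assumed DataFrame (column names and seed are illustrative).
np.random.seed(123456)
df = pd.DataFrame({'foo': np.random.random(10000),
                   'key': range(100, 10100)})


def lookup():
    """Hypothetical helper: one Boolean-selection lookup of key 10099."""
    return df[df['key'] == 10099]


# Run 1,000 lookups per run, three runs in total. Each entry in
# `timings` is the total number of seconds one run took.
timings = timeit.repeat(lookup, repeat=3, number=1000)

# Take the fastest run and divide by the loop count to get the
# average time per lookup within that run.
best_per_loop = min(timings) / 1000
print(f"1000 loops, best of 3: {best_per_loop * 1e6:.1f} microseconds per loop")
```

The absolute numbers depend on the machine and the pandas version, so they are useful mainly for comparing one lookup strategy against another on the same system.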
This result states that there are 1,000 executions performed three times, and the fastest of those three took lookup...