Pandas indexes allow efficient lookup of values. Without an index, finding a value would require a linear search across all of our data. An index instead provides an optimized shortcut to specific data items, using a direct lookup rather than a search process.
To begin examining the value of indexes, we will use the following DataFrame of 10,000 random numbers:
![](https://static.packt-cdn.com/products/9781787123137/graphics/assets/bed8d3a7-33f4-454b-b624-931dfbb5a230.png)
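For readers who cannot view the screenshot, here is a minimal sketch of how such a DataFrame might be built. The column name `foo`, the seed, and the assumption that the keys run consecutively from 100 (so that the last of the 10,000 rows has key 10099) are illustrative guesses, not taken from the original code.

```python
import numpy as np
import pandas as pd

# Assumed seed, used only so the sketch is reproducible.
rng = np.random.default_rng(123456)

# 10,000 random numbers in a hypothetical column `foo`.
# Keys are assumed to be consecutive integers from 100 to 10099,
# which places key 10099 in the last row.
df = pd.DataFrame({
    "foo": rng.random(10000),
    "key": range(100, 10100),
})

print(df.head())
print(len(df))
```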
Suppose we want to look up the value of the random number where key==10099 (I deliberately picked this value because it is in the last row of the DataFrame). We can do this using a Boolean selection.
![](https://static.packt-cdn.com/products/9781787123137/graphics/assets/4d145979-47c0-4217-9edf-c0814037ef08.png)
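A Boolean selection of this kind might look like the following sketch. It rebuilds a stand-in DataFrame under the same assumptions as before (a hypothetical `foo` column and consecutive keys starting at 100), so the exact random value will differ from the one in the screenshot.

```python
import numpy as np
import pandas as pd

# Stand-in DataFrame; `foo`, the seed, and the key range are assumptions.
df = pd.DataFrame({
    "foo": np.random.default_rng(123456).random(10000),
    "key": range(100, 10100),
})

# Compare every key against the target. This produces a Boolean
# Series with one entry per row, so all 10,000 rows are examined.
mask = df.key == 10099

# Keep only the rows where the mask is True.
result = df[mask]
print(result)
```

Because the mask is computed over the whole column, the cost of this lookup grows with the number of rows, which is the linear search described earlier.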
Conceptually, this is simple. But what if we want to do this repeatedly? This can be simulated in IPython (or a Jupyter notebook) using the %timeit magic command, which is an IPython feature rather than a Python statement. The following code performs the lookup repeatedly and reports on the performance.
![](https://static.packt-cdn.com/products/9781787123137/graphics/assets/e9f16783-bc4c-4559-8bb5-f08f991c6902.png)
This result states that there are 1,000 executions performed three times, and the fastest of those three took lookup...
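Outside IPython, where %timeit is not available, a similar measurement can be approximated with the standard library's `timeit` module. The sketch below is an assumption-laden stand-in rather than the original code: it reuses the hypothetical `foo` column and key range from earlier and fixes the counts at 1,000 executions repeated three times, whereas %timeit normally picks the loop count itself. Timings will vary by machine.

```python
import timeit

import numpy as np
import pandas as pd

# Stand-in DataFrame; `foo`, the seed, and the key range are assumptions.
df = pd.DataFrame({
    "foo": np.random.default_rng(123456).random(10000),
    "key": range(100, 10100),
})

# Run the Boolean selection 1,000 times per batch, for three batches.
# timeit.repeat returns the total elapsed seconds for each batch.
timings = timeit.repeat(lambda: df[df.key == 10099], number=1000, repeat=3)

# Report the fastest batch, converted to time per single lookup.
best_per_lookup = min(timings) / 1000
print(f"best of 3: {best_per_lookup * 1e6:.1f} microseconds per lookup")
```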