What is an index?

The easiest analogy for understanding what an index is to think of the index you would get in a text book.

The index in a text book gives a quicker way of finding data in a large book using a keyword.

Indexes in computer storage like databases are the same thing that just happen to be implemented using data-structures - these data-structures could be stored in memory (in memory index) or on disc.

There are many different ways of creating indexes like btrees or hash tables that have different properties and tradeoffs but in all cases a index typically means that:

  • Some CPU cycles need to be consumed to make the index - i.e. some work needs to be done to scan through all the data and build up an index.

  • Some space needs to be consumed by that index - could just be space in memory or if the index is stored on disc it will take up disc space.

It’s just like an index in a book - it took someone time (CPU) to write the index and it takes up space (pages) in the book. Also indexes can be optimized depending on the purpose. For instance one could build an index for an encyclopedia that addresses the needs a model train enthusiast to find out concepts related to their unique interest - it helps to step back away from the technicalities of how we implement indexes in software to recognize this.