Indexing Strategies

A key concept to keep in mind when looking at indexing strategies is separation of concerns: each index should have a clear purpose.

Here are some reasons we might need indexes:

  • Faster access to sections of a large file of records to facilitate random access.

  • Finding and showing the log entries related to a given message.

  • Speeding up and optimizing searches.
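The first use case, random access into a large file of records, can be sketched as follows. This is a minimal illustration rather than a definitive design: it assumes newline-terminated records, and the names `build_sparse_index`, `read_record` and the `every` parameter are hypothetical. The index keeps the byte offset of every Nth record, so a lookup seeks to the nearest indexed record and scans forward from there.

```python
import io


def build_sparse_index(f, every=100):
    """Scan a binary file of newline-terminated records and remember
    the byte offset of every `every`-th record (assumed format)."""
    offsets = []
    f.seek(0)
    record_no = 0
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            break
        if record_no % every == 0:
            offsets.append(pos)
        record_no += 1
    return offsets


def read_record(f, offsets, n, every=100):
    """Return record n by seeking to the nearest indexed offset at or
    before it, then skipping forward at most `every - 1` records."""
    slot = n // every
    if slot >= len(offsets):
        raise IndexError(n)
    f.seek(offsets[slot])
    for _ in range(n % every):
        if not f.readline():
            raise IndexError(n)
    line = f.readline()
    if not line:
        raise IndexError(n)
    return line.rstrip(b"\n")


# Usage with an in-memory file standing in for a large record file.
data = io.BytesIO(b"".join(b"record-%d\n" % i for i in range(1000)))
offsets = build_sparse_index(data, every=100)
record = read_record(data, offsets, 250, every=100)
```

A smaller `every` gives faster lookups at the cost of a larger index, which is one of the trade-offs to weigh for each index.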

There are lots of choices we can make with indexes:

  • We can choose whether indexes are stored persistently on disk or generated on the fly in memory.

    • If we do store them on disk, we can choose how long to persist them. For instance, it may make sense not to heavily index older data that is effectively archived.

  • We can go for very simple index file structures. For instance, to index the entries related to a message, the structure could simply be a flat, binary-encoded list of the entries related to that record, stored in a file whose name maps to that record.

  • We can go for classic database index structures such as B-trees.
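The simple flat-file option above might look something like the sketch below. Everything specific here is an assumption made for illustration: each related entry is encoded as an 8-byte little-endian offset, the index file is named `<message_id>.rel`, and the helper names `add_related` and `related_entries` are hypothetical.

```python
import struct
from pathlib import Path

# Assumed encoding: one related entry = one unsigned 64-bit
# little-endian byte offset into the main record file.
ENTRY = struct.Struct("<Q")


def index_path(index_dir, message_id):
    """Map a message id to its index file name (hypothetical scheme)."""
    return Path(index_dir) / f"{message_id}.rel"


def add_related(index_dir, message_id, entry_offset):
    """Append one related-entry offset to the message's index file."""
    path = index_path(index_dir, message_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "ab") as f:
        f.write(ENTRY.pack(entry_offset))


def related_entries(index_dir, message_id):
    """Read back all related-entry offsets, in the order they were added."""
    path = index_path(index_dir, message_id)
    if not path.exists():
        return []
    data = path.read_bytes()
    return [offset for (offset,) in ENTRY.iter_unpack(data)]
```

Because entries are fixed-width and only ever appended, the file needs no header or internal structure. The sketch does not handle a partially written trailing entry, which a real implementation would need to detect.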

 
