Optimizing I/O by Reading & Writing in Large Chunks

A foundational concept in efficient I/O is that it is more efficient to write a large chunk of data in a single write call than to write the same amount of data in many smaller writes.

The reason is that each write call carries a fixed overhead; writing all the data in a single call pays that overhead once instead of many times.
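
A minimal sketch of the effect (the file names, payload size, and chunk size below are arbitrary and not from the original document): it times many small unbuffered writes against one large write of the same data, so each call to `os.write` is a real system call.

```python
import os
import time

PAYLOAD = b"x" * 1_000_000  # 1 MB of data to write
CHUNK = 100                 # small chunk size in bytes

def time_writes(path, chunks):
    """Write each chunk with its own os.write() call and return elapsed seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    start = time.perf_counter()
    for chunk in chunks:
        os.write(fd, chunk)  # one system call per chunk
    os.fsync(fd)             # force the data to disk so real I/O is measured
    elapsed = time.perf_counter() - start
    os.close(fd)
    return elapsed

small_chunks = [PAYLOAD[i:i + CHUNK] for i in range(0, len(PAYLOAD), CHUNK)]
print("many small writes:", time_writes("small.bin", small_chunks))
print("one large write:  ", time_writes("large.bin", [PAYLOAD]))
```

On most systems the single large write is dramatically faster, even though exactly the same bytes end up on disk.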

For instance, when writing data to a hard disk there is “seek” time: the time it takes the drive head to physically move to the location on the platter where the data will be written. Seek time dominates the cost of a write, so writing 10 bytes or 100,000 bytes in a single write takes roughly the same time.

Another example is TCP/IP. Each packet carries header and routing overhead, so many small packets mean far more overhead for the same payload than one large packet. This is the idea behind Nagle's algorithm, which coalesces small writes into larger packets before sending them.
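
A hedged sketch of the same idea at the application level (the host and port are placeholders, not a real service): instead of one send per small message, the messages are joined and sent with a single `sendall`, which is roughly what Nagle's algorithm does for you inside the kernel. `TCP_NODELAY` is shown only to illustrate where the Nagle behaviour is toggled.

```python
import socket

messages = [b"item-%d\n" % i for i in range(1000)]

# Placeholder address; replace with a real server to actually run this.
with socket.create_connection(("example.com", 9000)) as sock:
    # Nagle's algorithm is ON by default; setting TCP_NODELAY to 1 disables it,
    # which makes the cost of many tiny sends much more visible.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    # Naive: one send per message, potentially one small packet each.
    # for msg in messages:
    #     sock.sendall(msg)

    # Better: batch everything into one buffer and send it in a single call.
    sock.sendall(b"".join(messages))
```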

Another example is inserting many rows into a database in a single transaction rather than committing each row individually.
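
A small sketch using Python's built-in sqlite3 module (the table and data are made up for illustration): all rows are inserted inside one transaction and committed once, rather than paying for a commit, and the flush to disk it implies, after every row.

```python
import sqlite3

rows = [(i, f"payload-{i}") for i in range(10_000)]

conn = sqlite3.connect("example.db")
conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, body TEXT)")

# Slow: each commit forces the database to flush to disk.
# for row in rows:
#     conn.execute("INSERT OR REPLACE INTO items VALUES (?, ?)", row)
#     conn.commit()

# Fast: one transaction, one commit, one (much larger) flush.
with conn:  # the context manager wraps the block in a single transaction
    conn.executemany("INSERT OR REPLACE INTO items VALUES (?, ?)", rows)

conn.close()
```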

This idea motivates the design of both the old queue and the new queue: each aims for fewer, larger writes rather than many small ones.
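
The queue implementations themselves are not shown here, but the general pattern is easy to sketch: buffer items in memory and flush them to the underlying file in one write once a threshold is reached. The class name and threshold below are illustrative, not the actual queue code.

```python
class BatchingWriter:
    """Illustrative buffer-and-flush pattern: many logical appends, few physical writes."""

    def __init__(self, path, flush_threshold=64 * 1024):
        self._file = open(path, "ab", buffering=0)  # unbuffered: every write hits the OS
        self._buffer = bytearray()
        self._flush_threshold = flush_threshold

    def append(self, record: bytes) -> None:
        self._buffer += record
        if len(self._buffer) >= self._flush_threshold:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._file.write(bytes(self._buffer))  # one large write instead of many small ones
            self._buffer.clear()

    def close(self) -> None:
        self.flush()
        self._file.close()
```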

See CPU bound versus I/O Bound.