Reading files in chunks
Python is very good at reading and writing files and file-like objects. For example, if you open a big file, say a few hundred MB, on a modern machine with at least 2 GB of RAM, Python will handle it without any issue: iterating over the file object does not load everything into memory at once, but reads the file lazily, as needed.
So even with sizable files, something as simple as the following code works straight out of the box:
with open('/tmp/my_big_file', 'r') as bigfile:
    for line in bigfile:
        print(line)  # or any other line-based operation
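Since the topic here is chunked reading, it helps to make the memory behavior explicit. The following is a minimal sketch that reads a file in fixed-size chunks via read(); the 64 KB chunk size is an arbitrary illustrative choice, and process() is a hypothetical stand-in for whatever you do with each chunk:

CHUNK_SIZE = 64 * 1024  # 64 KB per read; an arbitrary illustrative value

with open('/tmp/my_big_file', 'rb') as bigfile:
    while True:
        chunk = bigfile.read(CHUNK_SIZE)  # returns at most CHUNK_SIZE bytes
        if not chunk:                     # empty result means end of file
            break
        process(chunk)                    # hypothetical handler, e.g. a hash or parser

With this pattern, only CHUNK_SIZE bytes are held in memory at a time, no matter how large the file is.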
But if we want to jump to a particular place in the file or do other nonsequential reading, we need a more hands-on approach, using I/O methods such as seek(), tell(), read(), and next(), which offer enough flexibility for most use cases. Most of these methods are thin bindings to C implementations (and are OS-specific), so they are fast, but their behavior can vary depending on the OS we are running on.
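As a quick sketch of this nonsequential style (the byte offsets here are arbitrary, and the file is assumed to be at least a few KB long):

import os

with open('/tmp/my_big_file', 'rb') as bigfile:
    bigfile.seek(1024)               # jump to byte offset 1024 from the start
    header = bigfile.read(16)        # read 16 bytes at that position

    pos = bigfile.tell()             # remember the current offset
    bigfile.seek(-8, os.SEEK_END)    # jump to 8 bytes before the end of the file
    tail = bigfile.read(8)           # read the last 8 bytes
    bigfile.seek(pos)                # return to the saved offset

Note that seeking relative to the end requires the file to be opened in binary mode; in text mode, Python 3 only allows seeks relative to the start of the file (plus seek(0, 2) to jump to the end).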