IO Objects¶
Definition¶
An IO object represents an open file.
The location of that file may be one of:
- On-disk
- Standard input
- Standard output
- Standard error
- In-memory buffer
- Sockets
- Pipes
- etc…
IO objects are also referred to as “file objects”, “file-like objects” or “streams”.
There are three basic types of IO objects you can work with:
- Text
Reads and writes string objects.
Bytes from the backing store are decoded back to string on read and encoded into bytes on write. Newlines are optionally translated.
- Binary
Reads and writes bytes objects.
No encoding, decoding or newline translation is performed.
- Raw
Also called unbuffered IO.
This is a low-level building block class. It is rarely needed and won’t be discussed further.
The io library contains the API definitions for the different types of IO objects.
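The difference between text and binary streams is easiest to see by writing a string and reading it back both ways. The sketch below is illustrative only: the scratch file `demo.txt` and the temporary directory are assumptions, not part of the original examples.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text stream: str goes in and is encoded to bytes on the way to the file.
with open(path, mode="w", encoding="utf-8") as fh:
    chars_written = fh.write("café")

# Binary stream: the same file read back as raw, undecoded bytes.
with open(path, mode="rb") as fh:
    raw = fh.read()

# Text stream again: the bytes are decoded back into a str.
with open(path, mode="r", encoding="utf-8") as fh:
    text = fh.read()

print(chars_written)  # counts characters, not bytes
print(raw)
print(text)
```

Note that the text stream reports four characters written while the file holds five bytes, because `é` takes two bytes in UTF-8.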
Python pre-initializes three text streams for you:
- sys.stdin - standard input stream
- sys.stdout - standard output stream
- sys.stderr - standard error stream
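As a small sketch of how these pre-initialized streams can be used, the following writes to standard output and standard error. The message strings are made up for the illustration.

```python
import sys

# print() writes to sys.stdout unless told otherwise.
print("normal output")

# Send a diagnostic to the standard error stream instead.
print("something went wrong", file=sys.stderr)

# The streams are ordinary text IO objects, so write() works too.
count = sys.stdout.write("written directly\n")
```

Keeping diagnostics on `sys.stderr` lets a caller redirect regular output to a file without losing the error messages.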
Common IO Operations¶
Stream Opening¶
The canonical way to open a file is with the built-in function open(). Some, but not all, of its arguments are described below:
-
open(file, mode='r', encoding=None, newline=None)¶
file is a string or bytes object giving the path name to the file to open.
mode is optional and is a sequence of one or more characters that specifies how the file is opened. Refer to Table 15.
| Character | Meaning |
|---|---|
| ‘r’ | Open for reading (default) |
| ‘w’ | Open for writing, truncating the file first |
| ‘x’ | Like ‘w’ but fails if file already exists |
| ‘a’ | Open for appending to the end of the file |
| ‘b’ | Open in binary mode |
| ‘t’ | Open in text mode (default) |
| ‘+’ | Open for reading and writing (updating) |
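The mode characters in the table can be tried out with a short sketch like the one below. The scratch file `modes.txt` is a hypothetical name chosen for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, mode="w") as fh:   # create (or truncate) and write
    fh.write("one\n")

with open(path, mode="a") as fh:   # append; existing content is kept
    fh.write("two\n")

try:
    open(path, mode="x")           # exclusive creation: the file already exists
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

with open(path, mode="r") as fh:   # read back what the two writes produced
    content = fh.read()

print(repr(content))
print(exclusive_failed)
```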
encoding names the encoding to use to encode/decode the file (e.g. ‘utf-8’). The codecs module lists all the valid encodings you can use. If not given, uses the current locale encoding. Only used with text files.
newline controls how line separators are handled. By default universal newlines mode is enabled, which means:
- On input, lines can end in ‘\n’, ‘\r’ or ‘\r\n’ and they will be translated into ‘\n’.
- On output, any ‘\n’ is translated into the OS default line separator.
If this behavior is unsuitable for you, look into the newline argument to open().
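To make the translation concrete, here is a minimal sketch that writes mixed line endings as raw bytes and reads them back with and without universal newlines. The scratch file name is an assumption made for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "newlines.txt")

# Write the bytes in binary mode so nothing is translated on output.
with open(path, mode="wb") as fh:
    fh.write(b"unix\nold-mac\rwindows\r\n")

# Default (universal newlines): every line ending is read as '\n'.
with open(path) as fh:
    translated = fh.read()

# newline='': line endings are returned exactly as stored.
with open(path, newline="") as fh:
    untranslated = fh.read()

print(repr(translated))
print(repr(untranslated))
```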
Below are some typical calls to open():
To open a file and read text from it using universal newlines:
>>> fh = open('foo.txt')
To open a file and write bytes into it:
>>> fh = open('foo.bin', mode='bw')
Tip
Many things can go wrong when working with files, and most of them raise an exception (specifically OSError). Wrap file operations in a try construct so these exceptions do not crash your script, and use a with construct so the file is always closed.
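One possible shape for this, sketched under the assumption that the file simply does not exist (the path below is made up for the example), combines try and with:

```python
import os
import tempfile

# A path that is guaranteed not to exist (hypothetical, for illustration).
missing = os.path.join(tempfile.mkdtemp(), "does-not-exist.txt")

try:
    with open(missing) as fh:
        data = fh.read()
except OSError as exc:
    # FileNotFoundError, PermissionError and friends are subclasses of OSError.
    data = None
    error_name = type(exc).__name__
    print(f"could not read {missing}: {exc}")
```

Catching `OSError` covers the whole family of file-related failures; catch a specific subclass instead when only one failure is expected.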
Once the stream is open, you can get the name of it (i.e. the file name) from the name attribute:
>>> fh = open('foo.txt')
>>> fh.name
'foo.txt'
Stream Iterating¶
Text and binary streams support iterating over their lines using a for loop. The line separator for binary streams is always ‘\n’; for text streams it depends on the newline argument to open().
For example, assuming the text file foo.txt has the following content:
The first line.
The second line.
The third line.
And done!
You can iterate over the lines of the file:
>>> fh = open('foo.txt')
>>> for line in fh:
...     print(line, end='')
The first line.
The second line.
The third line.
And done!
If the binary file foo.bin has the following content:
\x10\x11\n\x12\x13
You can iterate over the lines of the file:
>>> fh = open('foo.bin', mode='br')
>>> for line in fh:
...     print(line)
b'\x10\x11\n'
b'\x12\x13'
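Iteration combines naturally with the with construct and built-ins such as `enumerate()`. The sketch below first creates its own copy of the sample text in a scratch file (a hypothetical path) so that it can run on its own.

```python
import os
import tempfile

# Hypothetical scratch file holding the sample text from above.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open(path, mode="w") as fh:
    fh.write("The first line.\nThe second line.\nThe third line.\nAnd done!")

numbered = []
with open(path) as fh:
    # The file is read one line at a time, not loaded all at once.
    for number, line in enumerate(fh, start=1):
        numbered.append(f"{number}: {line.rstrip(chr(10))}")  # chr(10) is '\n'

for entry in numbered:
    print(entry)
```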
Stream Flushing¶
By default, streams opened using open() are buffered, so written data may not appear on disk immediately. To force the buffered data out, call flush() on the file object, as in:
>>> fh = open('bar.txt', mode='w')
>>> fh.write('Hello World')
11
>>> fh.flush()
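The effect of buffering can be observed by checking the file size before and after the flush. This is only a sketch: the scratch file is hypothetical, and the size seen before flushing depends on the buffering in use (with the defaults it is typically 0 for a write this small).

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "bar.txt")

fh = open(path, mode="w")
fh.write("Hello World")
size_before = os.path.getsize(path)  # data may still sit in Python's buffer

fh.flush()                           # hand the buffered data to the OS
size_after = os.path.getsize(path)
fh.close()

print(size_before, size_after)
```

flush() passes the data to the operating system. If the data must also survive a power loss, `os.fsync(fh.fileno())` can be called after flush() to ask the OS to commit it to the storage device.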
Stream Closing¶
A stream remains open until you close it by calling close() on the file handle, which flushes and then closes the stream.
You can call close() as many times as you want on a stream. However, calling any other operation on a closed stream raises a ValueError.
When using the try construct, put the call to close() in the finally section so it always gets called.
When using the with construct, close() is always called for you, even if there is an exception.
You can check the closed attribute to see if the stream is already closed.
For example:
>>> fh = open('bar.txt', mode='w')
>>> fh.write('Hello World')
11
>>> fh.close()
>>> fh.closed
True
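The two closing patterns described above can be sketched side by side. The scratch file and variable names are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "bar.txt")

# Pattern 1: try/finally guarantees close() runs even if write() fails.
fh = open(path, mode="w")
try:
    fh.write("Hello World")
finally:
    fh.close()

# Pattern 2: with closes the stream automatically when the block ends.
with open(path) as reader:
    text = reader.read()

closed_after_with = reader.closed

# Any further operation on the closed stream raises ValueError.
try:
    reader.read()
    error = None
except ValueError as exc:
    error = str(exc)

print(text, closed_after_with, error)
```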
Text Stream Operations¶
The text stream defines the following operations. For the examples that follow, assume the text file foo.txt has the following content:
The first line.
The second line.
The third line.
And done!
-
read(size)¶
Read and return at most size characters from the stream as a single str. If size is negative or None, reads until EOF.
>>> fh = open('foo.txt')
>>> fh.read()
'The first line.\nThe second line.\nThe third line.\nAnd done!'
>>> fh.seek(0)  # Go back to start of stream
0
>>> fh.read(6)
'The fi'
>>> fh.read(6)
'rst li'
-
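readline(size=-1)¶
Read until newline or EOF and return a single str. If size is given, at most size characters are read.
>>> fh = open('foo.txt')
>>> fh.readline()
'The first line.\n'
>>> fh.readline()
'The second line.\n'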
Tip
It’s generally better to use a for loop and iterate over the lines than to use the readline() method.
-
readlines(hint=-1)¶
Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
>>> fh = open('foo.txt')
>>> fh.readlines()
['The first line.\n', 'The second line.\n', 'The third line.\n', 'And done!']
>>> fh.seek(0)  # Go back to start of stream
0
>>> fh.readlines(30)
['The first line.\n', 'The second line.\n']
Tip
It’s generally better to use a for loop and iterate over the lines than to use the readlines() method.
-
writable()¶
Return True if the stream supports writing.
>>> fh = open('bar.txt', mode='w')
>>> fh.writable()
True
-
write(s)¶
Write the string s to the stream and return the number of characters written.
>>> fh = open('bar.txt', mode='w')
>>> fh.write('Hello World')
11
>>> fh.close()
The text file bar.txt contains the following:
Hello World
-
writelines(lines)¶
Write a list of lines to the stream. Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
>>> fh = open('bar.txt', mode='w')
>>> msg_lines = ['Hello World\n', 'Goodbye World\n']
>>> fh.writelines(msg_lines)
>>> fh.close()
The text file bar.txt contains the following:
Hello World
Goodbye World
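Because writelines() does not add line separators, strings that lack them need one appended first. A minimal sketch, assuming a hypothetical scratch file and a made-up list of messages:

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "bar.txt")

messages = ["Hello World", "Goodbye World"]  # no trailing newlines

with open(path, mode="w") as fh:
    # writelines() accepts any iterable of strings, including a generator.
    fh.writelines(message + "\n" for message in messages)

with open(path) as fh:
    result = fh.read()

print(repr(result))
```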
Text streams don’t support full random access. However, they do allow you to query the current position in the stream and return to the start of the stream.
-
seek(offset)¶
Change the stream position to the given offset. Don’t assume offset is in bytes or characters (which makes this unsuitable for true random access). Return the new absolute position as an opaque number.
>>> fh = open('foo.txt')
>>> fh.read(6)
'The fi'
>>> fh.read(6)
'rst li'
>>> fh.seek(0)  # Go back to start of stream
0
>>> fh.read(6)
'The fi'
-
tell()¶
Return the current stream position as an opaque number. The number does not usually represent a number of bytes in the underlying binary storage.
>>> fh = open('foo.txt')
>>> fh.read(6)
'The fi'
>>> fh.tell()
6
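Since text stream positions are opaque, a safe way to use seek() beyond returning to the start is to pass it a value previously obtained from tell(). The sketch below bookmarks a position and returns to it; the scratch file and variable names are assumptions for the example.

```python
import os
import tempfile

# Hypothetical scratch file holding the sample text from above.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open(path, mode="w", encoding="utf-8") as fh:
    fh.write("The first line.\nThe second line.\nThe third line.\nAnd done!")

with open(path, encoding="utf-8") as fh:
    fh.readline()            # consume the first line
    bookmark = fh.tell()     # opaque position just after the first line
    second = fh.readline()
    fh.seek(bookmark)        # return to the bookmarked position
    again = fh.readline()

print(repr(second), repr(again))
```

The example uses readline() rather than a for loop because tell() cannot be called on a text stream while it is being iterated.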
Binary Stream Operations¶
The binary stream defines the following operations. For the examples that follow, assume the binary file foo.bin has the following content:
\x10\x11\x12\x13\n\x14\x15\x16\x17
-
peek([size])¶
Return bytes from the stream without advancing the position. At most one single read on the raw stream is done to satisfy the call. The number of bytes returned may be less or more than requested.
>>> fh = open('foo.bin', mode='br')
>>> fh.peek()
b'\x10\x11\x12\x13\n\x14\x15\x16\x17'
>>> fh.peek()
b'\x10\x11\x12\x13\n\x14\x15\x16\x17'
-
read([size])
Read and return up to size bytes. If size is not given or is negative, read until EOF (or, in non-blocking mode, until the read call would block).
>>> fh = open('foo.bin', mode='br')
>>> fh.read()
b'\x10\x11\x12\x13\n\x14\x15\x16\x17'
>>> fh.read()
b''
>>> fh.seek(0)  # Go back to start of stream
0
>>> fh.read(2)
b'\x10\x11'
-
write(b)
Write the bytes-like object, b, and return the number of bytes written.
>>> fh = open('bar.bin', mode='bw')
>>> fh.write(b'\x20\x21\x22\x23')
4
>>> fh.close()
-
seek(offset[, whence])
Change the stream position to the given byte offset. offset is interpreted relative to the position indicated by whence. The default value for whence is SEEK_SET. Values for whence are:
- SEEK_SET or 0 – start of the stream (the default); offset should be zero or positive
- SEEK_CUR or 1 – current stream position; offset may be negative
- SEEK_END or 2 – end of the stream; offset is usually negative
Return the new absolute position.
>>> import io
>>> fh = open('foo.bin', mode='br')
>>> fh.seek(2)
2
>>> fh.read(2)
b'\x12\x13'
>>> fh.seek(3, io.SEEK_CUR)
7
>>> fh.read(2)
b'\x16\x17'
>>> fh.seek(-4, io.SEEK_END)
5
>>> fh.read(2)
b'\x14\x15'
-
tell()
Return the current stream position.
>>> import io
>>> fh = open('foo.bin', mode='br')
>>> fh.tell()
0
>>> fh.seek(2, io.SEEK_CUR)
2
>>> fh.tell()
2
>>> fh.read(2)
b'\x12\x13'
>>> fh.tell()
4
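Unlike text streams, binary streams support true random access. As a sketch, the following recreates the sample bytes in a hypothetical scratch file, then reads the tail and the head of the file without reading what lies between.

```python
import io
import os
import tempfile

# Hypothetical scratch file holding the sample bytes from above.
path = os.path.join(tempfile.mkdtemp(), "foo.bin")
with open(path, mode="wb") as fh:
    fh.write(b"\x10\x11\x12\x13\n\x14\x15\x16\x17")

with open(path, mode="rb") as fh:
    size = fh.seek(0, io.SEEK_END)  # seeking to the end returns the file size
    fh.seek(-4, io.SEEK_END)        # four bytes before the end
    tail = fh.read()
    fh.seek(0)                      # back to the start
    head = fh.read(4)

print(size, tail, head)
```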
Try it!
Try the following:
Create a file and write the following multi-lined message into it:
"Hello World Viva la Pluto"
Add another line to the same file without erasing the old message:
"supercalifragilisticexpialidocious"
Close the file.
Re-open the file in read mode and print the contents.