Sequence Types - Bytes¶
Indices and tables¶
Bytes¶
MUTABILITY: Immutable
The bytes type handles raw bytes to facilitate manipulating binary data. It holds a sequence of 0 or more 8-bit (i.e. range 0 to 255) unsigned integers. It is the immutable brother of the bytearray type.
Like tuples, you can’t modify individual elements of the array. If this is required, use the bytearray type.
Bytes objects can be created in a variety of ways:
As a bytes literal using the ‘b’ prefix on a single, double or triple quoted string (the same quotation support is present as for strings) and containing printable ASCII characters and/or 8-bit hex escapes . For example:
>>> foo = b'a\x25\xF0' >>> foo b'a%\xf0' >>> type(foo) bytes
Note
Python displays byte values using printable ASCII characters where possible.
Raw bytes literals, using the ‘rb’ prefix:
>>> foo = rb'a\x25\xF0' >>> foo rb'a\x25\xF0' >>> type(foo) bytes
Using the
bytes
constructor to create an empty object. For example:>>> foo = bytes() >>> foo b''
Using the
bytes
constructor to create a zero-filled (i.e. null-filled) bytes objects of specified length. For example:>>> foo = bytes(4) >>> foo b'\x00\x00\x00\x00'
Using the
bytes
constructor to populate a bytes object with an iterable of integers. For example:>>> foo = bytes((1, 3, 5)) >>> foo b'\x01\x03\x05'
Using the
bytes
constructor to encode a string into bytes. This may require encoding each character into more than one byte depending on it’s code point. For example:>>> foo = bytes('Ѱ', encoding='utf8') b'\xd1\xb0'
From regular strings, using the
str.encode()
method, which may require encoding each character into more than one byte depending on it’s code point. For example:>>> 'Ѱ'.encode() b'\xd1\xb0'
Warning
One thing that could bite you when using bytes (pun for bad humor) is that accessing a single byte returns an int object, not a bytes object. For example:
>>> foo = bytes((1, 3, 5))
>>> bar = foo[0]
>>> bar
1
>>> type(bar)
int
Bytes Specific Methods¶
You can use the same methods as the str
class as long as the byte values are encoded using ASCII. If not, you will get a ValueError
. The exceptions to this are the following methods which of the string class which don’t exist on the bytes class:
In addition to the str
methods you are allowed to use, the bytes
class has the following additional methods defined on it:
-
bytes.
decode
(encoding=”utf-8”, errors=”strict”)¶ Return a
str
decoded from the givenbytes
. Defaultencoding
is ‘utf-8’.>>> b'\xd1\xb0'.decode() 'Ѱ'
-
classmethod
bytes.
fromhex
(string)¶ Returns a
bytes
object, decoding the givenstring
object. The string must contain two hexadecimal digits per byte, with ASCII spaces being ignored.>>> bytes.fromhex('48656C6C6F') b'Hello'
-
bytes.
hex
()¶ Return a
str
object containing two hexadecimal digits for each byte in the instance.>>> b'\x48\x65\x6C\x6C\x6F'.hex() '48656c6c6f'
Try it!
Try creating the following objects:
- An empty bytes object.
- A bytes object of 32 bytes, all 0.
- A bytes object from ASCII printable characters.
- A bytes object using character escape sequences.
- A bytes object from a tuple or list, such as [32, 48, 64].
- A bytes object representing the character “ɸ” in UTF-8.
- A bytes object representing the character “ɸ” in UTF-16.
- Get the string given by the following bytes object, b’x48x65x6cx6cx6f’.