JSON Module

Introduction

It is frequently necessary to transfer information between different computing systems, which are usually running programs written by different authors in different languages. To make the transfer reliable, it is important to choose a standardized format to communicate with, so that everyone is playing by the same rules. One such standard is JSON (JavaScript Object Notation).

JSON is popular because it is easy for both humans and computers to read and write. A good description of JSON can be found here.

Python provides the json package to support sending and receiving JSON data in your application.

JSON is built around two fundamental structures:

  • A collection of name:value pairs. In Python, this maps naturally to the dict type.
  • An ordered list of values. In Python, this maps naturally to the list type.

The values that can be communicated share a common subset of the standard ones available in Python:

  • dictionaries
  • lists
  • strings
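  • numbers (integers and floats; Python also writes Inf and NaN, although these are not part of the JSON standard)
  • booleans (True and False, written as true and false in JSON)
  • None (written as null in JSON)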

On transmission, the JSON format converts each dictionary key to a string of unicode characters.

On reception, the JSON data is parsed and converted back to objects, using data types from the language of the program that is reading the data.
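
As a rough illustration of the mapping described above, the sketch below encodes one value of each kind and decodes it again. The samples dictionary and its key names are made up for this example.

```python
import json

# One Python value of each kind listed above (names are illustrative only).
samples = {
    "dictionary": {"a": 1},
    "list": [1, 2],
    "string": "text",
    "integer": 3,
    "float": 2.5,
    "boolean": True,
    "none": None,
}

encoded = json.dumps(samples)
decoded = json.loads(encoded)

# In the JSON text, True is written as true and None as null.
print(encoded)

# Each value comes back as the same Python type it started as.
for name, value in samples.items():
    assert type(decoded[name]) is type(value), name

# Python writes infinity as Infinity, which strict JSON parsers may reject.
print(json.dumps(float("inf")))
```

Tuples are deliberately left out of this sketch, because they do not survive the round trip unchanged.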

For the examples in the following sections, we will use the sample value below:

import json
val = {'key_a': 'My string',
       'key_b': ('My', 'tuple'),
       'key_c': ['My', 'list'],
       'key_d': 425.3,
       'key_e': True,
       'key_f': {'key_fa': "Another string",
                 'key_fb': 991.1341
                }
      }

Sending JSON

To send something using JSON, it must be serialized. There are two functions for this:

  • The json.dump() function, which writes directly to a file handle (any object with a write() method, such as a disk file or a file object wrapping a network socket), as in:

    with open("output.json", 'w') as fh:
        json.dump(val, fh, indent=4)
    

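    This would produce output like the following. The key order shown comes from an older Python version; since Python 3.7, keys are written in insertion order: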

    {
        "key_f": {
            "key_fb": 991.1341,
            "key_fa": "Another string"
        },
        "key_a": "My string",
        "key_c": [
            "My",
            "list"
        ],
        "key_b": [
            "My",
            "tuple"
        ],
        "key_e": true,
        "key_d": 425.3
    }
    
  • The json.dumps() function, which returns a string directly, as in:

    >>> output = json.dumps(val, indent=4)
    >>> print(output)
    {
        "key_f": {
            "key_fb": 991.1341,
            "key_fa": "Another string"
        },
        "key_a": "My string",
        "key_c": [
            "My",
            "list"
        ],
        "key_b": [
            "My",
            "tuple"
        ],
        "key_e": true,
        "key_d": 425.3
    }
    

Note

The tuple under key_b was converted to a list.
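
The relationship between the two functions can be sketched as follows. This is a minimal example that uses io.StringIO as a stand-in for a real file handle and a shortened version of the sample value; neither choice is required by the json package.

```python
import io
import json

# Shortened version of the sample value, used only for this sketch.
val = {"key_a": "My string", "key_b": ("My", "tuple")}

# io.StringIO stands in for any writable text file handle.
buffer = io.StringIO()
json.dump(val, buffer, indent=4)

# json.dumps() builds the same text and returns it as a string.
text = json.dumps(val, indent=4)

print(buffer.getvalue() == text)
```

Use json.dump() when the destination is already an open file handle, and json.dumps() when the text is needed as a value, for example to log it or embed it in a larger message.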

Receiving JSON

Data received in JSON format must be decoded back into objects that the executing software (in our case Python) can understand. There are two functions for this:

  • The json.load() function, which reads directly from a file handle (any object with a read() method, such as a disk file or a file object wrapping a network socket). For example, reading in the previously created JSON output file:

    >>> with open("output.json", 'r') as fh:
    ...    in_val = json.load(fh)
    >>> in_val
    {'key_a': 'My string',
     'key_b': ['My', 'tuple'],
     'key_c': ['My', 'list'],
     'key_d': 425.3,
     'key_e': True,
     'key_f': {'key_fa': 'Another string', 'key_fb': 991.1341}}
    

Note

The tuple under key_b was converted to a list when encoded in the earlier example.

  • The json.loads() function, which reads JSON data directly from a string, as in:

    >>> in_str = '''{
    ...     "key_f": {
    ...         "key_fb": 991.1341,
    ...         "key_fa": "Another string"
    ...     },
    ...     "key_a": "My string",
    ...     "key_c": [
    ...         "My",
    ...         "list"
    ...     ],
    ...     "key_b": [
    ...         "My",
    ...         "tuple"
    ...     ],
    ...     "key_e": true,
    ...     "key_d": 425.3
    ... }'''
    >>> val = json.loads(in_str)
    >>> val
    {'key_a': 'My string',
     'key_b': ['My', 'tuple'],
     'key_c': ['My', 'list'],
     'key_d': 425.3,
     'key_e': True,
     'key_f': {'key_fa': 'Another string', 'key_fb': 991.1341}}
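
As a small sketch of how the two decoding functions relate, the example below feeds the same JSON text to both. The json_text string is a made-up, shortened sample, and io.StringIO stands in for a real file handle.

```python
import io
import json

# Made-up JSON text for this sketch; note the lowercase JSON literal true.
json_text = '{"key_a": "My string", "key_b": ["My", "tuple"], "key_e": true}'

# io.StringIO stands in for any readable text file handle.
from_handle = json.load(io.StringIO(json_text))
from_string = json.loads(json_text)

# Both functions build the same Python objects from the same text.
print(from_handle == from_string)

# The JSON literal true is decoded to the Python value True.
print(from_string["key_e"] is True)
```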
    

JSON Gotchas

  • During encoding, all keys are converted to strings. On decoding, however, they are not converted back. This means that if you use a non-string as a key, the object constructed on the receiving side will have keys of a different type than the object on the transmitting side.

    For example, using integers as keys:

    >>> import json
    >>> my_dict = {0: 'Zero',
    ...            1: 'One'}
    >>> encoded_data = json.dumps(my_dict, indent=4)
    >>> print(encoded_data)
    {
        "0": "Zero",
        "1": "One"
    }
    

    Regenerating the original object:

    >>> regenerated_object = json.loads(encoded_data)
    >>> print(regenerated_object)
    {'1': 'One', '0': 'Zero'}
    

    The keys are now strings, not integers.

    Either structure your data so that it does not rely on non-string keys, or convert the keys back to their original type after decoding.

  • Out of the box, JSON does not support storing tuples. Tuples get converted to lists and decoded back to lists (and not tuples).

  • Types that are not native to the JSON format, such as custom classes, for the most part can’t be transmitted. However, the user can do one of the following to work around this:

    1. Create a new type that derives from the json.JSONEncoder class and pass it to the json.dump() or json.dumps() function via the cls argument.
    2. Create a new function that understands how to serialize objects that otherwise can’t be serialized, and pass it to the json.dump() or json.dumps() function via the default argument.
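
The workarounds above might look like the following sketch. The Point class, the encode_point function and the PointEncoder class are hypothetical names invented for this example; only the cls and default arguments and the json.JSONEncoder.default() hook come from the json package. The last lines show one possible way to restore integer keys after decoding.

```python
import json


class Point:
    """Hypothetical custom class, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def encode_point(obj):
    """Fallback passed via default=; called only for objects json cannot serialize."""
    if isinstance(obj, Point):
        return {"x": obj.x, "y": obj.y}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


class PointEncoder(json.JSONEncoder):
    """The same conversion, packaged as an encoder class for the cls argument."""

    def default(self, obj):
        if isinstance(obj, Point):
            return {"x": obj.x, "y": obj.y}
        # Let the base class raise TypeError for anything else.
        return super().default(obj)


via_default = json.dumps({"p": Point(1, 2)}, default=encode_point)
via_cls = json.dumps({"p": Point(1, 2)}, cls=PointEncoder)
print(via_default)
print(via_cls)

# Integer keys come back as strings; convert them explicitly after decoding.
decoded = json.loads(json.dumps({0: "Zero", 1: "One"}))
restored = {int(key): value for key, value in decoded.items()}
print(restored)
```

Both approaches only affect encoding. The receiving side still gets a plain dictionary, so rebuilding a Point (or a tuple) from the decoded data remains the responsibility of the receiving program.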