Comprehensions
==============

.. toctree::
   :maxdepth: 1

Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

Definition
----------

A comprehension is a construct that loops over a source iterable in a way that allows you to filter out source items that don't match a condition and generate new items from the source items that aren't discarded.

Comprehensions implement the map-filter-reduce idiom, which previously was done using separate functions:

* :py:func:`map`:

  - Takes a function and an iterable and calls the function on each item in the iterable to produce new items.

* :py:func:`filter`:

  - Takes a function and an iterable and calls the function on each item in the iterable. Only those items where the function returns true are returned.

* :py:func:`~functools.reduce`:

  - Takes a two-argument function and an iterable and applies the function cumulatively to the items, reducing the iterable down to a single value.

However, comprehensions are a more succinct way of executing this idiom. The general scheme for creating a comprehension is shown below.

``result_expression ["for" item_list "in" iterable ["if" test_expression]* ]+``

From the scheme:

* :py:keyword:`for` and :py:keyword:`if` constructs can be nested, and the nesting runs from left to right (i.e. :py:keyword:`for` loops to the right are nested more deeply than those to the left).
* On each iteration of the loop, one value is extracted from ``iterable`` for each variable given in ``item_list``.

Comprehensions execute in their own scope, so the loop variables named in ``item_list`` are not available outside the comprehension.

.. _section_heading-List_Comprehension:

List Comprehension
------------------

First and foremost, a list comprehension produces ... a list. Shocking.
It is a comprehension expression enclosed in ``[]`` brackets, which using the above syntax for comprehensions takes on the following general form:

``[expression for item_list in iterable]``

or

``[expression for item_list in iterable if condition]``

Let's take a simple example of incrementing the value of all items in the list (i.e. map) without using a list comprehension:

>>> old_vals = [100, 200, 300, 400, 500, 600, 700]
>>> new_vals = []
>>> for item in old_vals:
...     new_vals.append(item + 100)
>>> print(new_vals)
[200, 300, 400, 500, 600, 700, 800]

Collapsing the loop allows you to express this as a list comprehension:

>>> old_vals = [100, 200, 300, 400, 500, 600, 700]
>>> new_vals = [item + 100 for item in old_vals]
>>> print(new_vals)
[200, 300, 400, 500, 600, 700, 800]

Now, let's say you want to limit (i.e. filter) which values are processed so that only values <= 500 are processed:

>>> old_vals = [100, 200, 300, 400, 500, 600, 700]
>>> new_vals = [item + 100 for item in old_vals if item <= 500]
>>> print(new_vals)
[200, 300, 400, 500, 600]

Taking this to a slightly larger scale, let's say you wanted to get all the even (x,y) coordinate pairs from 0 up to (but not including) 8 in the x and y direction. You could write this as:

>>> even_coords_a = []
>>> for x in range(0, 8):
...     for y in range(0, 8):
...         if x % 2 == 0 and y % 2 == 0:
...             even_coords_a.append((x, y))
>>> even_coords_a
[(0, 0), (0, 2), (0, 4), (0, 6), (2, 0), (2, 2), (2, 4), (2, 6), (4, 0), (4, 2), (4, 4), (4, 6), (6, 0), (6, 2), (6, 4), (6, 6)]

But, you can collapse the loops and conditions into a single line as follows:

>>> even_coords_b = [(x, y) for x in range(0, 8) if (x % 2 == 0) for y in range(0, 8) if (y % 2 == 0)]
>>> even_coords_b
[(0, 0), (0, 2), (0, 4), (0, 6), (2, 0), (2, 2), (2, 4), (2, 6), (4, 0), (4, 2), (4, 4), (4, 6), (6, 0), (6, 2), (6, 4), (6, 6)]

You can even fully nest the condition, as follows:

>>> even_coords_c = [(x, y) for x in range(0, 8) for y in range(0, 8) if (x % 2 == 0 and y % 2 == 0)]
>>> even_coords_c
[(0, 0), (0, 2), (0, 4), (0, 6), (2, 0), (2, 2), (2, 4), (2, 6), (4, 0), (4, 2), (4, 4), (4, 6), (6, 0), (6, 2), (6, 4), (6, 6)]

It is even possible to extract more than one value at a time from the source iterable. Let's suppose you want to transpose that list of coordinates so instead of (x,y) it is (y,x):

>>> transposed_coords = [(y, x) for x, y in even_coords_c]
>>> transposed_coords
[(0, 0), (2, 0), (4, 0), (6, 0), (0, 2), (2, 2), (4, 2), (6, 2), (0, 4), (2, 4), (4, 4), (6, 4), (0, 6), (2, 6), (4, 6), (6, 6)]

.. tip::

   To make sense of the nesting in a list comprehension, it is sometimes useful to unroll it (mentally or in actuality). To unroll a list comprehension, put each ``for`` and ``if`` clause on its own line from left to right, end each clause with a colon, indent each clause one level deeper than the one before it, and use ``append`` in the inner-most block to add the result expression to the list being constructed.

A final example is a real one taken from some WFLOW code during code review. Consider reading lines that look like the following:

``"[0..1][4..7][2048]"``

The task was to break out all the integers in the string. Using string manipulation and loops, you can break out the integers with the following code:

>>> bin_string = "[0..1][4..7][2048]"
>>> bins = []
>>> for a in bin_string.split("]["):
...     for b in a.strip("[]").split(".."):
...         bins.append(int(b))
>>> print(bins)
[0, 1, 4, 7, 2048]

Or as a list comprehension:

>>> bin_string = "[0..1][4..7][2048]"
>>> bins = [int(b) for a in bin_string.split("][") for b in a.strip("[]").split("..")]
>>> print(bins)
[0, 1, 4, 7, 2048]

There is no right or wrong way to do it. Either is acceptable.

.. _section_heading-Set_Comprehension:

Set Comprehension
-----------------

Similar to a list comprehension, but produces ... a set. I know, mind blown.

It is a comprehension expression enclosed in ``{}`` brackets, which using the above syntax for comprehensions takes on a form similar to a list:

``{expression for item_list in iterable}``

or

``{expression for item_list in iterable if condition}``

For example, given a list of names, extract the first character of each name whose last name starts with "m":

>>> names = ["Lynne Bowers", "Annie Morrison", "Forrest Hart", "Joann Nunez", "Sarah Morgan", "Randolph Merry", "Terri Oliver", "Herman Miles", "Curtis Townsend", "Judith Thompson", "Tony Haynes", "Bobby Newman", "Bradford Briggs", "Howard Mcdaniel", "Lena Beck", "Marguerite Howard", "Tara Scott", "Regina Maxwell", "Terry Mendez", "John Lamb"]
>>> {name[0] for name in names if name.split(" ")[1].lower().startswith("m")}
{'A', 'H', 'R', 'S', 'T'}

As this is a set, there is only one entry to cover both "Herman Miles" and "Howard Mcdaniel".

.. _section_heading-Dict_Comprehension:

Dict Comprehension
------------------

Similar to list and set comprehensions, but produces ... a dictionary. The surprise is gone, right?

It is a comprehension expression enclosed in ``{}`` brackets, which using the above syntax for comprehensions takes on a form similar to a list and set comprehension except that a dict comprehension has two expressions, separated by a ``:``, prior to the ``for``/``if`` clause.
``{key_expression:val_expression for item_list in iterable}``

or

``{key_expression:val_expression for item_list in iterable if condition}``

A simple example of mapping each index to its uppercase letter in a dict:

>>> {idx:chr(idx+65) for idx in range(0, 26)}
{0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E', 5: 'F', 6: 'G', 7: 'H', 8: 'I', 9: 'J', 10: 'K', 11: 'L', 12: 'M', 13: 'N', 14: 'O', 15: 'P', 16: 'Q', 17: 'R', 18: 'S', 19: 'T', 20: 'U', 21: 'V', 22: 'W', 23: 'X', 24: 'Y', 25: 'Z'}

It is common to extract multiple values from the source iterable when dealing with dict comprehensions. For example:

>>> {key:val**2 for key,val in [(1,2), (3,4)]}
{1: 4, 3: 16}

The same example using a :py:class:`dict` as the source iterable. Iterating over a dict directly yields only its keys, so use :py:meth:`dict.items` to get the key/value pairs:

>>> src_dict = {1:2, 3:4}
>>> {key:val**2 for key,val in src_dict.items()}
{1: 4, 3: 16}

.. _section_heading-Reduce_With_Comprehensions:

Reduce with Comprehensions
--------------------------

In truth, there is no direct support for the reduce function in comprehensions. However, Python supplies several built-in reducing functions that work on the output from comprehensions:

* :py:func:`all` - Return :py:obj:`True` if all elements of the iterable are true (or if the iterable is empty). For example::

    >>> all([True, True, True])
    True
    >>> all([True, False, True])
    False

* :py:func:`any` - Return :py:obj:`True` if any element of the iterable is true. If the iterable is empty, return :py:obj:`False`. For example::

    >>> any([True, True, True])
    True
    >>> any([True, False, True])
    True

* :py:func:`max` - Return the largest item in an iterable or the largest of two or more arguments. For example::

    >>> max([1, 245, 134, 90824, 829, 2719])
    90824
    >>> max(8345, 246, 13981, 4691)
    13981

* :py:func:`min` - Return the smallest item in an iterable or the smallest of two or more arguments. For example::

    >>> min([1, 245, 134, 90824, 829, 2719])
    1
    >>> min(8345, 246, 13981, 4691)
    246

* :py:func:`sum` - Sums the items of an iterable from left to right and returns the total. Can take an optional start value that defaults to 0. For example::

    >>> sum([1, 245, 134, 90824, 829, 2719])
    94752
    >>> sum([1, 245, 134, 90824, 829, 2719], 10000)
    104752

.. admonition:: Try it!
   :class: TryIt

   Try the following:

   * Create a list comprehension that makes a list of the first character of the following words:

     * Dog
     * Apple
     * Pine
     * Coal

   * Augment the list comprehension above so that it only includes words 4 characters long.
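
The reducers listed above are shown operating on literal lists. As a minimal sketch of how they combine with a comprehension to complete the map-filter-reduce idiom, the example below reuses ``old_vals`` from the list comprehension section. The names ``total``, ``has_large``, ``all_positive`` and ``product`` are chosen here purely for illustration, and the use of :py:func:`~functools.reduce` for a product is just one possible way to handle a reduction that has no dedicated built-in.

```python
from functools import reduce

old_vals = [100, 200, 300, 400, 500, 600, 700]

# Map (item + 100) and filter (item <= 500) in the comprehension,
# then reduce the resulting list to a single value with sum().
total = sum([item + 100 for item in old_vals if item <= 500])

# any() and all() reduce a list of booleans built by a comprehension.
has_large = any([item > 600 for item in old_vals])
all_positive = all([item > 0 for item in old_vals])

# functools.reduce() applies a two-argument function cumulatively;
# here it multiplies the numbers 1 through 5 together.
product = reduce(lambda acc, n: acc * n, [n for n in range(1, 6)])

print(total, has_large, all_positive, product)
```

In each case the comprehension produces an ordinary list first, and the reducer then collapses that list to one value.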