Modules and Packages

Introduction

Python defines the following two containers for code reuse:

  • Modules, which group related functions together.
  • Packages, which group related modules (and additional sub-packages) together in a named hierarchy.

Note

In reality, Python only has one object to represent organizational units of code, which is the module object. As a result, all packages are actually modules (but not all modules are packages).

Packages and modules can come from more than just the filesystem; they can reside in zip files, a database, an HTTP endpoint on the network, the Windows registry, etc. In other words, any location you can search and access.
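As a small illustration of a non-directory source, the sketch below imports a module straight out of a zip archive. The archive name bundle.zip and the module name zipped_mod are made up for this example; it relies only on the standard library's built-in support for zip archives on sys.path.

```python
import os
import sys
import tempfile
import zipfile

# Build a zip archive holding one tiny module (zipped_mod is a made-up name).
tmp_dir = tempfile.mkdtemp()
archive = os.path.join(tmp_dir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_mod.py", "GREETING = 'Hi from inside a zip file'\n")

# Putting the archive itself on sys.path lets the zip importer find the module.
sys.path.insert(0, archive)
import zipped_mod

print(zipped_mod.GREETING)
print(zipped_mod.__file__)   # a path that points inside bundle.zip
```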

Tip

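The names of built-in modules/packages are typically all lower-case. The examples here capitalize the first letter of custom modules/packages to set them apart, although the PEP 8 style guide recommends short, all-lowercase names for modules and packages in general.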

Modules

Modules are typically source code that is found in a .py file.

Programs are also source code that is found in a .py file.

Sounds like the same thing…

The distinction is in how each type is used:

  • A module is designed to be imported into another program, whereas,
  • A program is meant to be run standalone.
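A single .py file can often serve both roles. The sketch below uses the common __name__ check for this; the function names greet and main are illustrative only and do not come from the course material.

```python
"""A file that can be imported as a module or run as a program."""


def greet(name):
    """Return a greeting; other code can import and call this."""
    return "Hi, " + name


def main():
    # Only meant to run when the file is executed directly.
    print(greet("world"))


# __name__ is "__main__" when the file is run standalone,
# and the module's own name when it is imported.
if __name__ == "__main__":
    main()
```

When the file is imported, only the definitions are created; main() is not called.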

Defining a module is covered in Program Structure

Modules can contain a docstring which describes the module. The docstring contents are placed in the __doc__ attribute of the module object that is created. For example, given the following code in My_module.py:

"""The docstring for My_module
"""

# Print a statement when the module is first imported.
print("Hi from " + __name__)

# Define a function with the same name as the module.
def my_module_fn():
    print("Hi from my_module_fn")

Importing it will populate the __doc__ attribute as:

>>> import My_module
Hi from My_module
>>> My_module.__doc__
'The docstring for My_module\n'

Packages

Python defines two types of packages:

  • Regular packages: typically implemented as a directory containing modules, and optionally, additional sub-packages (as sub-directories).
  • Namespace packages: a package where the portions of the entire package are spread out and reside in different locations. These will not be discussed in this course.

Packages can contain modules and sub-packages. The named hierarchy of a package uses the . separator to separate the child modules and sub-packages from the parent packages.

Python identifies a directory as a regular package if it contains an __init__.py file.

Code in the package's __init__.py file is executed when the package is imported for the first time. Any names defined in it are bound in the package's namespace (i.e. under the package name).

The __init__.py file can contain any of the following:

  • Nothing (i.e. empty), in which case nothing extra happens when the package is imported.
  • A docstring which describes the package (the docstring contents are placed in the __doc__ attribute of the module object that is created).
  • An attribute called __all__, which is a list of the string names of sub-packages and modules to import when the from package import * form of the import statement is used.
  • Code to initialize names and other parts of the package when it is first imported.
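The effect of an __init__.py file can be seen with a throwaway package. This is a minimal sketch; the package name Demo_pkg and the name NAME_FROM_DEMO_PKG are made up for the example, and the package is written to a temporary directory.

```python
import os
import sys
import tempfile

# Create a throwaway package on disk (Demo_pkg is a made-up name).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "Demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write('"""The docstring for Demo_pkg"""\n')
    f.write("NAME_FROM_DEMO_PKG = False\n")

# Make the parent directory searchable, then import the package.
sys.path.insert(0, root)
import Demo_pkg   # runs Demo_pkg/__init__.py once

print(Demo_pkg.__doc__)              # the docstring from __init__.py
print(Demo_pkg.NAME_FROM_DEMO_PKG)   # a name bound in the package namespace
```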

The module object created when a package is imported has a __path__ attribute defined in it, whereas a regular module object has no such __path__ attribute.

  • The __path__ attribute provides a list of paths to search for modules and sub-packages.
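A quick way to see the difference is to compare a standard-library package with a plain module. The sketch below assumes a standard CPython layout, where json is a package and string is a single-file module.

```python
import json      # a package in the standard library
import string    # a plain module (a single string.py file)
import types

for mod in (json, string):
    is_module = isinstance(mod, types.ModuleType)   # True for both
    is_package = hasattr(mod, "__path__")           # True only for packages
    print(mod.__name__, is_module, is_package)
```

Both objects are module objects; only the package carries a __path__.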

Importing

You gain access to the code in a module or package by importing it using the import machinery. This machinery is quite sophisticated and extensible. However, the simplest and most canonical way is by using the import statement. Just keep in mind there are other ways.

Tip

It is customary, but not required, to put all import statements at the top of the file with standard modules first, then 3rd party modules and then your own custom modules.

The import system takes the fully qualified name of the module or package (i.e. a hierarchical dotted path to the item) being imported. For example:

pkg_a.pkg_b.module_b0

When a module or package is imported, the import system searches for, and then loads, the module or package (including any of the parent packages listed).
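The loading of parent packages can be observed with importlib.import_module(), one of the other ways of reaching the import machinery. This sketch uses the standard-library module xml.dom.minidom purely as a convenient nested example.

```python
import importlib
import sys

# xml.dom.minidom is a standard-library module nested two packages deep.
module = importlib.import_module("xml.dom.minidom")

# Loading the leaf also loads every parent package along the dotted path.
for name in ("xml", "xml.dom", "xml.dom.minidom"):
    print(name, name in sys.modules)
```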

The search proceeds through the following steps, in order, until the module / package is found:

  • The module cache (sys.modules) is checked for previously imported modules/packages and their parents. If present, the cached module is used.

  • The modules built into the interpreter (a.k.a. ‘built-ins’) are searched.

  • File system paths on sys.path are searched for the source code file or package directory. If present, the module / package is loaded. The sys.path name is initialized as follows:

    • The directory containing the input script (or the current directory when no file is specified)
    • The list of directory names given in the PYTHONPATH environment variable
    • An installation-dependent default set of values, which typically gives the locations of the standard library modules / packages available to all users, as well as 3rd party modules that are only available to the current user.
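Each of the places searched above can be inspected from the sys module. A minimal sketch (the output depends on the installation, so none is shown):

```python
import sys

# Step 1: the module cache, a dict mapping names to loaded module objects.
print("sys" in sys.modules)

# Step 2: names of the modules compiled into the interpreter itself.
print("sys" in sys.builtin_module_names)

# Step 3: the directories (and archives) searched for everything else.
for entry in sys.path:
    print(repr(entry))
```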

When the package or module is found, the import system binds a name, in the local scope where the import statement is executed, that points to the item being imported.

If a module or package can’t be found during import, Python raises an ImportError exception (in Python 3.6 and later, specifically its subclass ModuleNotFoundError).
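A failed import can be handled like any other exception. In this sketch the module name no_such_module_hopefully is made up and assumed not to be installed.

```python
try:
    import no_such_module_hopefully   # made-up name, assumed not installed
except ImportError as exc:
    # ModuleNotFoundError is a subclass of ImportError, so this catches it.
    missing = exc.name
    print("Could not import:", missing)
```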

Once a package / module is loaded ANYWHERE in the source code, all subsequent imports use that module/package since the search will be satisfied by the module cache. This means you can’t reload or refresh a module / package just by importing it a second time; you must use importlib.reload().
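The caching behaviour can be demonstrated with a temporary module whose source changes between imports. This is a sketch only; the module name reload_demo is made up, and bytecode caching is switched off so the demo does not depend on file timestamps.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep the demo free of cached .pyc files

root = tempfile.mkdtemp()
path = os.path.join(root, "reload_demo.py")   # made-up module name
with open(path, "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, root)
import reload_demo
first = reload_demo.VALUE

# Change the source on disk, then import again: the cached module is reused.
with open(path, "w") as f:
    f.write("VALUE = 'changed'\n")
import reload_demo
after_second_import = reload_demo.VALUE

# reload() re-executes the module's code in the existing module object.
importlib.reload(reload_demo)
after_reload = reload_demo.VALUE

print(first, after_second_import, after_reload)
```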

Import Examples

The import system offers numerous ways to specify what to import. The full rules are given in the import syntax. Using the following example package hierarchy we can look at some common examples of its use:

My_module.py --> A module not part of a package
Pkg_top/
    __init__.py
    Module_top.py
    Sub_pkg_a/
        __init__.py
        Module_a0.py
        Module_a1.py
    Sub_pkg_b/
        __init__.py
        Module_b0.py
        Module_b1.py
    Sub_pkg_c/
        __init__.py
        Module_c0.py
        Module_c1.py

The above layout has the following pieces:

  • A standalone module (My_module) which is not part of any package.

  • A top-level package (Pkg_top) which is the start of the named hierarchy.

  • Inside the top-level package there are 3 sub-packages (Sub_pkg_a, Sub_pkg_b, Sub_pkg_c).

  • At all levels of the hierarchy, there are __init__.py files to identify the directories as being Python packages. All the __init__.py files use the same general format and declarations inside. For example:

    """The docstring for pkg_a
    """
    
    # Print a statement when the package is first imported.
    print("Hi from " + __name__)
    
    # A name defined and initialized by the package
    NAME_FROM_SUB_PKG_A = False
    
  • At all levels of the hierarchy, there are .py files, which are the importable modules. The importable modules all use the same general format and declarations inside. For example:

    """The docstring for My_module
    """
    
    # Print a statement when the module is first imported.
    print("Hi from " + __name__)
    
    # Define a function with the same name as the module.
    def my_module_fn():
        print("Hi from my_module_fn")
    

Importing a module directly

>>> import My_module
Hi from My_module
>>> dir()
['In',
 'My_module',
 ...,
 'quit']
>>> dir(My_module)
['__builtins__',
 ...,
 '__spec__',
 'my_module_fn']
>>> My_module.my_module_fn()
Hi from my_module_fn

Importing a top-level package directly

>>> import Pkg_top
Hi from Pkg_top
>>> dir()
['In',
 ...,
 'Pkg_top',
 'quit']
>>> dir(Pkg_top)
['NAME_FROM_PKG_TOP',
 '__all__',
 ...,
 '__spec__']

Importing a module from a package

>>> import Pkg_top.Module_top
Hi from Pkg_top
Hi from Pkg_top.Module_top
>>> dir()
['In',
 'Out',
 'Pkg_top',
 ...,
 'quit']
>>> dir(Pkg_top)
['Module_top',
 'NAME_FROM_PKG_TOP',
 '__all__',
 ...,
 '__spec__']
>>> dir(Pkg_top.Module_top)
['__builtins__',
 ...,
 '__spec__',
 'module_top_fn']
>>> Pkg_top.Module_top.module_top_fn()
Hi from module_top_fn

Importing a sub-package or sub-module directly

>>> import Pkg_top.Sub_pkg_a
Hi from Pkg_top
Hi from Pkg_top.Sub_pkg_a
>>> dir()
['In',
 'Out',
 'Pkg_top',
 ...,
 'quit']
>>> dir(Pkg_top)
['NAME_FROM_PKG_TOP',
 'Sub_pkg_a',
 ...,
 '__spec__']
>>> dir(Pkg_top.Sub_pkg_a)
['NAME_FROM_SUB_PKG_A',
 '__builtins__',
 ...,
 '__spec__']
>>> import Pkg_top.Sub_pkg_a.Module_a0
Hi from Pkg_top
Hi from Pkg_top.Sub_pkg_a
Hi from Pkg_top.Sub_pkg_a.Module_a0
>>> dir()
['In',
 'Out',
 'Pkg_top',
 ...,
 'quit']
>>> dir(Pkg_top)
['NAME_FROM_PKG_TOP',
 'Sub_pkg_a',
 ...,
 '__spec__']
>>> dir(Pkg_top.Sub_pkg_a)
['Module_a0',
 'NAME_FROM_SUB_PKG_A',
 ...,
 '__spec__']
>>> dir(Pkg_top.Sub_pkg_a.Module_a0)
['__builtins__',
 ...,
 '__spec__',
 'module_a0_fn']
>>> Pkg_top.Sub_pkg_a.Module_a0.module_a0_fn()
Hi from module_a0_fn

Importing a module into the local namespace

>>> from Pkg_top.Sub_pkg_a import Module_a0
Hi from Pkg_top
Hi from Pkg_top.Sub_pkg_a
Hi from Pkg_top.Sub_pkg_a.Module_a0
>>> dir()
['In',
 'Module_a0',
 ...,
 'quit']
>>> dir(Module_a0)
['__builtins__',
 '__cached__',
 ...,
 'module_a0_fn']
>>> Module_a0.module_a0_fn()
Hi from module_a0_fn

The imported module becomes directly available in local scope without needing to be qualified by the preceding package name(s).

Note

Using this syntax, names in preceding packages are NOT imported.
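The difference in which name gets bound can be checked by running each form of import inside its own function and listing the function's local names. The sketch below uses the standard-library module xml.dom.minidom only as a convenient nested example.

```python
import sys


def plain_import():
    import xml.dom.minidom          # binds only the top-level package name
    return sorted(locals())


def from_import():
    from xml.dom import minidom     # binds only the imported module
    return sorted(locals())


print(plain_import())   # ['xml']
print(from_import())    # ['minidom']

# The parent packages are still loaded, they are just not bound locally.
print("xml.dom" in sys.modules)
```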

Importing All Names

Using from package import * or from module import * imports all public names. Public names are as follows:

  • If __all__ is defined as a sequence of strings, the strings are considered to be the public API names to import, otherwise,
  • All names defined in the package / module that don’t begin with an underscore _.

For example, recall what Pkg_top/__init__.py looks like:

"""The docstring for pkg_top
"""

# Print a statement when the package is first imported.
print("Hi from " + __name__)

# A name defined and initialized by the package
NAME_FROM_PKG_TOP = False

__all__ = ['Sub_pkg_a', 'Sub_pkg_b', 'Sub_pkg_c']

Importing all names from the package then gives:

>>> from Pkg_top import *
Hi from Pkg_top
Hi from Pkg_top.Sub_pkg_a
Hi from Pkg_top.Sub_pkg_b
Hi from Pkg_top.Sub_pkg_c
>>> dir()
['In',
 'Out',
 'Sub_pkg_a',
 'Sub_pkg_b',
 'Sub_pkg_c',
 ...,
 'quit']

Notice:

  1. NAME_FROM_PKG_TOP was not included in __all__ so it wasn’t automatically imported!
  2. There is no Pkg_top name in the local scope. The sub-packages were brought directly into the local scope.
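Both rules for public names can be checked side by side with two throwaway modules, one that defines __all__ and one that does not. This is a sketch under stated assumptions: the module names with_all and without_all and the helper star_import are made up, and exec() is used only so that each star import lands in a fresh namespace that can be inspected.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
with open(os.path.join(root, "with_all.py"), "w") as f:       # made-up module
    f.write("__all__ = ['listed']\nlisted = 1\nunlisted = 2\n_hidden = 3\n")
with open(os.path.join(root, "without_all.py"), "w") as f:    # made-up module
    f.write("listed = 1\nunlisted = 2\n_hidden = 3\n")
sys.path.insert(0, root)


def star_import(module_name):
    """Run 'from module_name import *' in a fresh namespace; return new names."""
    namespace = {}
    exec("from " + module_name + " import *", namespace)
    # exec() adds __builtins__ to the namespace; leave it out of the result.
    return sorted(n for n in namespace if n != "__builtins__")


print(star_import("with_all"))      # ['listed']
print(star_import("without_all"))   # ['listed', 'unlisted']
```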

Tip

Using import * is generally frowned upon because you end up importing a set of unknown names, which dirties up your namespace, possibly re-defining names you already had. For example, importing os.path using * would bring in ~40 names into the local namespace.

Changing the Name Bound to the Import Item

Use the as syntax to change the name referencing the imported module or package:

import pkg_name as pkg

from pkg_name import module as mod

This can be used to avoid name conflicts with existing code and to shorten otherwise long names (i.e. to type less).

For example:

>>> import My_module as FOO
Hi from My_module
>>> dir()
['FOO',
 'In',
 ...,
 'quit']
>>> dir(FOO)
['__builtins__',
 '__cached__',
 ...,
 'my_module_fn']
>>> FOO.my_module_fn()
Hi from my_module_fn
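The same syntax works with standard-library modules. A short sketch, where the short names js and osp are arbitrary choices for the example:

```python
import json as js            # bind the module to a shorter name
from os import path as osp   # rename a name pulled in with from ... import

print(js.dumps({"a": 1}))
print(osp.basename("/tmp/example.txt"))
```

Only the new names (js and osp) are bound; json and path are not.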

Relative imports

Use 1 or more leading .’s to access other packages elsewhere in the package hierarchy. A single leading . means the current package, and each additional . moves up one level to the parent package:

from "."+ import pkg_name

from "."+ import module

For example, consider the module Pkg_top.Sub_pkg_c.relative_import with the following import statements in it:

from . import Module_c0
from .. import Sub_pkg_a
from .. import Module_top

Importing relative_import would result in the following:

>>> from Pkg_top.Sub_pkg_c import relative_import
Hi from Pkg_top
Hi from Pkg_top.Sub_pkg_c
Hi from Pkg_top.Sub_pkg_c.Module_c0
Hi from Pkg_top.Sub_pkg_a
Hi from Pkg_top.Module_top
>>> dir()
['In',
 ...,
 'quit',
 'relative_import']
>>> dir(relative_import)
['Module_c0',
 'Module_top',
 'Sub_pkg_a',
 ...,
 '__spec__']

Tip

Relative imports are convenient because you don’t have to know, or type out, the names of the intervening packages to get at other packages in the hierarchy.

Warning

Keep in mind, you can only use relative imports WITHIN a package (the parent packages have to already be imported)! Using them in a top-level program to import a package at the same level as the program will not work.
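Both halves of this warning can be reproduced with a small temporary layout. In this sketch the names Rel_pkg, helper, user and top_script are all made up; the top-level script is run in a separate interpreter so that its failure can be captured.

```python
import os
import subprocess
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "Rel_pkg")        # made-up package name
os.mkdir(pkg)


def write(path, text):
    """Create a small source file with the given contents."""
    with open(path, "w") as f:
        f.write(text)


write(os.path.join(pkg, "__init__.py"), "")
write(os.path.join(pkg, "helper.py"), "ANSWER = 42\n")
# Inside the package, a leading dot refers to the current package.
write(os.path.join(pkg, "user.py"), "from . import helper\n")
# A top-level script has no parent package, so the same statement fails.
script = os.path.join(root, "top_script.py")
write(script, "from . import Rel_pkg\n")

sys.path.insert(0, root)
from Rel_pkg import user
print(user.helper.ANSWER)

result = subprocess.run([sys.executable, script],
                        capture_output=True, text=True)
print(result.returncode != 0, "ImportError" in result.stderr)
```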

Site Packages

The site-packages directory is where Python places 3rd party packages that have been installed.

It resides as part of Python’s distribution directory unless you are working in a virtual environment.

It is part of the module search path by default so it is checked during package import.

You can add new modules using pip, your OS package manager or a standalone installer.

On Linux, the locations of the site-packages directories are typically as follows (the exact paths vary by distribution and Python version):

  • All users:

    /usr/lib/python3/site-packages

    /usr/lib64/python3/site-packages

  • Local user:

    ~/.local/lib/python3/site-packages
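Because these locations differ between systems, it can be easier to ask the interpreter itself. A minimal sketch using the standard sysconfig and site modules (the output depends on the installation, so none is shown):

```python
import site
import sysconfig

# Where this interpreter installs pure-Python third-party packages.
print(sysconfig.get_paths()["purelib"])

# The per-user site-packages directory for the current user.
print(site.getusersitepackages())
```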

Pip

The pip program pulls packages from a package index and places them in the site-packages directory. The pip program takes care of resolving any additional dependencies that are required by the package being installed.

By default pip fetches from the Python Package Index (PyPI), but it can be configured to fetch from other indexes or from a version control system.

Pre-built packages are in the wheel format, which is essentially a zip archive of the package contents.

Note

pip uses the term package as a synonym for a distribution (which is a collection of released software to be installed).

pip comes bundled with Python as of Python v3.4. In previous versions it had to be bootstrapped in.

Warning

On systems with both Python 2 and Python 3 installed, there are typically 2 versions of the pip program:

  • pip - Used with Python2
  • pip3 - Used with Python3

Be sure to use pip3 with Python3.
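One way to avoid picking the wrong pip is to run it as a module of a specific interpreter (python3 -m pip). The sketch below builds such a command for the interpreter that is currently running; the helper name pip_command is made up for the example, and it assumes pip is installed for that interpreter.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip command line tied to the running interpreter."""
    return [sys.executable, "-m", "pip"] + list(args)


# Ask pip which Python it belongs to; check=False so that a missing pip
# is reported in the output rather than raising an exception.
result = subprocess.run(pip_command("--version"),
                        capture_output=True, text=True, check=False)
print(result.stdout or result.stderr)
```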

On Linux, you will need root access to install for all users. To install for just the current user, use the --user switch.

Below are some common uses (remove --user to install for all users):

  1. Install any version of package

pip3 install --user pkg_name

  2. Install package from a range of acceptable versions

pip3 install --user 'pkg_name>=verA,<=verB'

Note

Quotes are required around the pkg_name/version pair when a version specifier is used, so that the shell does not interpret characters such as < and > as redirection.

  3. Install specific version of package

pip3 install --user 'pkg_name==ver'

  4. Install a manually downloaded wheel file

pip3 install --user wheel_file.whl

  5. Install all packages contained in a list of requirements

If a text file of packages (and optionally their required versions) is created that captures all the packages required to run your app, pip can fetch them all automatically. For example, consider the following requirements.txt file:

pytest == 3.4.1   # Need this specific version
requests > 2.10   # Anything after 2.10 is ok

This file can be passed to pip:

pip3 install --user -r requirements.txt

This causes pip to install the listed packages and any additional dependencies:

Collecting pytest==3.4.1 (from -r requirements.txt (line 1))
  Downloading pytest-3.4.1-py2.py3-none-any.whl (188kB)
    100% |████████████████████████████████| 194kB 1.7MB/s
Collecting requests>2.10 (from -r requirements.txt (line 2))
  Downloading requests-2.18.4-py2.py3-none-any.whl (88kB)
    100% |████████████████████████████████| 92kB 2.7MB/s
Collecting py>=1.5.0 (from pytest==3.4.1->-r requirements.txt (line 1))
  Downloading py-1.5.2-py2.py3-none-any.whl (88kB)
    100% |████████████████████████████████| 92kB 3.0MB/s

  6. Generate a requirements file based on installed site-packages:

pip3 freeze --user > requirements.txt
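A similar list can be produced from inside Python with the standard importlib.metadata module (Python 3.8 and later). This is a rough sketch, not a replacement for pip freeze: it lists every distribution visible to the running interpreter rather than only user-installed ones, and the helper name freeze_lines is made up for the example.

```python
from importlib import metadata   # standard library in Python 3.8+


def freeze_lines():
    """Return sorted 'name==version' lines for the visible distributions."""
    lines = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:   # skip entries whose metadata has no name
            lines.append("{}=={}".format(name, dist.version))
    return sorted(lines)


for line in freeze_lines():
    print(line)
```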

  7. Determine which packages are outdated

pip3 list --outdated

  8. Upgrading packages

pip3 install --user --upgrade pkg_name

  9. Upgrading pip

pip3 install --user --upgrade pip

  10. Remove a package (pip uninstall does not take the --user switch)

pip3 uninstall pkg_name

This is just a quick overview. Refer to the pip website for the full details.