The Python ecosystem
The Python programming environment has two broad subject areas:
- The language itself
- The extension packages. We can further subdivide the extension packages into:
  - The standard library of packages
  - The Python ecosystem of yet more extension packages
When we install Python, we install the language plus several hundred extension packages in the standard library. We'll return to the standard library in Chapter 12, Scripts, Modules, Packages, Libraries, and Applications. The Python ecosystem is potentially infinite. The good news is that PyPI makes it relatively easy to locate packages.
The idea of extensibility via add-ons
Python's design includes a small core language that can be extended by importing additional features. The Language Reference Manual describes 20 statements; there are only 19 operators. The idea is that we can have a great deal of confidence that a small language is correctly implemented, complete, and consistent.
The standard library documentation contains 37 chapters, and describes hundreds of extension packages. There are a lot of features available to help us solve our unique problem. It's typical to see Python programs that import numerous packages from the standard library.
We'll see two common variations of the import statement:
import math
from math import sqrt, sin
The first version imports the entire math module and creates the module as an object in the global namespace. The various class and function names within that module must be properly qualified with the namespace to be used. A qualified name will look similar to math.sqrt() or math.sin().
While the second version also imports the math module, it only introduces the given names into the global namespace. These names do not require qualifiers. We can use sqrt() and sin() as if they were built-in functions. The math module object, however, is not available, since it was not introduced into the global namespace.
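The difference between the two forms can be checked in a short script. This is a minimal sketch rather than anything from the standard library documentation: two private dictionaries (the illustrative names namespace_a and namespace_b) stand in for the global namespace, so that each import form can be examined in isolation.

```python
# Two private namespaces stand in for the global namespace of two scripts.
# The names namespace_a and namespace_b are illustrative only.
namespace_a = {}
namespace_b = {}

# Form 1: binds the module object under the name "math".
exec("import math", namespace_a)

# Form 2: binds only the requested names; "math" itself is not bound.
exec("from math import sqrt, sin", namespace_b)

has_module_a = "math" in namespace_a   # the module object is available
has_module_b = "math" in namespace_b   # the module object is not available
has_sqrt_b = "sqrt" in namespace_b     # the bare name is available

# Both routes reach the very same function object.
same_function = namespace_a["math"].sqrt is namespace_b["sqrt"]

print(has_module_a, has_module_b, has_sqrt_b, same_function)
# prints: True False True True
```

In ordinary code we would simply write the import statement at the top of the module; the exec() calls are here only to keep the two namespaces separate.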
An import happens exactly once. Python tracks the imported modules and will not import a module a second time. This allows us to freely import modules as needed without worrying about the order or other obscure dependencies among modules.
For confirmation of this one-time-only rule for imports, try the following:
>>> import this
>>> import this
The first import prints the module's text (the Zen of Python); the second prints nothing, because the module has already been imported once and is not run again.
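The one-time-only rule can also be observed from inside a program. The following sketch relies on sys.modules, the dictionary in which Python records every module it has already imported; the choice of the json module is arbitrary.

```python
import importlib
import sys

# The first call loads the module if it has not been loaded yet.
first = importlib.import_module("json")

# The second call finds the module in the cache; its code is not run again.
second = importlib.import_module("json")

is_cached = "json" in sys.modules
same_object = first is second and first is sys.modules["json"]

print(is_cached, same_object)
# prints: True True
```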
Using the Python Package Index – PyPI
Many developers of Python modules will register their work with the PyPI. This is located at http://pypi.python.org/. This is the second place to look for a module that might help solve a particular problem.
The first place to look is always the standard library.
The PyPI web page has a handy search form as well as a browser that shows packages organized under nine different metadata variables. In many cases, a book or blog post may provide a direct path like this: https://pypi.python.org/pypi/Sphinx/1.3b2. This ensures that the proper version can be downloaded and installed.
There are three common ways to download and install software from the PyPI:
- Using pip
- Using easy_install
- Manually
Generally, we'll use tools such as pip or easy_install for almost all of our installations. Once in a while, however, we may need to resort to a manual installation.
Some modules may involve binary extensions to Python. These are generally C-language sources, so they must be compiled to be useful. For Windows, where C compilers are rare, it's often necessary to find an .msi installer that includes prebuilt binaries. For Mac OS X and Linux, the C source may be compiled as part of the installation process.
In the case of large, complex numeric and scientific packages (specifically, numpy and scipy), the build process can become quite complex: generally, more complex than pip or easy_install can handle. There are many additional high-performance libraries for these packages; the builds include modules in FORTRAN as well as C. In this case, a prebuilt OS-specific distribution is used; pip isn't part of the process.
Installing additional packages will require administrator privileges. Consequently, we'll show the sudo command as a reminder that this is required for Mac OS X and Linux. Windows users can simply ignore the presence of the sudo command.
Using pip to gather modules
The pip program is bundled with Python 3.4 as an add-on for Python 3. To use pip to install a package, we generally use a command such as the following:
prompt$ sudo pip3.4 install some-package
For Mac OS X or Linux, we need to use the sudo command so that we have administrator privileges. Windows users will leave this off.
The pip program will search PyPI for the package named some-package. The installed Python version and OS information will be used to locate the latest-and-greatest version that's appropriate for the platform. The files will be downloaded, and the Python setup.py file that comes with the package will be run automatically to install it.
For Mac OS X and Linux users, it's helpful to note that the version of Python that is required by the OS doesn't usually have pip configured. A Mac OS X user with the built-in Python 2.7 and Python 3.4 can generally use the default pip command without any problems because there won't be a version of pip configured for Python 2.
In the case where someone has Python 3.3 and Python 3.4, and has installed pip for Python 3.3, they will have to choose which version they want to work with. Using the commands pip3.3 or pip3.4 will use the pip command configured for the given version of Python. The default pip command may link to whichever version was installed last, which is something we shouldn't guess at.
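One way to avoid guessing is to run pip as a module of a specific interpreter, using that interpreter's -m option. The sketch below only builds such a command and does not run it; the helper name pip_install_command is hypothetical, and it assumes pip is installed for the interpreter in question.

```python
import sys


def pip_install_command(package, python=sys.executable):
    """Build (but do not run) a pip command bound to one interpreter.

    pip_install_command is a hypothetical helper, not part of pip.
    By default it targets the interpreter running this script.
    """
    return [python, "-m", "pip", "install", package]


command = pip_install_command("some-package")
print(" ".join(command))

# To actually perform the installation, the list could be handed to
# subprocess.check_call(command), with administrator privileges if needed.
```

Because the interpreter is named explicitly, the package lands in that interpreter's library no matter which version the default pip command happens to point at.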
The pip program has a number of additional features to uninstall packages and track which packages have been added to the initial Python installation. The pip program can also create installable packages of your own creation.
Using easy_install to add modules
The easy_install program is also part of Python 3.4; it belongs to the setuptools package. We use easy_install like this to install a package:
prompt$ sudo easy_install-3.4 some-package
For Mac OS X or Linux, we need to use the sudo command so that we have administrator privileges. Windows users will leave this off.
The easy_install program is similar to pip: it will search PyPI for the package named some-package. The installed Python version and OS information will be used to locate a version that's appropriate for the platform. The files will be downloaded. One of these files is the setup.py script; this will be run automatically to finish the installation.
Installing modules manually
In rare cases, we may have a package that isn't in the PyPI and can't be located by pip or easy_install. In this case, we generally have a two- or three-step installation process:
- Download: We need to securely download the package. In many cases, we can use https or ftps so that secure sockets are used. In case we can't secure the connection, we may have to check MD5 checksums on the files to be sure that our download is complete and unaltered.
- Unpack: If the Python packages are compressed into a single ZIP or TAR file, we need to unzip or untar the downloaded file into a temporary directory.
- Set up: Many Python packages designed for manual installation include a setup.py file that will do the final installation. We'll need to run a command like this:
sudo python3 setup.py install
This sequence of steps, including the final command, is what is automated by pip and easy_install. We've shown the Mac OS X and Linux use of the sudo command to ensure that administrator privileges are available. Windows users will simply leave this off.
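The download check and the unpack step of a manual installation can be sketched with the standard library alone. The two helper names below, md5_matches and unpack_to_temp, are hypothetical, and the sketch assumes the package's author has published an MD5 digest for the file.

```python
import hashlib
import shutil
import tempfile


def md5_matches(path, expected_hex):
    """Return True if the file's MD5 digest equals the published one."""
    digest = hashlib.md5()
    with open(path, "rb") as source:
        # Read in chunks so a large archive doesn't have to fit in memory.
        for chunk in iter(lambda: source.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


def unpack_to_temp(path):
    """Unpack a ZIP or TAR archive into a fresh temporary directory."""
    target = tempfile.mkdtemp(prefix="manual-install-")
    # The archive format is inferred from the file name's extension.
    shutil.unpack_archive(path, target)
    return target


# Typical use, with a hypothetical file name and digest:
#     if md5_matches("some-package.tar.gz", published_digest):
#         directory = unpack_to_temp("some-package.tar.gz")
```

The unpacked directory is where we would then run the setup.py command. If a project publishes a different kind of digest, hashlib offers other algorithms, such as hashlib.sha256(), that can be used in the same way.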
The setup.py script uses Python's distutils package to define what must be installed into the Python library directory structure. The install option states what we want to do with the package we downloaded. Most of the time, we're going to install, so this is one of the most common options.
In rare cases, a package may consist of a single module file. There may not be a setup.py file. In this case, we will manually copy the file to our own site-packages directory.
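To find out where that directory lives for a given interpreter, we can ask Python itself. This is a small sketch using the sysconfig module; the exact path, and on some Linux distributions even the directory's name, varies from one installation to another.

```python
import sysconfig

# "purelib" is the directory for pure-Python modules installed outside
# the standard library; on most installations this is site-packages.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)

# A single-module package could then be copied there, for example with
# shutil.copy("some_module.py", site_packages), where some_module.py is a
# hypothetical file name. Administrator privileges may be required.
```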