Python: Built different
Guido van Rossum, the creator of the Python programming language, was frustrated with the state of computer programming in the late 1980s. Programming languages were too complex and, at the same time, too loose with their formatting requirements. This led to large codebases full of complex scripts that were poorly written and rarely documented.
Merely running a simple program could take a long time, as the code would need to be type-checked (variables declared correctly and assigned to the correct data type) and compiled (converted from high-level code written in text files into the assembly language or machine code understood by the CPU).
As this Dutch programmer had completed professional work on the ABC programming language, where he had learned much about language design, he decided he wanted to turn his gripes about the limits of ABC and other languages into a hobby.
With a master’s degree in mathematics and computer science from the University of Amsterdam, his hobbies tended towards the computer, but he did have a love for Monty Python’s Flying Circus, the British comedy series. So, he combined his passions and created Python, which is now used for all kinds of programmatic solutions. Today Python is everywhere, used to power the internet, kitchen appliances, cars, and so much more. Because of its ubiquity and its simplicity, it has been adopted by the GIS software ecosystem as a standard programming tool.
Thanks to Van Rossum’s extensive experience with the state of computer languages in the 1980s, he was well positioned to create a language that solved many of their deficiencies. He added features that he admired from many other languages and added a few of his own. Here is an incomplete list of Python features built to improve on other languages:
| Issue | Improvement | Python feature |
| --- | --- | --- |
| Memory overrun | Built-in memory management | Garbage collection and memory management |
| Slow compiler times | One-line testing, dynamic typing | Python interpreter |
| Unclear error messages | Messages indicating the offending line and affected code | Error traceback |
| Spaghetti code, i.e. code with unclear internal logic | Clean importation and modularization | Importation |
| Unclear code formatting and spacing, making code unreadable | Indentation rules and reduced brackets | Forced whitespace |
| Too many ways to do something | There should be only one way: the Pythonic way | The Zen of Python, a philosophy of programming that is unique to Python, which expects clean and simple implementations. Type |
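To make the "Error traceback" feature listed above more concrete, here is a minimal sketch of what a traceback contains. The divide function is a hypothetical example, not part of any library; the traceback module is part of the standard library.

```python
import traceback


def divide(a, b):
    """Hypothetical helper used only to trigger an error."""
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError:
    # format_exc() returns the same text Python would print for an
    # unhandled error: the call chain plus the error type and message
    report = traceback.format_exc()
    print(report)
```

The printed report names the function in which the error occurred and ends with the error type, which is usually enough to find the offending line.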
Python versions
The original Python, first released by Van Rossum in 1991, and the Python 1.x versions that followed were eventually superseded by the widely popular Python 2.x. Care was taken to ensure that version 2.0 and beyond were backward-compatible with Python 1.x. However, for the new Python 3.0 and beyond, backward compatibility with Python 1 and Python 2 was broken.
This break has caused a divergence in the Python ecosystem. Some companies chose to stick with Python 2.x, which meant that the “sunset” date, or retirement date, for the older version was extended from 2015 to January 2020, with the final Python 2.7 release following in April 2020. Now that the sunset date has passed, there is no active work by the Python Software Foundation (PSF) on Python 2.x. Python 3.x development continues and will continue into the future, overseen by the PSF.
Van Rossum served as the Benevolent Dictator for Life (BDFL) of the Python language until he stepped down from the role in 2018.
Check out more about the history of Python here: https://docs.python.org/3/faq/general.html
ArcGIS Python versions
Since ArcMap version 9.x, Python has been integrated into the ArcGIS software suite. However, ArcGIS Desktop and ArcGIS Pro now depend on different versions of Python:
- ArcGIS Pro: Python 3.x
ArcGIS Pro, which was designed after the decision to sunset Python 2.x was announced, was divorced from the Python 2.x ecosystem and instead ships with Python 3.x.
Along with the arcpy module, ArcGIS Pro uses the arcgis module, known as the ArcGIS API for Python.
- ArcGIS Desktop: Python 2.x
ArcGIS Desktop (or ArcMap) version 9.0 and above ships with Python 2.x included. The installer for ArcGIS will automatically install Python 2.x and will add the arcpy module (originally arcgisscripting) to the Python system path variable, making it available for scripting. ArcMap, ArcCatalog, ArcGIS Engine, and ArcGIS Server all depend on arcpy and the Python 2.x version included when the ArcGIS Desktop or Enterprise software is installed.
The sunsetting of ArcGIS Desktop has been extended to March 2025, meaning that Python 2.7 will be included by Esri until that time, despite it being officially retired by the Python Software Foundation. With the sunsetting of ArcGIS Desktop approaching, users are now writing scripts in Python 3 to work with ArcGIS Pro.
What is Python?
In short, Python is an application: python.exe. This application is an executable file, meaning it can be run to process lines of code, or it can be called from other applications to run custom scripts. When ArcGIS Pro is installed, Python is also installed on your computer, along with a series of supporting files and folders, at this default location:
C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3
Python includes a large standard library of tools, or modules. These include support for internet requests, advanced math, CSV reading and writing, JSON serialization, and many more modules included in the Python core. While these tools are powerful, Python was also built to be extensible, meaning that third-party modules can be easily added to a Python installation.
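As a small illustration of the standard library, the sketch below reads CSV text and writes JSON using only modules that ship with Python. The place names and elevation values are hypothetical sample data, and an in-memory text buffer stands in for a real file.

```python
import csv
import io
import json

# Hypothetical sample data; io.StringIO behaves like an open text file
csv_text = "name,elevation_m\nDenver,1609\nLeadville,3094\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# csv returns every value as a string, so convert the numbers explicitly
for row in rows:
    row["elevation_m"] = int(row["elevation_m"])

# Pick the row with the largest elevation and serialize it as JSON
highest = max(rows, key=lambda row: row["elevation_m"])
as_json = json.dumps(highest)
print(as_json)
```

No installation step is needed for any of these imports, which is the practical meaning of "included in the Python core."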
The ArcGIS Python modules, arcpy and arcgis, are both good examples of extending the capabilities of Python. There are hundreds of thousands of others, covering almost any type of programming need, of varying quality.
The reference implementation of Python, known as CPython, is written in the programming language C. There are variants of Python written in other languages for a variety of technical reasons, but the most widely used implementation is built on top of C. This means that Python is often expanded through modules built on top of C code, usually for speed improvement reasons.
A Python code layer or wrapper is put on top of C code to make it work with normal Python packages, gaining the simplicity of Python and the processing speed boosts of precompiled C code. NumPy and SciPy (which are included with the ArcGIS installation of Python) are examples of this type of module.
Python is free and open software, which is another reason it is packaged with so many other software applications for automation purposes. While Python is already installed with ArcGIS Pro, it can also be installed separately, using a free installer from the Python Software Foundation.
Check out the Python Software Foundation on the internet: https://www.python.org/psf
Download Python versions directly from the PSF: https://www.python.org/downloads/
Where is it installed?
On Windows machines, Python is not included by default; it will be installed along with ArcGIS Pro or separately using an installer from the Python Software Foundation.
Once the ArcGIS Installer is run, a few versions of Python will be installed. For our use in this book, the main version is the Python 3 virtual environment installed at this folder location:
C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3
Figure 1.1: Structure of the Python folder, containing the python.exe executable
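If you are unsure which Python installation is actually running your code, the interpreter can report its own location. This is a minimal sketch using the standard sys and os modules; the output depends on your machine, so none is shown here.

```python
import os
import sys

# Full path of the executable that is running this code
print(sys.executable)

# Folder that contains the executable, for example the arcgispro-py3 folder
python_folder = os.path.dirname(sys.executable)
print(python_folder)

# Major and minor version numbers of the running interpreter
print(sys.version_info.major, sys.version_info.minor)
```

When run from the ArcGIS Pro environment, the first line printed should point into the arcgispro-py3 folder shown above.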
Python interpreter
When you run python.exe (see below for multiple ways to run the executable), it starts what is known as the Python interpreter.
This is a useful interface, allowing you to enter, one line at a time, bits of code for testing and confirmation. Once the line is entered, hit Enter/Return and the code will be executed. This tool helps you both to learn to code and to test code in the same environment.
Double-clicking on python.exe from the folder or starting Python (command line) from the Start menu will start the interpreter, which allows for one-line commands to be executed:
Figure 1.2: Python interpreter for Python 3.7
What is a Python script?
The python.exe executable file, along with being a program where code can be run, will also execute Python scripts. These scripts are simple text files that can be edited by any text editing software. Python scripts are saved with the .py extension.
When a Python script is run, it is passed as the first command-line argument to the Python executable (python.exe). This program will read and then execute the code from the top to the bottom, as long as it is valid Python and it contains no errors. If an error is encountered, the script will stop and return an error message. If there is no error, nothing will be returned unless you have added print statements to send messages to the Python window as the script is running.
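As an illustration, here is a minimal sketch of a complete script. The file name hello_script.py is hypothetical; any name ending in .py would work. Saved to disk, it could be run by passing the file path to python.exe.

```python
# hello_script.py (hypothetical file name)
# Lines run from top to bottom; print() is the only thing that
# produces visible output while the script runs.
print("Script started")

total = 0
for number in [1, 2, 3]:
    total += number
    print("Running total:", total)

print("Script finished")
```

Without the print calls, the script would still compute total, but nothing would appear in the Python window.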
Executables included
Python comes with two versions of the python.exe file. These are the same version of Python, to be clear, but each file has a different role. python.exe is the main file, and the other version is pythonw.exe. This file will not open an interpreter if double-clicked, as the normal python.exe will. No interpreter is available from pythonw.exe, which is the point: it is used to execute scripts more “silently” than python.exe (for example, when called by another application such as ArcGIS to run a Python script).
Use python.exe to start the interpreter.
Figure 1.3: pythonw.exe in the Python folder
How to call the executable
The Python executable (python.exe) is accessed to run the Python interpreter or to run a custom Python script. There are many different ways to call or start the Python executable:
- Double-click on python.exe ("C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\python.exe"): This starts the Python interpreter.
- Run Python inside ArcGIS Pro: ArcGIS Pro has a built-in Python interpreter that you will use in Chapter 2 to run custom lines of code. In Chapter 3, you will see how to use ArcGIS Pro Notebooks as a way to test, store, and share custom scripts as Notebooks.
- Open IDLE, the included integrated development environment (IDE): It can be run directly:
C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Scripts\idle.exe
In Chapter 2, you will see how to create a shortcut on your Desktop to the IDLE associated with your Python 3.x install.
Figure 1.4: Python applications available through the Start/ArcGIS Menu
If you have ArcGIS Desktop and ArcGIS Pro along with other versions of Python installed, always pay attention to which version of Python you are opening from the Start menu. Not all versions may be associated with ArcGIS and therefore may not have the arcpy module accessible.
- Open a CMD terminal and type python: This only works if the Python executable is in the Windows PATH environment variable. If you get an error that says 'python' is not recognized as an internal or external command, operable program or batch file, the python.exe program is not in the Windows PATH environment variable.
Check out this blog for a discussion on how to add your executable to the Path variable: https://www.educative.io/edpresso/how-to-add-python-to-path-variable-in-windows
- Use a third-party IDE such as PyCharm: Each PyCharm project can have its own virtual environment, and therefore its own executable, or it can use the one installed by Esri when ArcGIS is installed (C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\python). There are a lot of IDEs, but PyCharm is the one we recommend for a variety of reasons: clean interface, easy downloading of modules, built-in virtual environments, and more.
- Use a Jupyter Notebook: This requires the installation of Jupyter, which is not included in the standard Python installation.
You will be using ArcGIS Pro Notebooks starting in Chapter 3. These are based on Jupyter Notebooks and are very similar, but are stored and run in ArcGIS Pro.
- Run Python in the command line by using the whole path to the executable:
"C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\python.exe"
There are multiple ways to directly run the script using the executable, but we find that IDEs make it easier to edit and execute code.
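Calling the executable by its full path can also be done from another program, which is roughly how other applications launch Python. The sketch below is one way to do this with the standard subprocess module. It uses sys.executable (the interpreter running the sketch) as a stand-in for the full arcgispro-py3 path, and the -c option to pass a line of code instead of a script file.

```python
import subprocess
import sys

# sys.executable is a stand-in here for the full path to python.exe
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child interpreter')"],
    capture_output=True,  # collect what the child process prints
    text=True,            # decode the output as text rather than bytes
    check=True,           # raise an error if the child process fails
)
print(result.stdout.strip())
```

To run a script file instead, the "-c" item and the code string would be replaced with the path to a .py file.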
IDLE development environment
The included IDE, called IDLE, is a useful environment that comes standard with every Python instance:
Figure 1.5: The Python IDLE interpreter environment is similar to a shell environment. Code can be run one line at a time.
You can create and execute scripts in this environment easily by opening a new script from the File menu, and then using the script’s Run menu to execute the script:
Figure 1.6: Running a script in IDLE
Windows file path issues
Because Python was developed in a Unix/Linux environment, it expects file paths to use forward slashes (/). However, Windows uses backslashes (\) in its file paths.
Windows:
'C:\Python\python.exe'
Linux:
'C:/Python/python.exe'
This has consequences in a Python script, because of the presence of a number of special string combinations made with backslashes. For instance, to create a tab character in a string, Python uses a combination of a backslash and a “t”: \t.
The backslashes can be escaped; in other words, Python can be told to ignore the special characters in a string, by doubling up the backslash. However, this is inconvenient. The easiest way to address the backslashes inherent in Windows file paths (when passing a shapefile file path to an arcpy function, for instance) is to make them into raw strings by putting an “r” in front of the string.
The following would cause an error when passed to an arcpy function, because of all the \t characters:
'C:\test\test.shp'
To avoid this, you have three options. If you are copying a folder path from Windows Explorer, use an “r” in front of the string to transform it into a raw string:
r'C:\test\test.shp'
You can also use forward slashes:
'C:/test/test.shp'
Escaping the backslashes by doubling them up also works:
'C:\\test\\test.shp'
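The difference between these spellings can be checked directly in the interpreter. The sketch below compares them as plain strings and then uses PureWindowsPath from the standard pathlib module, which is an addition not covered in the text above, to show that the forward-slash form refers to the same Windows path.

```python
from pathlib import PureWindowsPath

unescaped = 'C:\test\test.shp'   # each \t is read as one tab character
raw = r'C:\test\test.shp'        # raw string: backslashes are kept as typed
doubled = 'C:\\test\\test.shp'   # each doubled backslash yields one backslash
forward = 'C:/test/test.shp'     # forward slashes need no special handling

print('\t' in unescaped)         # the unescaped string contains tab characters
print(len(unescaped), len(raw))  # the unescaped string is two characters shorter
print(raw == doubled)            # raw and doubled produce the same string
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```

The raw and doubled forms are identical strings, while the forward-slash form is a different string that Windows path handling treats as the same location.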
The operating system and Python system modules
Two important modules or code libraries built into Python to know about are the os and sys modules. The first, os, is also called the operating system module. The second, sys, is the Python system module. They are used to control Windows system operations and Python system operations respectively.
The os module
The os module is used for many things, including folder path operations such as creating folders, removing folders, checking if a folder or file exists, or executing a file using the operating system-associated application used to run that file extension. Getting the current directory, copying files, and more, are made possible with this module. The os module will be used throughout this book in examples to do all of the above.
In the following code snippet, we first import the os module since we intend to use it. A string, "C:\Test_folder", is passed to the os.path.exists function, which returns a Boolean value (either True or False). If it returns False, the folder does not exist, and is then created using the os.mkdir function:
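import os

# Raw string, so the backslash is not treated as an escape character
folderpath = r"C:\Test_folder"
# Create the folder only if it does not already exist
if not os.path.exists(folderpath):
    os.mkdir(folderpath)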
Read about the os module here: https://www.geeksforgeeks.org/os-module-python-examples/
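A few more of the os operations mentioned above can be sketched as follows. To keep the example safe to run anywhere, a temporary folder created with the standard tempfile module stands in for a real project folder, and the folder names are hypothetical. os.makedirs with exist_ok=True is an alternative to the exists-then-mkdir pattern shown earlier.

```python
import os
import tempfile

# A temporary folder stands in for a real location such as C:\Test_folder
base = tempfile.mkdtemp()

# os.path.join builds a path with the correct separator for the system
folderpath = os.path.join(base, "Test_folder", "outputs")

# makedirs also creates missing parent folders; exist_ok=True means a
# second call does not raise an error when the folder is already there
os.makedirs(folderpath, exist_ok=True)
os.makedirs(folderpath, exist_ok=True)

print(os.path.isdir(folderpath))
print(os.listdir(os.path.join(base, "Test_folder")))
print(os.getcwd())  # the current working directory
```

Building paths with os.path.join also sidesteps the backslash issues described in the previous section.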
The sys module
The sys module, among other functions, allows you to accept arguments to a script at runtime (meaning when the script is executed). This is done by using sys.argv, which is a list containing all arguments passed to Python when the script is executed.
If a name variable is using the sys module to accept parameters, here is what the script looks like:
import sys
name = sys.argv[1]
print(name)
Note again that sys.argv is a list, and the second element in the list (assigned to the variable name above) is the first parameter passed. Python uses zero-based indexing, which we explore in further detail later in the chapter. The first element in the list is the file path of the script being run.
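One practical caveat: if the script is run without any parameters, sys.argv[1] does not exist and Python raises an IndexError. A minimal sketch of a guard is shown below; the get_name function and its fallback text are hypothetical names chosen for illustration.

```python
import sys


def get_name(argv):
    """Return the first script parameter, or a fallback if none was passed."""
    # argv[0] is the script path, so parameters start at index 1
    if len(argv) > 1:
        return argv[1]
    return "no name given"


print(get_name(sys.argv))
```

Passing the list in as a parameter, rather than reading sys.argv inside the function, makes the logic easy to try with sample lists such as ["script.py", "Ada"].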
The system path
The sys module contains the Python path or system path (system in this case means Python). The Python system path, available from the sys module at sys.path, is a list of folders that Python searches, in order, for importable modules. It is built when Python starts from the folder of the script being run, the PYTHONPATH environment variable (if set), and defaults that depend on the installation. If you can’t edit environment variables (due to permissions, usually), you can alter the Python path at runtime using the system path.
The sys.path list is a part of the sys module built into Python:
Figure 1.7: Inspecting the sys.path list
Read more about the sys module here: https://www.geeksforgeeks.org/python-sys-module/
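Altering the Python path at runtime can be sketched as follows. The folder shown is a hypothetical location for your own modules; the folder does not need to exist for the list to be edited, but imports will only succeed if it contains the module you ask for.

```python
import sys

# Hypothetical folder that holds your own modules
custom_folder = r"C:\Projects\my_modules"

# Add the folder only once, so repeated runs do not pile up duplicates
if custom_folder not in sys.path:
    sys.path.append(custom_folder)

print(custom_folder in sys.path)
```

The change lasts only for the current Python session, so it has to be made again each time the script runs.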
We have given you a lot of information about what Python is, how the Python folder is structured, how the Python executable is run, and how to execute and run scripts. This will help you run Python scripts to automate your analyses. In the next section, we will be zooming out to gain a wider view of computer programming.
This will help you to gain more insight into why Python was chosen to be the language of automation for ArcGIS Pro, and help you to be a better programmer in general.
As well as an introduction to Python programming, the rest of the chapter will be a useful reference for you to come back to as you work through the book. If you’d like to get hands-on with writing code straightaway, start with Chapter 2, Basics of ArcPy.