If you don't know Python, read this section to learn the fundamentals. Python is a very accessible language and, if you have ever programmed, it will only take you a few minutes to learn the basics.
Open a new notebook and type the following in the first cell:
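The cell contents are not reproduced here; judging from the tip that follows, the line to type is presumably this one:

```python
# Display a short message in the cell's output area.
print("Hello world!")
```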
Tip
Prompt string
Note that the convention chosen in this book is to show Python code (also called the input) prefixed with In [x]: (which shouldn't be typed). This is the standard IPython prompt. Here, you should just type print("Hello world!") and then press Shift + Enter.
Congratulations! You are now a Python programmer.
Let's use Python as a calculator.
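A minimal sketch of such a calculation follows. The name result is our own addition, there only so that the value can be printed outside a notebook; in a notebook cell, typing the bare expression is enough.

```python
# Evaluate an arithmetic expression and keep the result.
result = 2 * 2
print(result)  # 4
```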
Here, 2 * 2 is an expression statement. The operation is performed, the result is returned, and IPython displays it in the notebook cell's output.
Tip
Division
In Python 3, 3 / 2 returns 1.5 (floating-point division), whereas it returns 1 in Python 2 (integer division). This can be a source of errors when porting Python 2 code to Python 3. It is recommended to always use floating-point operands, as in 3.0 / 2.0, for floating-point division, and 3 // 2 for integer division. Both syntaxes work in Python 2 and Python 3. See http://python3porting.com/differences.html#integer-division for more details.
Other built-in mathematical operators include +, -, and ** for exponentiation, among others. You will find more details at https://docs.python.org/3/reference/expressions.html#the-power-operator.
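As a quick illustration, here is a small sketch of these operators, together with the two division operators mentioned in the tip. The variable names are ours, and the values in the comments assume Python 3.

```python
# Basic arithmetic operators on integers.
total = 3 + 2        # addition: 5
difference = 3 - 2   # subtraction: 1
power = 3 ** 2       # exponentiation: 9

# The two division operators in Python 3.
true_division = 3 / 2    # floating-point division: 1.5
floor_division = 3 // 2  # integer (floor) division: 1

print(total, difference, power, true_division, floor_division)
```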
Variables form a fundamental concept of any programming language. A variable has a name and a value. Here is how to create a new variable in Python:
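The original listing is missing here. A minimal assignment might look like this (the name a and the value 2 are illustrative):

```python
# Create a variable named a and give it the value 2.
a = 2
```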
And here is how to use an existing variable:
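For instance, assuming a variable a that holds 2 (redefined here so that the snippet runs on its own):

```python
a = 2
# The current value of a is substituted into the expression.
tripled = a * 3
print(tripled)  # 6
```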
Several variables can be defined at once (this is called unpacking):
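A short sketch of unpacking, with illustrative names and values:

```python
# Each name on the left receives the matching value on the right.
a, b = 2, 6
print(a, b)  # 2 6
```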
There are different types of variables. Here, we have used a number (more precisely, an integer). Other important types include floating-point numbers to represent real numbers, strings to represent text, and booleans to represent True/False values. Here are a few examples:
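The examples themselves are not reproduced here. The following sketch, with names and values of our own choosing, shows one variable of each type:

```python
somefloat = 3.5          # a floating-point number
sometext = "some text"   # a string, delimited by quotes
somebool = True          # a Boolean, either True or False

# Display the type name of each variable.
print(type(somefloat).__name__, type(sometext).__name__, type(somebool).__name__)
```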
Note how we used the # character to write comments. Although Python discards comments completely, adding them to the code is important when the code is to be read by other humans (including yourself in the future).
String escaping refers to the ability to insert special characters in a string. For example, how can you insert ' and ", given that these characters are used to delimit a string in Python code? The backslash \ is the go-to escape character in Python (and in many other languages too). Here are a few examples:
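The original examples are missing. The strings below are our own and sketch the common escape sequences, ending with a raw literal:

```python
single = 'It\'s escaped'      # \' inserts a single quote
double = "She said \"hi\""    # \" inserts a double quote
two_lines = "first\nsecond"   # \n inserts a new line
backslash = "C:\\temp"        # \\ inserts one backslash
raw = r"C:\temp"              # raw literal: backslashes are kept as typed

print(single, double, two_lines, backslash, raw, sep="\n")
```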
The special character \n is the new line (or line feed) character. To insert a backslash, you need to escape it, which explains why it needs to be doubled as \\.
You can also disable escaping by using raw literals with an r prefix before the string, as in the last example above. In this case, backslashes are treated as normal characters.
This is convenient when writing Windows paths, since Windows uses backslash separators instead of forward slashes like on Unix systems. A very common error on Windows is forgetting to escape backslashes in paths: writing "C:\path" may lead to subtle errors.
You will find the list of special characters in Python at https://docs.python.org/3.4/reference/lexical_analysis.html#string-and-bytes-literals.
A list contains a sequence of items. You can concisely instruct Python to perform repeated actions on the elements of a list. Let's first create a list of numbers as follows:
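The listing is not reproduced here; a list like the following would do (the values are illustrative):

```python
# A list of five integers, written with square brackets and commas.
items = [1, 3, 0, 4, 1]
print(items)
```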
Note the syntax we used to create the list: square brackets [], and commas , to separate the items.
The built-in function len() returns the number of elements in a list:
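For example, with an illustrative five-element list:

```python
items = [1, 3, 0, 4, 1]
# len() counts the elements of the list.
count = len(items)
print(count)  # 5
```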
Note
Python comes with a set of built-in functions, including print(), len(), max(), functional routines like filter() and map(), and container-related routines like all(), any(), range(), and sorted(). You will find the full list of built-in functions at https://docs.python.org/3.4/library/functions.html.
Now, let's compute the sum of all elements in the list. Python provides a built-in function for this:
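The built-in in question is presumably sum(). A short sketch, using the same illustrative list:

```python
items = [1, 3, 0, 4, 1]
# sum() adds up all the elements of the list.
total = sum(items)
print(total)  # 9
```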
We can also access individual elements in the list, using the following syntax:
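A sketch of indexing, with an illustrative list and names of our own:

```python
items = [1, 3, 0, 4, 1]
first = items[0]         # first element: 1
last = items[-1]         # last element: 1
penultimate = items[-2]  # second-to-last element: 4
print(first, last, penultimate)
```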
Note that indexing starts at 0 in Python: the first element of the list is indexed by 0, the second by 1, and so on. Also, -1 refers to the last element, -2 to the penultimate element, and so on.
The same syntax can be used to alter elements in the list:
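For instance, replacing the second element of an illustrative list:

```python
items = [1, 3, 0, 4, 1]
# Assigning to an index replaces that element in place.
items[1] = 9
print(items)  # [1, 9, 0, 4, 1]
```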
We can access sublists with the following syntax:
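A minimal sketch of slicing, using an illustrative list:

```python
items = [1, 3, 0, 4, 1]
# Elements at index 1 and 2; index 3 is excluded.
sublist = items[1:3]
print(sublist)  # [3, 0]
```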
Here, 1:3 represents a slice going from element 1 included (this is the second element of the list) to element 3 excluded. Thus, we get a sublist with the second and third elements of the original list. The first-included/last-excluded asymmetry leads to an intuitive treatment of overlaps between consecutive slices. Also, note that slicing a list creates a new list (a shallow copy), not a dynamic view of the original list; assigning to an element of the sublist leaves the original list unchanged.
Python provides several other types of containers:
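The list that followed is missing here. As an assumption on our part, the sketch below shows three commonly used built-in containers (tuples, dictionaries, and sets), which may not be exactly the ones the original text listed:

```python
point = (1, 2)                    # tuple: an immutable sequence
ages = {"alice": 30, "bob": 25}   # dictionary: maps keys to values
unique = {1, 2, 2, 3}             # set: unordered, duplicates are dropped

print(point[0], ages["bob"], sorted(unique))
```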
We can run through all elements of a list using a for loop:
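A minimal sketch, consistent with the notes that follow (the list values are illustrative):

```python
items = [1, 3, 0, 4, 1]
# The indented statement runs once for every element of the list.
for item in items:
    print(item)
```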
There are several things to note here:
- The for item in items syntax means that a temporary variable named item is created at every iteration. This variable contains the value of every item in the list, one at a time.
- Note the colon : at the end of the for statement. Forgetting it will lead to a syntax error!
- The statement print(item) will be executed for all items in the list.
- Note the four spaces before print: this is called the indentation. You will find more details about indentation in the next subsection.
Python supports a concise syntax to perform a given operation on all elements of a list, as follows:
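For example, squaring every number of an illustrative list:

```python
items = [1, 3, 0, 4, 1]
# Build a new list holding the square of each element.
squares = [item * item for item in items]
print(squares)  # [1, 9, 0, 16, 1]
```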
This is called a list comprehension. A new list is created here; it contains the squares of all numbers in the list. This concise syntax leads to highly readable and Pythonic code.
Indentation refers to the spaces that may appear at the beginning of some lines of code. This is a particular aspect of Python's syntax.
In most programming languages, indentation is optional and is generally used to make the code visually clearer. But in Python, indentation also has a syntactic meaning. Particular indentation rules need to be followed for Python code to be correct.
In general, there are two ways to indent some text: by inserting a tab character (also referred to as \t), or by inserting a number of spaces (typically, four). It is recommended to use spaces instead of tab characters. Your text editor should be configured such that the Tab key on the keyboard inserts four spaces instead of a tab character.
In the Notebook, indentation is automatically configured properly; so you shouldn't worry about this issue. The question only arises if you use another text editor for your Python code.
Finally, what is the meaning of indentation? In Python, indentation delimits coherent blocks of code, for example, the contents of a loop, a conditional branch, a function, and other objects. Where other languages such as C or JavaScript use curly braces to delimit such blocks, Python uses indentation.
Sometimes, you need to perform different operations on your data depending on some condition. For example, let's display all even numbers in our list:
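A sketch consistent with the notes that follow, using an illustrative list:

```python
items = [1, 3, 0, 4, 1]
for item in items:
    # item % 2 is 0 exactly when item is even.
    if item % 2 == 0:
        print(item)
```

With this list, the loop prints 0 and then 4.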
Again, here are several things to note:
- An if statement is followed by a boolean expression.
- If a and b are two integers, the modulo operator a % b returns the remainder from the division of a by b. Here, item % 2 is 0 for even numbers, and 1 for odd numbers.
- The equality is represented by a double equal sign == to avoid confusion with the assignment operator = that we use when we create variables.
- Like with the for loop, the if statement ends with a colon :.
- The part of the code that is executed when the condition is satisfied follows the if statement. It is indented. Indentation is cumulative: since this if is inside a for loop, there are eight spaces before the print(item) statement.
Python supports a concise syntax to select all elements in a list that satisfy certain properties. Here is how to create a sublist with only even numbers:
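For example (the list values and the name even are illustrative):

```python
items = [1, 3, 0, 4, 1]
# Keep only the elements that satisfy the condition.
even = [item for item in items if item % 2 == 0]
print(even)  # [0, 4]
```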
This is also a form of list comprehension.
Code is typically organized into functions. A function encapsulates part of your code. Functions allow you to reuse bits of functionality without copy-pasting the code. Here is a function that tells whether an integer number is even or not:
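The listing is missing here. The following sketch is reconstructed from the notes below (an argument named number, a docstring, and the expression number % 2 == 0); the docstring wording is ours:

```python
def is_even(number):
    """Return whether an integer is even or not."""
    return number % 2 == 0
```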
There are several things to note here:
- A function is defined with the def keyword.
- After def comes the function name. A general convention in Python is to only use lowercase characters, and separate words with an underscore _. A function name generally starts with a verb.
- The function name is followed by parentheses, with one or several variable names called the arguments. These are the inputs of the function. There is a single argument here, named number.
- No type is specified for the argument. This is because Python is dynamically typed; you could pass a variable of any type. This function would work fine with floating-point numbers, for example (the modulo operation works with floating-point numbers in addition to integers).
- The body of the function is indented (and note the colon : at the end of the def statement).
- There is a docstring wrapped by triple quotes """. This is a particular form of comment that explains what the function does. It is not mandatory, but it is strongly recommended to write docstrings for the functions exposed to the user.
- The return keyword in the body of the function specifies the output of the function. Here, the output is a Boolean, obtained from the expression number % 2 == 0. It is possible to return several values; just use a comma to separate them (in this case, a tuple would be returned).
Once a function is defined, it can be called like this:
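For instance, with is_even() redefined so that the snippet runs on its own:

```python
def is_even(number):
    """Return whether an integer is even or not."""
    return number % 2 == 0

# Call the function twice, with two different arguments.
results = [is_even(3), is_even(4)]
print(results)  # [False, True]
```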
Here, 3 and 4 are successively passed as arguments to the function.
Positional and keyword arguments
A Python function can accept an arbitrary number of arguments, called positional arguments. It can also accept optional named arguments, called keyword arguments. Here is an example:
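The original listing is missing. From the text that follows, the function takes an optional second argument named divisor that defaults to 2; the function name remainder and its body are assumptions on our part:

```python
def remainder(number, divisor=2):
    """Return the remainder of number divided by divisor."""
    return number % divisor
```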
The second argument of this function, divisor, is optional. If it is not provided by the caller, it will default to the number 2, as shown here:
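A sketch of such a call, using a hypothetical remainder() function whose divisor argument defaults to 2:

```python
def remainder(number, divisor=2):
    """Return the remainder of number divided by divisor."""
    return number % divisor

# divisor is omitted, so it takes its default value 2.
result = remainder(5)
print(result)  # 1
```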
There are two equivalent ways of specifying a keyword argument when calling a function. They are as follows:
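Both forms are sketched here with the same hypothetical remainder() function:

```python
def remainder(number, divisor=2):
    """Return the remainder of number divided by divisor."""
    return number % divisor

by_position = remainder(5, 3)      # 3 is matched to divisor by position
by_name = remainder(5, divisor=3)  # divisor is named explicitly
print(by_position, by_name)  # 2 2
```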
In the first case, 3 is understood as the second argument, divisor. In the second case, the name of the argument is given explicitly by the caller. This second syntax is clearer and less error-prone than the first one.
Functions can also accept arbitrary sets of positional and keyword arguments, using the following syntax:
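A minimal sketch of this syntax. The function name f is illustrative, and returning the two containers is our own choice, made to show what they hold:

```python
def f(*args, **kwargs):
    # args collects the positional arguments, kwargs the keyword arguments.
    return args, kwargs

positional, named = f(1, 2, c=3)
print(positional)  # (1, 2)
print(named)       # {'c': 3}
```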
Inside the function, args is a tuple containing positional arguments, and kwargs is a dictionary containing keyword arguments.
When passing a parameter to a Python function, a reference to the object is actually passed (passage by assignment):
- If the passed object is mutable, it can be modified by the function
- If the passed object is immutable, it cannot be modified by the function
Here is an example:
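The original listing is missing. The sketch below keeps the names add() and my_list used in the discussion that follows; the parameter name target and the values are assumptions:

```python
my_list = [1, 2]

def add(value, target):
    # append() mutates the list object that the caller passed in.
    target.append(value)
    return target

add(3, my_list)
print(my_list)  # [1, 2, 3]
```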
The add() function modifies an object defined outside it (in this case, the object my_list); we say this function has side-effects. A function with no side-effects is called a pure function: it doesn't modify anything in the outer context, and it deterministically returns the same result for any given set of inputs. Pure functions are to be preferred over functions with side-effects.
Knowing this can help you spot subtle bugs. There are further related concepts that are useful to know, including function scopes, naming, binding, and more. Here are a couple of links:
Let's talk about errors in Python. As you learn, you will inevitably come across errors and exceptions. The Python interpreter will most of the time tell you what the problem is, and where it occurred. It is important to understand the vocabulary used by Python so that you can more quickly find and correct your errors.
Let's see the following example:
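The listing and its error output are not reproduced here. The sketch below defines divide() as the text describes, but wraps the call in try/except (our addition) so that the snippet runs to completion:

```python
def divide(a, b):
    return a / b

try:
    divide(1, 0)
except ZeroDivisionError as error:
    # Keep a reference to the exception and show its type and message.
    caught = error
    print(type(error).__name__, error)
```

In a notebook, calling divide(1, 0) directly, without the try/except wrapper, displays the full stack trace instead.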
Here, we defined a divide() function, and called it to divide 1 by 0. Dividing a number by 0 is an error in Python. Here, a ZeroDivisionError exception was raised. An exception is a particular type of error that can be raised at any point in a program. It is propagated from the innards of the code up to the command that launched the code. It can be caught and processed at any point. You will find more details about exceptions at https://docs.python.org/3/tutorial/errors.html, and common exception types at https://docs.python.org/3/library/exceptions.html#bltin-exceptions.
The error message you see contains the stack trace, the exception type, and the exception message. The stack trace shows all function calls between the raised exception and the script calling point.
The top frame, indicated by the first arrow ---->, shows the entry point of the code execution. Here, it is divide(1, 0), which was called directly in the Notebook. The error occurred while this function was called.
The next and last frame is indicated by the second arrow. It corresponds to line 2 in our function divide(a, b). It is the last frame in the stack trace: this means that the error occurred there.
We will see later in this chapter how to debug such errors interactively in IPython and in the Jupyter Notebook. Knowing how to navigate up and down in the stack trace is critical when debugging complex Python code.
Object-oriented programming
Object-oriented programming (OOP) is a relatively advanced topic. Although we won't use it much in this book, it is useful to know the basics. Also, mastering OOP is often essential when you start to have a large code base.
In Python, everything is an object. A number, a string, or a function is an object. An object is an instance of a type (also known as class). An object has attributes and methods, as specified by its type. An attribute is a variable bound to an object, giving some information about it. A method is a function that applies to the object.
For example, the object 'hello' is an instance of the built-in str type (string). The type() function returns the type of an object, as shown here:
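For instance (the name kind is illustrative):

```python
# type() returns the class of which the object is an instance.
kind = type('hello')
print(kind)  # <class 'str'>
```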
There are native types, like str or int (integer), and custom types, also called classes, that can be created by the user.
In IPython, you can discover the attributes and methods of any object with the dot syntax and tab completion. For example, typing 'hello'.u and pressing Tab automatically shows us the existence of the upper() method:
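Tab completion itself is interactive and cannot be shown in a listing, but calling the method looks like this:

```python
# upper() returns a new string; the original is left unchanged.
shouted = 'hello'.upper()
print(shouted)  # HELLO
```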
Here, upper() is a method available to all str objects; it returns an uppercase copy of a string.
A useful string method is format(). This simple and convenient templating system lets you generate strings dynamically, as shown in the following example:
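The original example is missing. The sketch below uses the {0:s} placeholder discussed next, plus a .3f example; the strings and numbers are ours:

```python
# {0:s} is replaced by the first argument, formatted as a string.
greeting = 'Hello {0:s}!'.format('Python')
print(greeting)  # Hello Python!

# .3f displays a number with three decimals.
rounded = '{0:.3f}'.format(3.14159)
print(rounded)  # 3.142
```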
The {0:s} syntax means "replace this with the first argument of format(), which should be a string". The variable type after the colon is especially useful for numbers, where you can specify how to display the number (for example, .3f to display three decimals). The 0 makes it possible to replace a given value several times in a given string. You can also use a name instead of a position, for example 'Hello {name}!'.format(name='Python').
Some methods are prefixed with an underscore _; they are private and are generally not meant to be used directly. IPython's tab completion won't show you these private attributes and methods unless you explicitly type _ before pressing Tab.
In practice, the most important thing to remember is that appending a dot . to any Python object and pressing Tab in IPython will show you a lot of functionality pertaining to that object.
Python is a multi-paradigm language; it notably supports imperative, object-oriented, and functional programming models. Python functions are objects and can be handled like other objects. In particular, they can be passed as arguments to other functions (which are then called higher-order functions). This is the essence of functional programming.
Decorators provide a convenient syntax construct to define higher-order functions. Here is an example using the is_even() function from the previous Functions section:
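The listing is missing. The sketch below follows the description given next (show_output() takes func and returns wrapped()); the printed message and the final return output line are our own additions:

```python
def show_output(func):
    def wrapped(*args, **kwargs):
        # Call the original function, display its result, then pass it on.
        output = func(*args, **kwargs)
        print("The result is:", output)
        return output
    return wrapped
```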
The show_output() function transforms an arbitrary function func() to a new function, named wrapped(), that displays the result of the function, as follows:
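A self-contained sketch of this usage. Both functions are redefined here, and the name f and the printed message are assumptions:

```python
def is_even(number):
    """Return whether an integer is even or not."""
    return number % 2 == 0

def show_output(func):
    def wrapped(*args, **kwargs):
        output = func(*args, **kwargs)
        print("The result is:", output)
        return output
    return wrapped

# Wrap is_even() by hand, then call the wrapped version.
f = show_output(is_even)
answer = f(3)  # prints "The result is: False"
```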
Equivalently, this higher-order function can also be used with a decorator, as follows:
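For instance, with show_output() redefined so that the snippet runs on its own (the decorated function square() is a hypothetical example):

```python
def show_output(func):
    def wrapped(*args, **kwargs):
        output = func(*args, **kwargs)
        print("The result is:", output)
        return output
    return wrapped

# The @ line is shorthand for square = show_output(square).
@show_output
def square(x):
    return x * x

value = square(3)  # prints "The result is: 9"
```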
You can find more information about Python decorators at https://en.wikipedia.org/wiki/Python_syntax_and_semantics#Decorators and at http://www.thecodeship.com/patterns/guide-to-python-function-decorators/.
Let's finish this section with a few notes about Python 2 and Python 3 compatibility issues.
There is still some Python 2 code, and there are still some libraries, that are not compatible with Python 3. Therefore, it is sometimes useful to be aware of the differences between the two versions. One of the most obvious differences is that print is a statement in Python 2, whereas it is a function in Python 3. Therefore, print "Hello" (without parentheses) works in Python 2 but not in Python 3, while print("Hello") works in both Python 2 and Python 3.
There are several non-mutually exclusive options to write portable code that works with both versions:
- __future__: A built-in module that enables backward-incompatible Python 3 syntax in Python 2
- 2to3: A built-in Python module to port Python 2 code to Python 3
- six: An external lightweight library for writing compatible code
Here are a few references:
You now know the fundamentals of Python, the bare minimum that you will need in this book. As you can imagine, there is much more to say about Python.
Following are a few further basic concepts that are often useful and that we cannot cover here, unfortunately. You are highly encouraged to have a look at them in the references given at the end of this section:
- range and enumerate
- pass, break, and continue, to be used in loops
- Working with files
- Creating and importing modules
- The Python standard library provides a wide range of functionality (OS, network, file systems, compression, mathematics, and more)
Here are some slightly more advanced concepts that you might find useful if you want to strengthen your Python skills:
- Regular expressions for advanced string processing
- Lambda functions for defining small anonymous functions
- Generators for controlling custom loops
- Exceptions for handling errors
- with statements for safely handling contexts
- Advanced object-oriented programming
- Metaprogramming for modifying Python code dynamically
- The pickle module for persisting Python objects on disk and exchanging them across a network
Finally, here are a few references: