Built-in Data Types
Everything you do with a computer is managing data. Data comes in many different shapes and flavors. It's the music you listen to, the movies you stream, the PDFs you open. Even the source of the chapter you're reading at this very moment is just a file, which is data.
Data can be simple, like an integer number representing an age, or complex, like an order placed on a website. It can be about a single object or about a collection of them. Data can even be about data, that is, metadata: data that describes the design of other data structures, or data that describes application data or its context. In Python, objects are abstractions for data, and Python has an amazing variety of data...
Everything is an object
Before we delve into the specifics, I want you to be very clear about objects in Python, so let's talk a little bit more about them. As we already said, everything in Python is an object. But what really happens when you type an instruction like age = 42 in a Python module?
So, what happens is that an object is created. It gets an id, the type is set to int (integer number), and the value to 42. A name age is placed in the global namespace, pointing to that object. Therefore, whenever we are in the global namespace, after the execution of that line, we can retrieve that object by simply accessing it through its name...
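As a quick illustration of the three things described above (identity, type, and value), the following minimal sketch inspects the object behind the name age. The exact number printed by id() is an implementation detail and will differ from run to run.

```python
# A minimal sketch: inspect the identity, type, and value of an object.
age = 42

print(id(age))    # identity: an integer that is unique while the object is alive
print(type(age))  # type: <class 'int'>
print(age)        # value: 42

# When this runs at module level, the name is stored in the global namespace.
print("age" in globals())
```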
Mutable or immutable? That is the question
A first fundamental distinction that Python makes on data is whether or not the value of an object can change. If the value can change, the object is called mutable, while if the value cannot change, the object is called immutable.
It is very important that you understand the distinction between mutable and immutable because it affects the code you write, so here's a question:
>>> age = 42
>>> age
42
>>> age = 43 #A
>>> age
43
In the preceding code, on the line #A, have I changed the value of age? Well, no. But now it's 43 (I hear you say...). Yes, it's 43, but 42 was an integer number, of the type int, which is immutable. So, what happened is really that on the first line, age is a name that is set to point to an int object, whose value is 42. When we type age = 43, what happens is that...
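One way to convince yourself that the assignment rebinds the name rather than changing the int object is to keep a second name pointing to the original object. The name old below is hypothetical and only serves this sketch.

```python
# A minimal sketch: rebinding a name does not modify an immutable object.
age = 42
old = age   # a second name pointing to the same int object

age = 43    # rebinds age to a different object; the object holding 42 is untouched

print(old)         # 42
print(age)         # 43
print(old is age)  # False
```

If the int object had been modified in place, old would have shown 43 as well.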
Numbers
Let's start by exploring Python's built-in data types for numbers. Python was designed by a man with a master's degree in mathematics and computer science, so it's only logical that it has amazing support for numbers.
Numbers are immutable objects.
Integers
Python integers have an unlimited range, subject only to the available virtual memory. This means that it doesn't really matter how big the number you want to store is: as long as it can fit in your computer's memory, Python will take care of it. Integer numbers can be positive, negative, or 0 (zero). They support all the basic mathematical operations, as shown in the following example:
>>> a = 14
>>> b = 3
>>>...
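To get a feel for the unlimited range mentioned above, here is a small sketch that builds an integer far larger than a 64-bit machine word. The name big is chosen only for this illustration.

```python
# A minimal sketch: Python integers grow as needed, with no fixed upper bound.
big = 2 ** 100

print(big)               # 1267650600228229401496703205376
print(big.bit_length())  # 101
print(big * big > big)   # True
```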
Immutable sequences
Let's start with immutable sequences: strings, tuples, and bytes.
Strings and bytes
Textual data in Python is handled with str objects, more commonly known as strings. They are immutable sequences of Unicode code points. A Unicode code point can represent a character, but can also have other meanings, such as formatting data. Python, unlike some other languages, doesn't have a char type, so a single character is represented simply by a string of length 1.
Unicode is an excellent way to handle data, and should be used for the internals of any application. When it comes to storing textual data though, or sending it on the network, you may want to encode it, using an appropriate encoding for...
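The round trip between str and bytes can be sketched as follows. UTF-8 is used here only as an assumption for the example; the right encoding depends on where the data is going.

```python
# A minimal sketch: encode a string to bytes and decode it back.
text = "caf\u00e9"               # four code points; the last one is e with an acute accent

encoded = text.encode("utf-8")   # str -> bytes
print(encoded)                   # b'caf\xc3\xa9'
print(len(text), len(encoded))   # 4 5

decoded = encoded.decode("utf-8")  # bytes -> str
print(decoded == text)             # True
```

Note that the accented character takes two bytes in UTF-8, which is why the bytes object is longer than the string.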
Mutable sequences
Mutable sequences differ from their immutable sisters in that they can be changed after creation. There are two mutable sequence types in Python: lists and byte arrays. I said before that the dictionary is the king of data structures in Python. I guess this makes the list its rightful queen.
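To make "changed after creation" concrete, the sketch below modifies a list and a byte array in place and checks that each is still the same object afterward. The variable names are hypothetical.

```python
# A minimal sketch: mutable sequences are modified in place.
numbers = [1, 2, 3]
before = id(numbers)

numbers.append(4)   # grow the list
numbers[0] = 10     # replace an item

print(numbers)                # [10, 2, 3, 4]
print(id(numbers) == before)  # True: still the same object

name = bytearray(b"Lina")
name[0] = ord("T")            # items of a bytearray are integers in range(256)
print(name)                   # bytearray(b'Tina')
```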
Lists
Python lists are mutable sequences. They are very similar to tuples, but they don't have the restrictions of immutability. Lists are commonly used to store collections of homogeneous objects, but there is nothing preventing you from storing heterogeneous collections as well. Lists can be created in many different ways. Let's see an example:
>>> [] # empty list
[]
>>> list() # same as...
Set types
Python also provides two set types, set and frozenset. The set type is mutable, while frozenset is immutable. They are unordered collections of hashable objects. Hashability is a characteristic that allows an object to be used as a set member as well as a key for a dictionary, as we'll see very soon.
An object is hashable if it has a hash value that never changes during its lifetime and it can be compared to other objects. Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally. Most of Python's immutable built-in objects are hashable, while mutable containers such as lists and dictionaries are not; an immutable container such as a tuple is hashable only if all its elements are.
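A quick way to check hashability is to call the built-in hash() function. The helper is_hashable below is hypothetical, written only for this sketch.

```python
# A minimal sketch: probing which objects are hashable.
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable(42))         # True
print(is_hashable("hello"))    # True
print(is_hashable((1, 2)))     # True
print(is_hashable([1, 2]))     # False: lists are mutable
print(is_hashable({"a": 1}))   # False: dictionaries are mutable
print(is_hashable((1, [2])))   # False: a tuple is hashable only if all its items are
```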
Objects that compare equal must have the same hash value. Sets are very commonly used to test for membership, so let's introduce the in operator in the following example:
>...
Mapping types – dictionaries
Of all the built-in Python data types, the dictionary is easily the most interesting one. It's the only standard mapping type, and it is the backbone of every Python object.
A dictionary maps keys to values. Keys need to be hashable objects, while values can be of any arbitrary type. Dictionaries are mutable objects. There are quite a few different ways to create a dictionary, so let me give you a simple example of how to create a dictionary equal to {'A': 1, 'Z': -1} in five different ways:
>>> a = dict(A=1, Z=-1)
>>> b = {'A': 1, 'Z': -1}
>>> c = dict(zip(['A', 'Z'], [1, -1]))
>>> d = dict([('A', 1), ('Z', -1)])
>>> e = dict({'Z': -1, 'A': 1})
>>> a == b == c == d == e # are they all the...
The collections module
When Python's general-purpose built-in containers (tuple, list, set, and dict) aren't enough, we can find specialized container data types in the collections module. They are:
Data type | Description |
namedtuple() | Factory function for creating tuple subclasses with named fields |
deque | List-like container with fast appends and pops on either end |
ChainMap | Dictionary-like class for creating a single view of multiple mappings |
Counter | Dictionary subclass for counting hashable objects |
OrderedDict | Dictionary subclass that remembers the order entries were added |
defaultdict | Dictionary subclass that calls a factory function to supply missing values |
UserDict | Wrapper around dictionary objects for easier dictionary subclassing |
UserList | Wrapper around list objects for easier list subclassing |
UserString | Wrapper around... |
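To give a taste of a few of the types listed above, here is a short sketch. The names Point, queue, letters, groups, and settings, as well as the sample data, are hypothetical and exist only for this illustration.

```python
# A minimal sketch of some collections types in action.
from collections import ChainMap, Counter, defaultdict, deque, namedtuple

# namedtuple: a tuple subclass whose fields can be accessed by name
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)
print(p.x, p[1])  # 1 2

# deque: fast appends and pops on either end
queue = deque([2, 3])
queue.appendleft(1)
queue.append(4)
print(queue)  # deque([1, 2, 3, 4])

# Counter: counts hashable objects
letters = Counter("banana")
print(letters.most_common(1))  # [('a', 3)]

# defaultdict: the factory (list) supplies a value for missing keys
groups = defaultdict(list)
groups["even"].append(2)
print(groups["odd"])  # []

# ChainMap: a single view over several mappings, searched left to right
settings = ChainMap({"debug": True}, {"debug": False, "port": 8000})
print(settings["debug"], settings["port"])  # True 8000
```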
Enums
Technically not a built-in data type, as you have to import them from the enum module, but definitely worth mentioning, are enumerations. They were introduced in Python 3.4, and though it is not that common to see them in professional code (yet), I thought I'd give you an example anyway.
The official definition goes like this: "An enumeration is a set of symbolic names (members) bound to unique, constant values. Within an enumeration, the members can be compared by identity, and the enumeration itself can be iterated over."
Say you need to represent traffic lights. In your code, you might resort to doing this:
>>> GREEN = 1
>>> YELLOW = 2
>>> RED = 4
>>> TRAFFIC_LIGHTS = (GREEN, YELLOW, RED)
>>> # or with a dict
>>> traffic_lights = {'GREEN': 1, 'YELLOW': 2, 'RED': 4}
There's...
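For comparison, the same traffic lights could be sketched with the enum module as follows. The class name TrafficLight is an assumption made for this example; the member values mirror the constants shown earlier.

```python
# A minimal sketch: traffic lights as an enumeration.
from enum import Enum

class TrafficLight(Enum):
    GREEN = 1
    YELLOW = 2
    RED = 4

print(TrafficLight.GREEN)        # TrafficLight.GREEN
print(TrafficLight.GREEN.name)   # GREEN
print(TrafficLight.GREEN.value)  # 1
print(TrafficLight(4))           # TrafficLight.RED (lookup by value)

# Members can be compared by identity, and the enumeration can be iterated over.
print(TrafficLight["YELLOW"] is TrafficLight.YELLOW)  # True
print([light.name for light in TrafficLight])         # ['GREEN', 'YELLOW', 'RED']
```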
Final considerations
That's it. Now you have seen a very good proportion of the data structures that you will use in Python. I encourage you to take a dive into the Python documentation and experiment further with each and every data type we've seen in this chapter. It's worth it, believe me. Everything you'll write will be about handling data, so make sure your knowledge about it is rock solid.
Before we leap into Chapter 3, Iterating and Making Decisions, I'd like to share some final considerations about different aspects that to my mind are important and not to be neglected.
Small values caching
When we discussed objects at the beginning of this chapter, we saw that when we assigned a name to an...
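The following sketch hints at what caching of small values can look like in practice. It assumes CPython, which keeps a cache of small integers; whether two equal numbers are also the same object is an implementation detail, so the identity results are deliberately left without a promised output.

```python
# A minimal sketch, assuming CPython: equal values are not always distinct objects.
a = int("5")
b = int("5")
print(a == b)  # True: the values are equal
print(a is b)  # implementation detail; typically True in CPython (small integer cache)

big_a = int("1000000")
big_b = int("1000000")
print(big_a == big_b)  # True: the values are equal
print(big_a is big_b)  # implementation detail; typically False in CPython
```

The practical takeaway is to compare values with ==, and to reserve is for identity checks such as comparing against None.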
Summary
In this chapter, we've explored the built-in data types of Python. We've seen how many there are and how much can be achieved by just using them in different combinations.
We've seen number types, sequences, sets, mappings, collections (and a special guest appearance by Enum), we've seen that everything is an object, we've learned the difference between mutable and immutable, and we've also learned about slicing and indexing (and, proudly, negative indexing as well).
We've presented simple examples, but there's much more that you can learn about this subject, so stick your nose into the official documentation and explore.
Most of all, I encourage you to try out all the exercises by yourself, get your fingers using that code, build some muscle memory, and experiment, experiment, experiment. Learn what happens when you divide by zero...