1.5 Collections
Returning to the book analogy we started when discussing libraries, imagine that you own the seven volumes in the Chronicles of Narnia series by C. S. Lewis.
- The Lion, the Witch and the Wardrobe
- Prince Caspian: The Return to Narnia
- The Voyage of the Dawn Treader
- The Silver Chair
- The Horse and His Boy
- The Magician’s Nephew
- The Last Battle
The order I listed them is the order in which Lewis published the volumes from 1950 through 1956. In the list, the first book is the first published, the second is the second published, and so on. Had someone been maintaining the list in the 1950s, they would have appended a new title when Lewis released a new book in the series.
A programming language needs a structure like this so that you can store and insert objects as above. It shouldn’t surprise you that this is called a list. A list is a collection of objects where the order matters.
If we think of physical books, you can see that lists can contain duplicates. I might have two paperback copies of The Lion, the Witch and the Wardrobe, and three of The Horse and His Boy. The list order might be based on when I obtained the books.
Another simple kind of collection is a set. Sets contain no duplicates, and the order is not specified nor guaranteed. We can do the usual mathematical unions and intersections on our sets. If I try to insert an object into a set, it becomes part of the set only if it is not already present.
The third basic form of collection is called a dictionary or hash table. In a dictionary, we connect a key with a value. In our example, the key could be the title and the value might be the book itself. You see this for eBooks on a tablet or phone. You locate a title within your library, press the text on the screen with your finger, and the app displays the text of the book for you to read. A key must be unique: there can be one and only one key with a given name.
Authors of computer science textbooks typically include collections among data structures. With these three collections mentioned above, you can do a surprising amount of coding. Sometimes you need specialized functionality such as an ordered list that contains no duplicates. Most programming libraries give you access to a broad range of such structures and allow you to create your own.
C. S. Lewis wrote many other books and novels. Another dictionary we could create might use decades like “1930s”, “1940s”, “1950s”, and “1960s” as the keys. The values could be the list of titles in chronological order published within each ten-year span.
Collections can be parts of other collections. For example, I can represent a table of company earnings as a list of lists of numbers and strings. However, some languages require the contained objects all to be of the same type. For example, maybe a list needs to be all integers or all strings. Other languages, including Python, allow you to store different kinds of objects.
Exercise 1.4
In the section on functions, we looked at maximum with two parameters. Think about and describe in words how a similar function that took a single parameter, a list of numbers, might work.
An object which we can change is called mutable. The list of books is mutable while an author is writing and publishing. Once the author passes away, we might lock the collection and declare it immutable. No one can make any further changes. For dictionaries, the keys usually need to be immutable and are frequently strings of text or numbers.
Exercise 1.5
Is a string a collection? If so, should it be mutable or immutable?