# The Dictionary Python Object Type

In this third part of a four-part series on Python object types, we will wrap up our discussion of lists and introduce you to some remarkable things you can do with dictionaries. This article is excerpted from chapter four of the book Learning Python, Third Edition, written by Mark Lutz (O’Reilly, 2008; ISBN: 0596513984). Copyright © 2008 O’Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O’Reilly Media.

List Comprehensions

In addition to sequence operations and list methods, Python includes a more advanced operation known as a list comprehension expression, which turns out to be a powerful way to process structures like our matrix. Suppose, for instance, that we need to extract the second column of our sample matrix. It’s easy to grab rows by simple indexing because the matrix is stored by rows, but it’s almost as easy to get a column with a list comprehension:

>>> col2 = [row[1] for row in M]     # Collect the items in column 2

>>> col2
[2, 5, 8]
>>> M                               # The matrix is unchanged
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

List comprehensions derive from set notation; they are a way to build a new list by running an expression on each item in a sequence, one at a time, from left to right. List comprehensions are coded in square brackets (to tip you off to the fact that they make a list), and are composed of an expression and a looping construct that share a variable name ( row , here). The preceding list comprehension means basically what it says: “Give me row[1] for each row in matrix M , in a new list.” The result is a new list containing column 2 of the matrix.
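As a minimal sketch of the idea above, the same comprehension can be wrapped in a small helper that pulls out any column. The function name `column` is hypothetical and does not come from the book, and the snippet uses Python 3 syntax (`print` as a function) rather than the Python 2 sessions shown in this article.

```python
# The sample matrix used in this section: a list of three row lists.
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

def column(matrix, index):
    """Return a new list holding item `index` of every row (hypothetical helper)."""
    return [row[index] for row in matrix]

col2 = column(M, 1)
print(col2)   # [2, 5, 8]
print(M)      # the original matrix is unchanged
```

Because the comprehension builds a new list, the rows of `M` are only read, never modified.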

List comprehensions can be more complex in practice:

>>> [row[1] + 1 for row in M]                 # Add 1 to each item in column 2
[3, 6, 9]

>>> [row[1] for row in M if row[1] % 2 == 0]  # Filter out odd items
[2, 8]

The first operation here, for instance, adds 1 to each item as it is collected, and the second uses an if clause to filter odd numbers out of the result using the % modulus expression (remainder of division). List comprehensions make new lists of results, but can be used to iterate over any iterable object—here, for instance, we’ll use list comprehensions to step over a hardcoded list of coordinates, and a string:

>>> diag = [M[i][i] for i in [0, 1, 2]]       # Collect a diagonal from matrix
>>> diag
[1, 5, 9]
>>> doubles = [c * 2 for c in 'spam']         # Repeat characters in a string
>>> doubles
['ss', 'pp', 'aa', 'mm']
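The expression, the loop, and the filter can all appear in one comprehension, and the loop can run over any iterable. The following is an illustrative sketch only, in Python 3 syntax; the matrix is redefined so the snippet runs on its own, and the variable names are made up for the example.

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# range(len(M)) generates the offsets 0, 1, 2 instead of hardcoding them.
diag = [M[i][i] for i in range(len(M))]

# Expression and filter combined: double only the even items in column 2.
even_doubled = [row[1] * 2 for row in M if row[1] % 2 == 0]

# Strings are iterable too: keep the characters that are not vowels.
consonants = [c for c in 'spam' if c not in 'aeiou']

print(diag)           # [1, 5, 9]
print(even_doubled)   # [4, 16]
print(consonants)     # ['s', 'p', 'm']
```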

List comprehensions are a bit too involved for me to say more about them here. The main point of this brief introduction is to illustrate that Python includes both simple and advanced tools in its arsenal. List comprehensions are an optional feature, but they tend to be handy in practice, and often provide a substantial processing speed advantage. They also work on any type that is a sequence in Python, as well as some types that are not. You’ll hear more about them later in this book.

Dictionaries

Python dictionaries are something completely different (Monty Python reference intended)—they are not sequences at all, but are instead known as mappings. Mappings are also collections of other objects, but they store objects by key instead of by relative position. In fact, mappings don’t maintain any reliable left-to-right order; they simply map keys to associated values. Dictionaries, the only mapping type in Python’s core objects set, are also mutable: they may be changed in-place, and can grow and shrink on demand, like lists.

Mapping Operations

When written as literals, dictionaries are coded in curly braces, and consist of a series of “key: value” pairs. Dictionaries are useful anytime we need to associate a set of values with keys—to describe the properties of something, for instance. As an example, consider the following three-item dictionary (with keys “food,” “quantity,” and “color”):

>>> D = {'food': 'Spam', 'quantity': 4, 'color': 'pink'}

We can index this dictionary by key to fetch and change the keys’ associated values. The dictionary index operation uses the same syntax as that used for sequences, but the item in the square brackets is a key, not a relative position:

>>> D['food']               # Fetch value of key 'food'
'Spam'

>>> D['quantity'] += 1      # Add 1 to 'quantity' value
>>> D
{'food': 'Spam', 'color': 'pink', 'quantity': 5}

Although the curly-braces literal form does see use, it is perhaps more common to see dictionaries built up in different ways. The following, for example, starts with an empty dictionary and fills it out one key at a time. Unlike out-of-bounds assignments in lists, which are forbidden, an assignment to a new dictionary key creates that key:

>>> D = {}
>>> D['name'] = 'Bob'       # Create keys by assignment
>>> D['job'] = 'dev'
>>> D['age'] = 40

>>> D
{'age': 40, 'job': 'dev', 'name': 'Bob'}

>>> print D['name']
Bob

Here, we’re effectively using dictionary keys as field names in a record that describes someone. In other applications, dictionaries can also be used to replace searching operations—indexing a dictionary by key is often the fastest way to code a search in Python.
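To illustrate how a dictionary can stand in for a search, the sketch below stores the same made-up data two ways: as a list of pairs that has to be scanned, and as a dictionary that is indexed directly. The names `pairs`, `find_job`, and `jobs`, and the sample people other than Bob, are hypothetical; the code uses Python 3 syntax.

```python
# Records stored as a list of (name, job) pairs must be searched item by item.
pairs = [('Bob', 'dev'), ('Sue', 'mgr'), ('Ann', 'ops')]

def find_job(name):
    """Linear search: scan the pairs until the name matches (hypothetical helper)."""
    for person, job in pairs:
        if person == name:
            return job
    return None

# The same data as a dictionary: indexing by key replaces the loop.
jobs = dict(pairs)

print(find_job('Sue'))   # mgr
print(jobs['Sue'])       # mgr
```

Both calls return the same answer; the dictionary version simply has no search loop to write.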

Nesting Revisited

In the prior example, we used a dictionary to describe a hypothetical person, with three keys. Suppose, though, that the information is more complex. Perhaps we need to record a first name and a last name, along with multiple job titles. This leads to another application of Python’s object nesting in action. The following dictionary, coded all at once as a literal, captures more structured information:

>>> rec = {'name': {'first': 'Bob', 'last': 'Smith'},
           'job':  ['dev', 'mgr'],
           'age':  40.5}

Here, we again have a three-key dictionary at the top (keys “name,” “job,” and “age”), but the values have become more complex: a nested dictionary for the name to support multiple parts, and a nested list for the job to support multiple roles and future expansion. We can access the components of this structure much as we did for our matrix earlier, but this time some of our indexes are dictionary keys, not list offsets:

>>> rec['name']                       # 'Name' is a nested dictionary
{'last': 'Smith', 'first': 'Bob'}

>>> rec['name']['last']               # Index the nested dictionary
'Smith'

>>> rec['job']                        # 'Job' is a nested list
['dev', 'mgr']

>>> rec['job'][-1]                    # Index the nested list
'mgr'

>>> rec['job'].append('janitor')      # Expand Bob's job description in-place
>>> rec
{'age': 40.5, 'job': ['dev', 'mgr', 'janitor'], 'name': {'last': 'Smith', 'first': 'Bob'}}

Notice how the last operation here expands the nested job list—because the job list is a separate piece of memory from the dictionary that contains it, it can grow and shrink freely (object memory layout will be discussed further later in this book).

The real reason for showing you this example is to demonstrate the flexibility of Python’s core data types. As you can see, nesting allows us to build up complex information structures directly and easily. Building a similar structure in a low-level language like C would be tedious and require much more code: we would have to lay out and declare structures and arrays, fill out values, link everything together, and so on. In Python, this is all automatic—running the expression creates the entire nested object structure for us. In fact, this is one of the main benefits of scripting languages like Python.
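The point that the nested list is its own object, separate from the dictionary that holds it, can be checked with a short sketch. It uses Python 3 syntax, and the extra variable name `jobs` is introduced only for illustration.

```python
# One expression builds the whole nested structure.
rec = {'name': {'first': 'Bob', 'last': 'Smith'},
       'job': ['dev', 'mgr'],
       'age': 40.5}

jobs = rec['job']            # a second reference to the same nested list
jobs.append('janitor')       # grow the list in place through that reference

print(rec['job'])            # ['dev', 'mgr', 'janitor']
print(jobs is rec['job'])    # True: one list object, two references
print(rec['name']['last'])   # Smith
```

The dictionary stores only a reference to the list, which is why the change made through `jobs` is visible through `rec` as well.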

Just as importantly, in a lower-level language, we would have to be careful to clean up all of the object’s space when we no longer need it. In Python, when we lose the last reference to an object (by assigning its variable to something else, for example), all of the memory space occupied by that object’s structure is automatically cleaned up for us:

>>> rec = 0                # Now the object’s space is reclaimed

Technically speaking, Python has a feature known as garbage collection that cleans up unused memory as your program runs and frees you from having to manage such details in your code. In Python, the space is reclaimed immediately, as soon as the last reference to an object is removed. We’ll study how this works later in this book; for now, it’s enough to know that you can use objects freely, without worrying about creating their space or cleaning up as you go.
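One rough way to observe this reclamation is with a weak reference, which lets us ask whether an object still exists without keeping it alive. This sketch assumes the standard CPython interpreter, where reference counting frees an object as soon as its last reference disappears; other Python implementations may reclaim the space later. The `Record` class and the name `probe` are hypothetical.

```python
import weakref

class Record(object):
    """Hypothetical stand-in for a larger object structure."""
    pass

rec = Record()
probe = weakref.ref(rec)      # a weak reference does not keep the object alive

alive_before = probe() is not None
rec = 0                       # drop the last ordinary reference
alive_after = probe() is not None

print(alive_before, alive_after)   # True False on CPython
```

A small class is used here because built-in dictionaries and lists cannot be weakly referenced directly.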

Sorting Keys: for Loops

As mappings, as we’ve already seen, dictionaries only support accessing items by key. However, they also support type-specific operations with method calls that are useful in a variety of common use cases.

As mentioned earlier, because dictionaries are not sequences, they don’t maintain any dependable left-to-right order. This means that if we make a dictionary, and print it back, its keys may come back in a different order than how we typed them:

>>> D = {'a': 1, 'b': 2, 'c': 3}
>>> D
{'a': 1, 'c': 3, 'b': 2}

What do we do, though, if we do need to impose an ordering on a dictionary’s items? One common solution is to grab a list of keys with the dictionary keys method, sort that with the list sort method, and then step through the result with a Python for loop:

>>> Ks = D.keys()            # Unordered keys list
>>> Ks
['a', 'c', 'b']

>>> Ks.sort()                # Sorted keys list
>>> Ks
['a', 'b', 'c']

>>> for key in Ks:           # Iterate through sorted keys
        print key, '=>', D[key]

a => 1
b => 2
c => 3

This is a three-step process. As we’ll see in later chapters, though, in recent versions of Python it can be done in one step with the newer sorted built-in function (sorted returns the result and sorts a variety of object types):

>>> D
{'a': 1, 'c': 3, 'b': 2}

>>> for key in sorted(D):
        print key, '=>', D[key]

a => 1
b => 2
c => 3

This case serves as an excuse to introduce the Python for loop. The for loop is a simple and efficient way to step through all the items in a sequence and run a block of code for each item in turn. A user-defined loop variable (key, here) is used to reference the current item each time through. The net effect in our example is to print the unordered dictionary’s keys and values, in sorted-key order.
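For readers on Python 3, the two approaches look roughly like the sketch below. Two differences from the sessions above are worth flagging: `print` is a function, and `keys` returns a view object rather than a list, so it is wrapped in `list` before sorting in place. The variable name `lines` is introduced only for this example.

```python
D = {'a': 1, 'c': 3, 'b': 2}

# Three steps: fetch the keys, sort the list in place, then loop over it.
Ks = list(D.keys())
Ks.sort()

# One step: sorted() accepts the dictionary directly and returns a new list.
lines = []
for key in sorted(D):
    lines.append('%s => %s' % (key, D[key]))

print(Ks)                 # ['a', 'b', 'c']
print('\n'.join(lines))
```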

The for loop, and its more general cousin the while loop, are the main ways we code repetitive tasks as statements in our scripts. Really, though, the for loop, like its relative the list comprehension (which we met earlier), is a sequence operation. It works on any object that is a sequence and, also like the list comprehension, even on some things that are not. Here, for example, it is stepping across the characters in a string, printing the uppercase version of each as it goes:

>>> for c in 'spam':
        print c.upper()

S
P
A
M

We’ll discuss looping statements further later in the book.

Iteration and Optimization

If the for loop looks like the list comprehension expression introduced earlier, it should: both are really general iteration tools. In fact, both will work on any object that follows the iteration protocol—an idea introduced recently in Python that essentially means a physically stored sequence in memory, or an object that generates one item at a time in the context of an iteration operation. This is why the sorted call used in the prior section works on the dictionary directly—we don’t have to call the keys method to get a sequence because dictionaries are iterable objects.
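The two kinds of iterable object described above can be sketched by driving the protocol by hand with the built-in `iter` and `next` functions. This is an illustration in Python 3 syntax; the generator function `squares_up_to` is hypothetical.

```python
D = {'a': 1, 'c': 3, 'b': 2}

# A stored collection: ask the dictionary for an iterator over its keys.
it = iter(D)
first = next(it)            # fetch one key
rest = list(it)             # drain whatever the iterator has left

def squares_up_to(n):
    """Hypothetical generator: produces one square at a time, on demand."""
    for x in range(1, n + 1):
        yield x ** 2

print(sorted([first] + rest))            # ['a', 'b', 'c']
print([s for s in squares_up_to(5)])     # [1, 4, 9, 16, 25]
```

Both the for loop and the list comprehension make these same `iter` and `next` requests behind the scenes.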

I’ll have more to say about the iteration protocol later in this book. For now, keep in mind that any list comprehension expression, such as this one, which computes the squares of a list of numbers:

>>> squares = [x ** 2 for x in [1, 2, 3, 4, 5]]
>>> squares
[1, 4, 9, 16, 25]

can always be coded as an equivalent for loop that builds the result list manually by appending as it goes:

>>> squares = []
>>> for x in [1, 2, 3, 4, 5]:         # This is what a list comp does
        squares.append(x ** 2)

>>> squares
[1, 4, 9, 16, 25]

The list comprehension, though, will generally run faster (perhaps even twice as fast)—a property that could matter in your programs for large data sets. Having said that, though, I should point out that performance measures are tricky business in Python because it optimizes so much, and can vary from release to release.

A major rule of thumb in Python is to code for simplicity and readability first, and worry about performance later, after your program is working, and after you’ve proved that there is a genuine performance concern. More often than not, your code will be quick enough as it is. If you do need to tweak code for performance, though, Python includes tools to help you out, including the time and timeit modules and the profile module. You’ll find more on these later in this book, and in the Python manuals.
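As a hedged sketch of how the timeit module might be used to compare the two codings of the squares example, the snippet below times each version as a function. The function names, the list size, and the repeat count are arbitrary choices for illustration, and the timings will differ from machine to machine and release to release, so no particular result is promised.

```python
import timeit

def with_comprehension():
    """Build the squares with a list comprehension."""
    return [x ** 2 for x in range(1000)]

def with_loop():
    """Build the same list with a for loop and append."""
    squares = []
    for x in range(1000):
        squares.append(x ** 2)
    return squares

# Run each function 200 times and report the total elapsed seconds.
t_comp = timeit.timeit(with_comprehension, number=200)
t_loop = timeit.timeit(with_loop, number=200)

print('comprehension: %.4f s' % t_comp)
print('for loop:      %.4f s' % t_loop)
```

Measuring before optimizing, as recommended above, keeps the decision grounded in numbers from your own interpreter rather than in general claims.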