Tuples and Other Python Object Types

In this conclusion to a four-part article series on Python object types, we will finish our discussion of dictionaries, move on to tuples, and cover related material. This article is excerpted from chapter four of the book Learning Python, Third Edition, written by Mark Lutz (O’Reilly, 2008; ISBN: 0596513984). Copyright © 2008 O’Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O’Reilly Media.

Missing Keys: if Tests

One other note about dictionaries before we move on. Although we can assign to a new key to expand a dictionary, fetching a nonexistent key is still a mistake:

  >>> D
  {'a': 1, 'c': 3, 'b': 2}

  >>> D['e'] = 99                 # Assigning new keys grows dictionaries
  >>> D
  {'a': 1, 'c': 3, 'b': 2, 'e': 99}

  >>> D['f']                      # Referencing one is an error
  …error text omitted…
  KeyError: 'f'

This is what we want—it’s usually a programming error to fetch something that isn’t really there. But, in some generic programs, we can’t always know what keys will be present when we write our code. How do we handle such cases and avoid the errors? One trick here is to test ahead of time. The dictionary has_key method allows us to query the existence of a key and branch on the result with a Python if statement:

  >>> D.has_key('f')
  False

  >>> if not D.has_key('f'):
          print 'missing'

  missing

I’ll have much more to say about the if statement and statement syntax in general later in this book, but the form we’re using here is straightforward: it consists of the word if, followed by an expression that is interpreted as a true or false result, followed by a block of code to run if the test is true. In its full form, the if statement can also have an else clause for a default case, and one or more elif (else if) clauses for other tests. It’s the main selection tool in Python, and it’s the way we code logic in our scripts.

There are other ways to create dictionaries and avoid accessing a nonexistent dictionary key (including the get method; the in membership expression; and the try statement, a tool we’ll first meet in Chapter 10 that catches and recovers from exceptions altogether), but we’ll save the details on those until a later chapter. Now, let’s move on to tuples.
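As a brief preview of the three alternatives just named (the in membership expression, the get method, and the try statement), here is a minimal sketch. It is written to run under Python 3, where has_key is no longer available; the dictionary contents and the default value 0 are assumptions chosen for illustration.

```python
D = {'a': 1, 'b': 2, 'c': 3}

# 1. The in membership expression tests for a key without fetching it
status = 'present'
if 'f' not in D:
    status = 'missing'

# 2. The get method returns a default instead of raising KeyError
value = D.get('f', 0)             # 0 is an arbitrary default chosen here

# 3. The try statement catches the KeyError after the fact
try:
    fetched = D['f']
except KeyError:
    fetched = None

print(status, value, fetched)
```

All three avoid the KeyError; which one reads best depends on whether a missing key is an expected case or an exceptional one.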

Tuples

The tuple object (pronounced “toople” or “tuhple,” depending on who you ask) is roughly like a list that cannot be changed—tuples are sequences, like lists, but they are immutable, like strings. Syntactically, they are coded in parentheses instead of square brackets, and they support arbitrary types, nesting, and the usual sequence operations:

  >>> T = (1, 2, 3, 4)            # A 4-item tuple
  >>> len(T)                       # Length
  4

  >>> T + (5, 6)                  # Concatenation
  (1, 2, 3, 4, 5, 6)

  >>> T[0]                        # Indexing, slicing, and more
  1

The only real distinction for tuples is that they cannot be changed once created. That is, they are immutable sequences:

  >>> T[0] = 2                    # Tuples are immutable
  …error text omitted…
  TypeError: 'tuple' object does not support item assignment

Why Tuples?

So, why have a type that is like a list, but supports fewer operations? Frankly, tuples are not generally used as often as lists in practice, but their immutability is the whole point. If you pass a collection of objects around your program as a list, it can be changed anywhere; if you use a tuple, it cannot. That is, tuples provide a sort of integrity constraint that is convenient in programs larger than those we can write here. We’ll talk more about tuples later in the book. For now, though, let’s jump ahead to our last major core type, the file.
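The integrity constraint described above can be illustrated with a small sketch. The function name careless_update and the sample data are hypothetical, and the code is written for Python 3; it simply shows that a list passed to other code can be altered there, while a tuple cannot.

```python
def careless_update(items):
    """Hypothetical function that tries to change its argument in place."""
    items[0] = 'changed'

as_list = ['spam', 'eggs']
as_tuple = ('spam', 'eggs')

careless_update(as_list)            # Succeeds: the caller's list is altered

try:
    careless_update(as_tuple)       # Fails: tuples reject item assignment
    outcome = 'changed'
except TypeError:
    outcome = 'protected'

print(as_list, as_tuple, outcome)
```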

Files

File objects are Python code’s main interface to external files on your computer. They are a core type, but they’re something of an oddball—there is no specific literal syntax for creating them. Rather, to create a file object, you call the built-in open function, passing in an external filename as a string, and a processing mode string. For example, to create an output file, you would pass in its name and the ‘w’ processing mode string to write data:

  >>> f = open('data.txt', 'w')   # Make a new file in output mode
  >>> f.write('Hello\n')          # Write strings of bytes to it
  >>> f.write('world\n')
  >>> f.close()                   # Close to flush output buffers to disk

This creates a file in the current directory, and writes text to it (the filename can be a full directory path if you need to access a file elsewhere on your computer). To read back what you just wrote, reopen the file in ‘r’ processing mode, for reading input (this is the default if you omit the mode in the call). Then read the file’s content into a string of bytes, and display it. A file’s contents are always a string of bytes to your script, regardless of the type of data the file contains:

  >>> f = open('data.txt')        # 'r' is the default processing mode
  >>> bytes = f.read()            # Read entire file into a string
  >>> bytes
  'Hello\nworld\n'

  >>> print bytes                 # Print interprets control characters
  Hello
  world

  >>> bytes.split()               # File content is always a string
  ['Hello', 'world']

Other file object methods support additional features we don’t have time to cover here. For instance, file objects provide more ways of reading and writing (read accepts an optional byte size, readline reads one line at a time, and so on), as well as other tools (seek moves to a new file position). We’ll meet the full set of file methods later in this book, but if you want a quick preview now, run a dir call on the word file (the name of the file data type), and a help on any of the names that come back:

  >>> dir(file)
  ['__class__', '__delattr__', '__doc__', '__enter__', '__exit__',
  '__getattribute__', '__hash__', '__init__', '__iter__', '__new__',
  '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__',
  'close', 'closed', 'encoding', 'fileno', 'flush', 'isatty', 'mode',
  'name', 'newlines', 'next', 'read', 'readinto', 'readline', 'readlines',
  'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines', 'xreadlines']

  >>> help(file.seek)
  …try it and see…
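The sized read, readline, and seek methods mentioned above can be tried out with a short sketch. It is written for Python 3, where files opened in text mode yield text strings; the temporary directory and the use of the with statement to close files automatically are choices made here for a self-contained example, not something the session above relies on.

```python
import os
import tempfile

# Write a small file into a temporary directory (the path is illustrative)
path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with open(path, 'w') as f:        # with closes the file on exit
    f.write('Hello\n')
    f.write('world\n')

with open(path) as f:
    first_two = f.read(2)         # Read at most 2 characters
    rest_of_line = f.readline()   # Read through the next end-of-line
    f.seek(0)                     # Move back to the start of the file
    all_lines = f.readlines()     # Read every line into a list

print(repr(first_two), repr(rest_of_line), all_lines)
```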

Other File-Like Tools

The open function is the workhorse for most file processing you will do in Python. For more advanced tasks, though, Python comes with additional file-like tools: pipes, fifos, sockets, keyed-access files, object persistence, descriptor-based files, relational and object-oriented database interfaces, and more. Descriptor files, for instance, support file locking and other low-level tools, and sockets provide an interface for networking and interprocess communication. We won’t cover many of these topics in this book, but you’ll find them useful once you start programming Python in earnest.

Other Core Types

Beyond the core types we’ve seen so far, there are others that may or may not qualify for membership, depending on how broadly the category is defined. Sets, for example, are a recent addition to the language. Sets are containers of other objects created by calling the built-in set function, and they support the usual mathematical set operations:

  >>> X = set('spam')
  >>> Y = set(['h', 'a', 'm'])    # Make 2 sets out of sequences
  >>> X, Y
  (set(['a', 'p', 's', 'm']), set(['a', 'h', 'm']))

  >>> X & Y                       # Intersection
  set(['a', 'm'])

  >>> X | Y                       # Union
  set(['a', 'p', 's', 'h', 'm'])

  >>> X - Y                       # Difference
  set(['p', 's'])

In addition, Python recently added decimal numbers (fixed-precision floating-point numbers) and Booleans (with predefined True and False objects that are essentially just the integers 1 and 0 with custom display logic), and it has long supported a special placeholder object called None:

  >>> import decimal              # Decimals
  >>> d = decimal.Decimal('3.141')
  >>> d + 1
  Decimal("4.141")

  >>> 1 > 2, 1 < 2                # Booleans
  (False, True)
  >>> bool('spam')
  True

  >>> X = None                    # None placeholder
  >>> print X
  None
  >>> L = [None] * 100            # Initialize a list of 100 Nones
  >>> L
  [None, None, None, None, None, None, None, None, None, None, None, None, None,
  ...a list of 100 Nones...]

  >>> type(L)                     # Types
  <type 'list'>
  >>> type(type(L))               # Even types are objects
  <type 'type'>
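To make the descriptions of decimals, Booleans, and None a bit more concrete, here is a small sketch written for Python 3. The specific values (0.1, 0.2, 0.3) are chosen here only because they expose the difference between binary floating point and decimal arithmetic.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly; Decimal can
float_sum = 0.1 + 0.2
decimal_sum = Decimal('0.1') + Decimal('0.2')
print(float_sum == 0.3, decimal_sum == Decimal('0.3'))

# Booleans behave like the integers 1 and 0
two = True + True
print(True == 1, False == 0, two, isinstance(True, int))

# None is a single object, so an identity test is the usual check
X = None
print(X is None)
```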

How to Break Your Code’s Flexibility

I’ll have more to say about all these types later in the book, but the last merits a few more words here. The type object allows code to check the types of the objects it uses. In fact, there are at least three ways to do so in a Python script:

  >>> if type(L) == type([]):     # Type testing, if you must…
          print 'yes'

  yes
  >>> if type(L) == list:         # Using the type name
          print 'yes'

  yes
  >>> if isinstance(L, list):     # Object-oriented tests
          print 'yes'

  yes

Now that I’ve shown you all these ways to do type testing, however, I must mention that, as you’ll see later in the book, doing so is almost always the wrong thing to do in a Python program (and often a sign of an ex-C programmer first starting to use Python). By checking for specific types in your code, you effectively break its flexibility—you limit it to working on just one type. Without such tests, your code may be able to work on a whole range of types.

This is related to the idea of polymorphism mentioned earlier, and it stems from Python’s lack of type declarations. As you’ll learn, in Python, we code to object interfaces (operations supported), not to types. Not caring about specific types means that code is automatically applicable to many of them—any object with a compatible interface will work, regardless of its specific type. Although type checking is supported—and even required, in some rare cases—you’ll see that it’s not usually the “Pythonic” way of thinking. In fact, you’ll find that polymorphism is probably the key idea behind using Python well.
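A short sketch may help show what coding to interfaces looks like in practice. The two helper functions below are hypothetical and written for Python 3; neither checks the type of its argument, so each works on any object that supports the operation it uses.

```python
def double(x):
    """Hypothetical helper: works on any object that supports +."""
    return x + x

def last_item(seq):
    """Hypothetical helper: works on any object that supports indexing."""
    return seq[-1]

print(double(21), double('ab'), double([1, 2]))
print(last_item('spam'), last_item([1, 2, 3]), last_item((4, 5, 6)))
```

Adding a test such as type(x) == list to either function would make it reject strings and tuples for no benefit.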

User-Defined Classes

We’ll study object-oriented programming in Python—an optional but powerful feature of the language that cuts development time by supporting programming by customization—in depth later in this book. In abstract terms, though, classes define new types of objects that extend the core set, so they merit a passing glance here. Say, for example, that you wish to have a type of object that models employees. Although there is no such specific core type in Python, the following user-defined class might fit the bill:

  >>> class Worker:
          def __init__(self, name, pay):          # Initialize when created
              self.name = name                    # Self is the new object
              self.pay = pay
          def lastName(self):
              return self.name.split()[-1]        # Split string on blanks
          def giveRaise(self, percent):
              self.pay *= (1.0 + percent)         # Update pay in-place

This class defines a new kind of object that will have name and pay attributes (sometimes called state information), as well as two bits of behavior coded as functions (normally called methods). Calling the class like a function generates instances of our new type, and the class’ methods automatically receive the instance being processed by a given method call (in the self argument):

  >>> bob = Worker('Bob Smith', 50000)    # Make two instances
  >>> sue = Worker('Sue Jones', 60000)    # Each has name and pay
  >>> bob.lastName()                      # Call method: bob is self
  'Smith'
  >>> sue.lastName()                      # Sue is the self subject
  'Jones'
  >>> sue.giveRaise(.10)                  # Updates sue's pay
  >>> sue.pay
  66000.0

The implied “self” object is why we call this an object-oriented model: there is always an implied subject in functions within a class. In a sense, though, the class-based type simply builds on and uses core types; a user-defined Worker object here, for example, is just a collection of a string and number (name and pay, respectively), plus functions for processing those two built-in objects.

The larger story of classes is that their inheritance mechanism supports software hierarchies that lend themselves to customization by extension. We extend software by writing new classes, not by changing what already works. You should also know that classes are an optional feature of Python, and simpler built-in types such as lists and dictionaries are often better tools than user-coded classes. This is all well beyond the bounds of our introductory object-type tutorial, though, so for that tale, you’ll have to read on to a later chapter.
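As a rough preview of customization by extension, the sketch below repeats the Worker class and adds a subclass. The Manager class, its bonus parameter, and the sample name are all hypothetical and chosen only for illustration; the code is written for Python 3.

```python
class Worker:
    def __init__(self, name, pay):
        self.name = name
        self.pay = pay
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

class Manager(Worker):                          # Hypothetical subclass
    def giveRaise(self, percent, bonus=0.05):
        # Extend the inherited method instead of rewriting its logic
        Worker.giveRaise(self, percent + bonus)

tom = Manager('Tom Doe', 50000)
tom.giveRaise(.10)                              # Raise of 10% plus 5% bonus
print(tom.lastName(), round(tom.pay, 2))
```

Worker itself is left unchanged: Manager inherits lastName as is and replaces only giveRaise.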

And Everything Else

As mentioned earlier, everything you can process in a Python script is a type of object, so our object type tour is necessarily incomplete. However, even though everything in Python is an “object,” only those types of objects we’ve met so far are considered part of Python’s core type set. Other object types in Python are usually implemented by module functions, not language syntax. They also tend to have application-specific roles—text patterns, database interfaces, network connections, and so on.

Moreover, keep in mind that the objects we’ve met here are objects, but not necessarily object-oriented—a concept that usually requires inheritance and the Python class statement, which we’ll meet again later in this book. Still, Python’s core objects are the workhorses of almost every Python script you’re likely to meet, and they usually are the basis of larger noncore types.

Chapter Summary

And that’s a wrap for our concise data type tour. This chapter has offered a brief introduction to Python’s core object types, and the sorts of operations we can apply to them. We’ve studied generic operations that work on many object types (sequence operations such as indexing and slicing, for example), as well as type-specific operations available as method calls (for instance, string splits and list appends). We’ve also defined some key terms along the way, such as immutability, sequences, and polymorphism.

Along the way, we’ve seen that Python’s core object types are more flexible and powerful than what is available in lower-level languages such as C. For instance, lists and dictionaries obviate most of the work you do to support collections and searching in lower-level languages. Lists are ordered collections of other objects, and dictionaries are collections of other objects that are indexed by key instead of by position. Both dictionaries and lists may be nested, can grow and shrink on demand, and may contain objects of any type. Moreover, their space is automatically cleaned up as you go.

I’ve skipped most of the details here in order to provide a quick tour, so you shouldn’t expect all of this chapter to have made sense yet. In the next few chapters, we’ll start to dig deeper, filling in details of Python’s core object types that were omitted here so you can gain a more complete understanding. We’ll start off in the next chapter with an in-depth look at Python numbers. First, though, another quiz to review.

BRAIN BUILDER

Chapter Quiz

We’ll explore the concepts introduced in this chapter in more detail in upcoming chapters, so we’ll just cover the big ideas here:

  1. Name four of Python’s core data types.
  2. Why are they called “core” data types?
  3. What does “immutable” mean, and which three of Python’s core types are considered immutable?
  4. What does “sequence” mean, and which three types fall into that category?
  5. What does “mapping” mean, and which core type is a mapping?
  6. What is “polymorphism,” and why should you care?

Quiz Answers

  1. Numbers, strings, lists, dictionaries, tuples, and files are generally considered to be the core object (data) types. Sets, types, None, and Booleans are sometimes classified this way as well. There are multiple number types (integer, long, floating point, and decimal) and two string types (normal and Unicode).

  2. They are known as “core” types because they are part of the Python language itself, and are always available; to create other objects, you generally must call functions in imported modules. Most of the core types have specific syntax for generating the objects: 'spam', for example, is an expression that makes a string and determines the set of operations that can be applied to it. Because of this, core types are hardwired into Python’s syntax. In contrast, you must call the built-in open function to create a file object.

  3. An “immutable” object is an object that cannot be changed after it is created. Numbers, strings, and tuples in Python fall into this category. While you cannot change an immutable object in-place, you can always make a new one by running an expression.
  4. A “sequence” is a positionally ordered collection of objects. Strings, lists, and tuples are all sequences in Python. They share common sequence operations, such as indexing, concatenation, and slicing, but also have type-specific method calls.
  5. The term “mapping” denotes an object that maps keys to associated values. Python’s dictionary is the only mapping type in the core type set. Mappings do not maintain any left-to-right positional ordering; they support access to data stored by key, plus type-specific method calls.
  6. “Polymorphism” means that the meaning of an operation (like a + ) depends on the objects being operated on. This turns out to be a key idea (perhaps the key idea) behind using Python well—not constraining code to specific types makes that code automatically applicable to many types.
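The point in answer 3, that an immutable object cannot be changed in place but a new one can always be built with an expression, can be sketched as follows. The sample values are arbitrary, and the code is written for Python 3.

```python
S = 'spam'
T = (1, 2, 3)

S2 = 'z' + S[1:]        # Builds a new string; S itself is unchanged
T2 = (9,) + T[1:]       # Builds a new tuple; T itself is unchanged

print(S, S2, T, T2)
```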

* In this book, the term literal simply means an expression whose syntax generates an object—sometimes also called a constant. Note that the term “constant” does not imply objects or variables that can never be changed (i.e., this term is unrelated to C++’s const or Python’s “immutable”—a topic explored later in this chapter).

* This matrix structure works for small-scale tasks, but for more serious number crunching, you will probably want to use one of the numeric extensions to Python, such as the open source NumPy system. Such tools can store and process large matrixes much more efficiently than our nested list structure. NumPy has been said to turn Python into the equivalent of a free and more powerful version of the MatLab system, and organizations such as NASA, Los Alamos, and JPMorgan Chase use this tool for scientific and financial tasks. Search the Web for more details.

* One footnote here: keep in mind that the rec record we just created really could be a database record, when we employ Python’s object persistence system, an easy way to store native Python objects in files or access-by-key databases. We won’t go into the details here; see Python’s pickle and shelve modules for more.
