
Other Ways to Code Strings - Python

Last week, we introduced you to the different Python object types, starting with numbers. This week, we'll cover strings and begin our discussion of lists. This article, the second in a four-part series, is excerpted from chapter four of the book Learning Python, Third Edition, written by Mark Lutz (O'Reilly, 2008; ISBN: 0596513984). Copyright © 2008 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

TABLE OF CONTENTS:
  1. String and List Python Object Types
  2. Immutability
  3. Other Ways to Code Strings
  4. Lists
By: O'Reilly Media
January 22, 2009


So far, we’ve looked at the string object’s sequence operations and type-specific methods. Python also provides a variety of ways to code strings, which we’ll explore further later. Special characters, for instance, can be represented as backslash escape sequences:

  >>> S = 'A\nB\tC'     # \n is end-of-line, \t is tab
  >>> len(S)            # Each stands for just one character
  5

  >>> ord('\n')         # \n is a byte with the binary value 10 in ASCII
  10

  >>> S = 'A\0B\0C'     # \0, the binary zero byte, does not terminate the string
  >>> len(S)
  5
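As a small sketch of the same idea (the variable names are only illustrative), the character codes and the `repr` form of these strings make it visible that each escape sequence is stored as a single character:

```python
S = 'A\nB\tC'
shown = repr(S)                # the form the interactive prompt echoes: escapes stay visible
codes = [ord(c) for c in S]    # one code per character: [65, 10, 66, 9, 67]

Z = 'A\0B\0C'
z_shown = repr(Z)              # zero bytes are displayed as \x00 escapes

print(shown)
print(codes)
print(z_shown)
```

Printing `S` itself, rather than its `repr`, would instead render the newline and tab as actual whitespace.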

Python allows strings to be enclosed in single or double quote characters (they mean the same thing). It also has a multiline string literal form enclosed in triple quotes (single or double)—when this form is used, all the lines are concatenated together, and end-of-line characters are added where line breaks appear. This is a minor syntactic convenience, but it’s useful for embedding things like HTML and XML code in a Python script:

  >>> msg = """
  aaaaaaaaaaaaa
  bbb'''bbbbbbbbbb""bbbbbbb'bbbb
  cccccccccccccc"""
  >>> msg
  '\naaaaaaaaaaaaa\nbbb\'\'\'bbbbbbbbbb""bbbbbbb\'bbbb\ncccccccccccccc'
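To illustrate the HTML use case mentioned above, here is a minimal sketch. The markup is a made-up snippet, not taken from the book; it only shows that quotes of both kinds and line breaks can sit inside a triple-quoted block without escaping:

```python
# Hypothetical HTML fragment embedded as-is in a triple-quoted string
page = """<html>
<body>
<p class="note">It's a "quoted" paragraph</p>
</body>
</html>"""

lines = page.split('\n')       # each line break in the literal became one \n character
newline_count = page.count('\n')

print(len(lines))
print(lines[2])
```

Because the opening quotes are followed directly by text here, the string does not start with a newline, unlike `msg` in the session above.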

 

Python also supports a “raw” string literal that turns off the backslash escape mechanism (raw literals start with the letter r), as well as a Unicode string form that supports internationalization (Unicode literals begin with the letter u and can contain multibyte characters). Technically, a Unicode string is a different data type from a normal string, but it supports all the same string operations. We’ll meet all these special string forms in later chapters.
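A brief sketch of the raw form, using a made-up Windows-style path. Note one assumption: the code below is written for Python 3, where every `str` is already Unicode and the `u` prefix is accepted but optional, so the "different data type" distinction described above applies to Python 2 only:

```python
# The path is a hypothetical example, chosen because \n and \t look like escapes
normal = 'C:\new\text.dat'     # \n and \t are interpreted as newline and tab
raw = r'C:\new\text.dat'       # raw literal: backslashes are kept as ordinary characters

normal_len = len(normal)       # 13: each escape collapses to one character
raw_len = len(raw)             # 15: every backslash counts as a character

# Python 3 assumption: the u prefix produces the same str type as a plain literal
uni = u'sp\xe4m'
same_type = type(uni) is type('spam')

print(normal_len, raw_len, same_type)
```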

Pattern Matching

One point worth noting before we move on is that none of the string object’s methods support pattern-based text processing. Text pattern matching is an advanced tool outside this book’s scope, but readers with backgrounds in other scripting languages may be interested to know that to do pattern matching in Python, we import a module called re. This module has analogous calls for searching, splitting, and replacement, but because we can use patterns to specify substrings, we can be much more general:

  >>> import re
  >>> match = re.match('Hello[ \t]*(.*)world', 'Hello  Python world')
  >>> match.group(1)
  'Python '

This example searches for a substring that begins with the word “Hello,” followed by zero or more tabs or spaces, followed by arbitrary characters to be saved as a matched group, terminated by the word “world.” If such a substring is found, portions of the substring matched by parts of the pattern enclosed in parentheses are available as groups. The following pattern, for example, picks out three groups separated by slashes:

  >>> match = re.match('/(.*)/(.*)/(.*)', '/usr/home/lumberjack')
  >>> match.groups()
  ('usr', 'home', 'lumberjack')
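The searching, splitting, and replacement calls mentioned earlier can be sketched as follows. The patterns and variable names are illustrative choices, not examples from the book:

```python
import re

text = 'Hello  Python world'

# search: find the pattern anywhere in the string, not only at the start
found = re.search('P[a-z]+', text).group()     # 'Python'

# split: break the string on runs of spaces or tabs
parts = re.split('[ \t]+', text)               # ['Hello', 'Python', 'world']

# sub: replace each run of spaces or tabs with a single dash
joined = re.sub('[ \t]+', '-', text)           # 'Hello-Python-world'

# match only tries at the beginning of the string and returns None on failure
no_match = re.match('world', text)

print(found, parts, joined, no_match)
```

Since `re.match` and `re.search` return `None` when nothing matches, code that calls `.group()` on the result usually checks for `None` first.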

Pattern matching is a fairly advanced text-processing tool by itself, but there is also support in Python for even more advanced language processing, including natural language processing. I’ve already said enough about strings for this tutorial, though, so let’s move on to the next type.



 
 