Home arrow Python arrow Page 3 - Data Types in Python

Strings - Python

In this second of nine parts focusing on a quick overview of the Python language for experienced programmers, you'll learn how Python handles data types such as strings, and more. This article is excerpted from chapter four of Python in a Nutshell, Second Edition, written by Alex Martelli (O'Reilly; ISBN: 0596100469). Copyright 2007 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

  1. Data Types in Python
  2. Complex numbers
  3. Strings
  4. Tuples
By: O'Reilly Media
Rating: starstarstarstarstar / 4
September 18, 2008

print this article



A built-in string object (plain or Unicode) is a sequence of characters used to store and represent text-based information (plain strings are also sometimes used to store and represent arbitrary sequences of binary bytes). Strings in Python are immutable, meaning that when you perform an operation on strings, you always produce a new string object, rather than mutating an existing string. String objects provide many methods, as discussed in detail in "Methods of String Objects" on page 186.

A string literal can be quoted or triple-quoted. A quoted string is a sequence of zero or more characters enclosed in matching quotes, single (') or double ("). For example:

  'This is a literal string'
  "This is another string"

The two different kinds of quotes function identically; having both allows you to include one kind of quote inside of a string specified with the other kind without needing to escape them with the backslash character (\):

  'I\'m a Python fanatic'      # a quote can be escaped
  "I'm a Python fanatic"       # this way is more readable

All other things being equal, using single quotes to denote string literals is a more common Python style. To have a string literal span multiple physical lines, you can use a backslash as the last character of a line to indicate that the next line is a continuation:

  "A not very long string\
  that spans two lines"        # comment not allowed on previous line

To make the string output on two lines, you can embed a newline in the string:

  "A not very long string\n\
  that prints on two lines"    # comment not allowed on previous line

A better approach is to use a triple-quoted string, which is enclosed by matching triplets of quote characters (''' or """):

  """An even bigger
  string that spans
  three lines"""               # comments not allowed on previous lines

In a triple-quoted string literal, line breaks in the literal are preserved as newline characters in the resulting string object.

The only character that cannot be part of a triple-quoted string is an unescaped backslash, while a quoted string cannot contain unescaped backslashes, nor line ends, nor the quote character that encloses it. The backslash character starts an escape sequence, which lets you introduce any character in either kind of string. Python's string escape sequences are listed in Table 4-1.

Table 4-1. String escape sequences

Sequence Meaning ASCII/ISO code
\<newline> End of line is ignored None
\\ Backslash 0x5c
\' Single quote 0x27
" Double quote 0x22
\a Bell 0x07
\b Backspace 0x08
\f Form feed 0x0c
\n Newline 0x0a
\r Carriage return 0x0d
\t Tab 0x09
\v Vertical tab 0x0b
\DDD Octal value DDD As given
\xXX Hexadecimal value XX As given
\other Any other character 0x5c + as given

A variant of a string literal is a raw string. The syntax is the same as for quoted or triple-quoted string literals, except that an r or R immediately precedes the leading quote. In raw strings, escape sequences are not interpreted as in Table 4-1, but are literally copied into the string, including backslashes and newline characters. Raw string syntax is handy for strings that include many backslashes, as in regular expressions (see "Pattern-String Syntax" on page 201). A raw string cannot end with an odd number of backslashes; the last one would be taken as escaping the terminating quote.

Unicode string literals have the same syntax as other string literals, with a u or U immediately before the leading quote. Unicode string literals can use \u followed by four hex digits to denote Unicode characters and can include the escape sequences listed in Table 4-1. Unicode literals can also include the escape sequence \N{name}, where name  is a standard Unicode name, as listed at http:// www.unicode.org/charts/. For example, \N{Copyright Sign} indicates a Unicode copyright sign character (). Raw Unicode string literals start with ur, not ru. Note that raw strings are not a different type from ordinary strings: raw strings are just an alternative syntax for literals of the usual two string types, plain (a.k.a. byte strings) and Unicode.

Multiple string literals of any kind (quoted, triple-quoted, raw, Unicode) can be adjacent, with optional whitespace in between. The compiler concatenates such adjacent string literals into a single string object. If any literal in the concatenation is Unicode, the whole result is Unicode. Writing a long string literal in this way lets you present it readably across multiple physical lines and gives you an opportunity to insert comments about parts of the string. For example:

  marypop = ('supercalifragilistic'   # Open paren -> logical line continues
            'expialidocious')         # Indentation ignored in continuation

The string assigned to marypop is a single word of 34 characters.

>>> More Python Articles          >>> More By O'Reilly Media

blog comments powered by Disqus
escort Bursa Bursa escort Antalya eskort


- Python Big Data Company Gets DARPA Funding
- Python 32 Now Available
- Final Alpha for Python 3.2 is Released
- Python 3.1: String Formatting
- Python 3.1: Strings and Quotes
- Python 3.1: Programming Basics and Strings
- Tuples and Other Python Object Types
- The Dictionary Python Object Type
- String and List Python Object Types
- Introducing Python Object Types
- Mobile Programming using PyS60: Advanced UI ...
- Nested Functions in Python
- Python Parameters, Functions and Arguments
- Python Statements and Functions
- Statements and Iterators in Python

Developer Shed Affiliates


Dev Shed Tutorial Topics: