Working with XML Documents and Python
By: Peyton McCullough
    2005-11-17


    Table of Contents:
  • Working with XML Documents and Python
  • Organizing a Book Collection
  • Describing a Music Library
  • The Document Object Model

    Working with XML Documents and Python - Organizing a Book Collection
    ( Page 2 of 4 )

    Let's say we want to describe a book collection in XML. We don't need anything fancy: for each book, we only need to store the title, author, and genre. Let's go ahead and create the markup for a few books:

    <?xml version="1.0" encoding="UTF-8"?>
    <collection>
       <book>
          <title>The Once and Future King</title>
          <author>T.H. White</author>
          <genre>Fantasy</genre>
       </book>
       <book>
          <title>The Curse of Chalion</title>
          <author>Lois McMaster Bujold</author>
          <genre>Fantasy</genre>
       </book>
       <book>
          <title>Paladin of Souls</title>
          <author>Lois McMaster Bujold</author>
          <genre>Fantasy</genre>
       </book>
       <book>
          <title>Alas, Babylon</title>
          <author>Pat Frank</author>
          <genre>Fiction</genre>
       </book>
       <book>
          <title>Rifles for Wattie</title>
          <author>Harold Keith</author>
          <genre>Fiction</genre>
       </book>
    </collection>

    Now we're left with parsing the data and turning it into something presentable. If you examine the way the data is stored, you will notice that each book resembles a Python dictionary: a set of named fields (title, author, and genre), each with a value. A dictionary is therefore a natural type to store each book in, and a list can hold all of the books. We'll create a chunk of code that does just this. SAX, the Simple API for XML, will be used for this project. It is contained in xml.sax:

    import xml.sax

    # Create a collection list
    collection = []

    # This handles the parsing of the content
    class HandleCollection ( xml.sax.ContentHandler ):

       def __init__ ( self ):

          self.book = {}
          self.title = False
          self.author = False
          self.genre = False

       # Called at the start of an element
       def startElement ( self, name, attributes ):

          if name == 'title':
             self.title = True
          elif name == 'author':
             self.author = True
          elif name == 'genre':
             self.genre = True

       # Called at the end of an element
       def endElement ( self, name ):

          if name == 'book':
             collection.append ( self.book )
             self.book = {}
          elif name == 'title':
             self.title = False
          elif name == 'author':
             self.author = False
          elif name == 'genre':
             self.genre = False

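       # Called with the text found inside elements. A parser may deliver
       # one element's text in several chunks, so append instead of assigning.
       def characters ( self, content ):

          if self.title:
             self.book [ 'title' ] = self.book.get ( 'title', '' ) + content
          elif self.author:
             self.book [ 'author' ] = self.book.get ( 'author', '' ) + content
          elif self.genre:
             self.book [ 'genre' ] = self.book.get ( 'genre', '' ) + content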

    # Parse the collection
    parser = xml.sax.make_parser()
    parser.setContentHandler ( HandleCollection() )
    parser.parse ( 'collection.xml' )

    As you can see, there's really not much work involved. All we have to do is write the instructions that organize each book into a dictionary and put all the dictionaries into the collection list. We start by subclassing xml.sax.ContentHandler; the class we create is charged with handling the content of the document we parse. In our class's __init__ method, we define a few variables. The book dictionary houses the information for the book currently being read. The title variable is a flag that tells the characters method whether the text it receives belongs to the title element, and the author and genre variables do the same for their elements. Each flag is set to True in startElement when its element opens and set back to False in endElement when that element closes. When endElement sees the closing book tag, it appends the finished dictionary to collection and starts a fresh one for the next book. Finally, the last three lines create a parser, attach an instance of our handler to it, and parse the file.
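To make the order of these callbacks easier to see, here is a minimal sketch in Python 3 that records every call the parser makes instead of building dictionaries. The TraceHandler class and the SAMPLE document are hypothetical names introduced only for this illustration; xml.sax.parseString is used so the example does not depend on a file on disk.

```python
import xml.sax


class TraceHandler(xml.sax.ContentHandler):
    """Hypothetical handler that records each SAX callback in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attributes):
        self.events.append(('start', name))

    def endElement(self, name):
        self.events.append(('end', name))

    def characters(self, content):
        # Ignore whitespace-only text (the indentation between tags)
        if content.strip():
            self.events.append(('text', content))


# A one-book sample in the same shape as the collection above
SAMPLE = b"""<collection>
   <book>
      <title>Alas, Babylon</title>
      <author>Pat Frank</author>
      <genre>Fiction</genre>
   </book>
</collection>"""

handler = TraceHandler()
xml.sax.parseString(SAMPLE, handler)

# Print the callbacks in the order the parser issued them
for kind, value in handler.events:
    print(kind, value)
```

Each element produces a start event, then its text, then an end event, and the book end event arrives only after title, author, and genre have all closed. Note that the SAX interface allows a parser to hand over one element's text in more than one characters call, so a handler should not assume that a single call carries the whole string.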

    We are now free to present this information to the user in whichever way we see fit. For example, if we wanted to just output the book information without dressing it up too much, we could simply append some code to the above script that loops over the list of dictionaries that the script creates:

    # Print each book's fields (print is a function in Python 3)
    for book in collection:
       print ()
       print ( 'Title:  ', book [ 'title' ] )
       print ( 'Author: ', book [ 'author' ] )
       print ( 'Genre:  ', book [ 'genre' ] )
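As one example of a different presentation, the sketch below groups the books by genre before printing them. It is written for Python 3 and is only an illustration: the group_by_genre function is a hypothetical helper, and the collection list is typed in by hand here, in the shape the parsing script builds, so that the example runs on its own.

```python
from collections import defaultdict

# Sample data in the shape the parsing script builds: one dictionary per book
collection = [
    {'title': 'The Once and Future King', 'author': 'T.H. White', 'genre': 'Fantasy'},
    {'title': 'The Curse of Chalion', 'author': 'Lois McMaster Bujold', 'genre': 'Fantasy'},
    {'title': 'Alas, Babylon', 'author': 'Pat Frank', 'genre': 'Fiction'},
]


def group_by_genre(books):
    """Return a dict mapping each genre to its books, sorted by title."""
    genres = defaultdict(list)
    for book in books:
        genres[book['genre']].append(book)
    for genre_books in genres.values():
        genre_books.sort(key=lambda book: book['title'])
    return dict(genres)


# Print one heading per genre, with its books listed underneath
for genre, books in sorted(group_by_genre(collection).items()):
    print(genre)
    for book in books:
        print('   %s, by %s' % (book['title'], book['author']))
```

Because each book is a plain dictionary, the same list can be regrouped by author or filtered by any field without touching the parsing code.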



     
     
    © 2003-2009 by Developer Shed. All rights reserved.