Working with XML Documents and Python
By: Peyton McCullough
    2005-11-17


    Table of Contents:
  • Working with XML Documents and Python
  • Organizing a Book Collection
  • Describing a Music Library
  • The Document Object Model

    Working with XML Documents and Python - The Document Object Model
    ( Page 4 of 4 )

    SAX is not the only way to process XML data in Python. The Document Object Model (DOM) offers an object-oriented interface to XML data: the whole document is loaded into a tree of node objects that we can navigate. Python's standard library includes xml.dom.minidom, a lightweight DOM implementation without a whole lot of bells and whistles. Let's recreate our music library parser, making use of DOM this time:

    import xml.dom.minidom

    # Load the music library
    library = xml.dom.minidom.parse('music.xml')

    # Get a list of the tracks
    tracks = library.documentElement.getElementsByTagName('track')

    # Go through each track
    for track in tracks:

        # Print each track's information, preceded by a blank line
        print()
        print('Track:  ' + track.childNodes[0].nodeValue)
        print('Artist: ' + track.attributes['artist'].nodeValue)
        print('Album:  ' + track.attributes['album'].nodeValue)
        print('Length: ' + track.attributes['time'].nodeValue)

    We end up with a very short script. We start by pointing minidom at the file we wish to parse. Then we ask the document's root element for every element named “track” beneath it. We loop through the list returned, printing the information each track contains. The text inside an element is stored in a child text node, so we take the first entry in the track's childNodes list and print its nodeValue. The attributes are stored in attributes, and we reference them by name, again printing nodeValue.
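
    Since music.xml itself is not reproduced on this page, the sketch below assumes a layout consistent with the script above: a root element holding track elements whose text is the title and whose artist, album, and time attributes carry the details. The sample data and the describe_tracks name are illustrative only. It uses parseString, so it runs without a file on disk:

```python
import xml.dom.minidom

# Hypothetical sample data; the real music.xml may be laid out differently.
SAMPLE = (
    '<library>'
    '<track artist="Artist A" album="Album A" time="3:41">Song One</track>'
    '<track artist="Artist B" album="Album B" time="4:05">Song Two</track>'
    '</library>'
)


def describe_tracks(xml_text):
    """Return one dict per <track> element found in xml_text."""
    document = xml.dom.minidom.parseString(xml_text)
    tracks = document.documentElement.getElementsByTagName('track')
    results = []
    for track in tracks:
        results.append({
            # The first child of <track> is the text node holding the title.
            'track': track.childNodes[0].nodeValue,
            # getAttribute returns the attribute's value as a string.
            'artist': track.getAttribute('artist'),
            'album': track.getAttribute('album'),
            'length': track.getAttribute('time'),
        })
    return results


if __name__ == '__main__':
    for info in describe_tracks(SAMPLE):
        print(info)
```

    Here getAttribute stands in for the attributes lookup by name. Both read the same value, but getAttribute returns an empty string for a missing attribute instead of raising KeyError.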

    Again, though, what if we don't know in advance which attributes each track carries? Let's recreate the second example of the previous section using DOM:

    import xml.dom.minidom

    # Load the XML file
    library = xml.dom.minidom.parse('music.xml')

    # Get a list of tracks
    tracks = library.documentElement.getElementsByTagName('track')

    # Loop through the tracks
    for track in tracks:

        # Print the track name, preceded by a blank line
        print()
        print('Track: ' + track.childNodes[0].nodeValue)

        # Loop through the attribute names
        for attribute in track.attributes.keys():
            # Capitalize the first letter of the name, then add the value
            label = attribute[0].upper() + attribute[1:]
            print(label + ': ' + track.attributes[attribute].nodeValue)

    We loop through the attribute names returned by the keys method, printing each name with its first letter capitalized. We then print the attribute's value by looking the attribute up by its name and reading its nodeValue.
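
    The attributes object also provides an items method that returns (name, value) pairs, which avoids the second lookup. The following is a minimal sketch of that variation; the sample XML, its year attribute, and the format_attributes name are assumptions made for illustration:

```python
import xml.dom.minidom

# Hypothetical sample; the "year" attribute is invented for this example.
SAMPLE = '<library><track artist="Artist A" year="1999">Song One</track></library>'


def format_attributes(element):
    """Return sorted 'Name: value' lines for every attribute of element."""
    lines = []
    # items() yields (name, value) pairs, so no second lookup is needed.
    for name, value in element.attributes.items():
        # Upper-case only the first letter, leaving the rest untouched.
        lines.append(name[:1].upper() + name[1:] + ': ' + value)
    # Sort so the result does not depend on attribute storage order.
    return sorted(lines)


if __name__ == '__main__':
    document = xml.dom.minidom.parseString(SAMPLE)
    track = document.getElementsByTagName('track')[0]
    print('Track: ' + track.childNodes[0].nodeValue)
    for line in format_attributes(track):
        print(line)
```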

    Now suppose we have tags nested inside one another. Consider our very first script, which parsed a book collection. Let's rebuild it with DOM:

    import xml.dom.minidom

    # Load the book collection
    collection = xml.dom.minidom.parse('collection.xml')

    # Get a list of books
    books = collection.documentElement.getElementsByTagName('book')

    # Loop through the books
    for book in books:

        # Take the first matching child tag and read its text node
        title = book.getElementsByTagName('title')[0].childNodes[0].nodeValue
        author = book.getElementsByTagName('author')[0].childNodes[0].nodeValue
        genre = book.getElementsByTagName('genre')[0].childNodes[0].nodeValue

        # Print out the book's information, preceded by a blank line
        print()
        print('Title:  ' + title)
        print('Author: ' + author)
        print('Genre:  ' + genre)

    We load the collection file and get a list of books. Then we loop through the books and, for each one, look up its title, author, and genre tags. getElementsByTagName always returns a list, so we take its first item; the text wrapped inside that tag is a child node whose value we print. It's all very simple.
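
    One caveat: indexing with [ 0 ] raises IndexError when a tag is missing or empty. The sketch below shows one way to guard against that. The sample collection, the child_text and read_books names, and the 'Unknown' fallback are all assumptions made for illustration, not part of the original scripts:

```python
import xml.dom.minidom

# Hypothetical sample laid out the way the script above expects,
# with a second book that has an empty author and no genre.
SAMPLE = """<collection>
  <book>
    <title>Example Title</title>
    <author>Example Author</author>
    <genre>Fiction</genre>
  </book>
  <book>
    <title>Untitled Draft</title>
    <author></author>
  </book>
</collection>"""


def child_text(parent, tag, default=''):
    """Return the text of the first <tag> under parent, or default."""
    matches = parent.getElementsByTagName(tag)
    if not matches:
        return default  # the tag is missing entirely
    # Join every text node so an empty element does not raise IndexError.
    parts = [node.data for node in matches[0].childNodes
             if node.nodeType == node.TEXT_NODE]
    return ''.join(parts).strip() or default


def read_books(xml_text):
    """Return a list of (title, author, genre) tuples."""
    document = xml.dom.minidom.parseString(xml_text)
    books = document.documentElement.getElementsByTagName('book')
    return [(child_text(book, 'title'),
             child_text(book, 'author', 'Unknown'),
             child_text(book, 'genre', 'Unknown'))
            for book in books]


if __name__ == '__main__':
    for title, author, genre in read_books(SAMPLE):
        print(title, author, genre, sep=' | ')
```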

    DOM includes many more features, but everything you need to simply read a document is contained within the minidom module.

    Conclusion

    XML is a useful tool for describing data. It can be used to describe just about anything, from book collections and music libraries to user settings for an application. Python's standard library includes tools for reading and processing XML data, namely SAX and DOM. SAX lets you create handler classes that process individual items as the parser encounters them in an XML document, while DOM loads the entire document into a tree that you can easily navigate. Both tools are simple to use and contribute to the phrase “batteries included” that is used to describe the Python language.
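
    To make the contrast concrete, here is a minimal sketch that pulls the same information out of one document with each tool. The sample data and the ArtistHandler, artists_with_sax, and artists_with_dom names are hypothetical:

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample data shared by both approaches.
SAMPLE = (b'<library>'
          b'<track artist="A">One</track>'
          b'<track artist="B">Two</track>'
          b'</library>')


class ArtistHandler(xml.sax.ContentHandler):
    """SAX: react to each <track> start tag as the parser streams past it."""

    def __init__(self):
        super().__init__()
        self.artists = []

    def startElement(self, name, attrs):
        if name == 'track':
            self.artists.append(attrs.getValue('artist'))


def artists_with_sax(data):
    """Collect artist names through handler callbacks."""
    handler = ArtistHandler()
    xml.sax.parseString(data, handler)
    return handler.artists


def artists_with_dom(data):
    """DOM: build the whole tree first, then query it."""
    document = xml.dom.minidom.parseString(data)
    return [track.getAttribute('artist')
            for track in document.getElementsByTagName('track')]


if __name__ == '__main__':
    print(artists_with_sax(SAMPLE))
    print(artists_with_dom(SAMPLE))
```

    The SAX version never holds the whole document in memory, which can matter for large files, while the DOM version keeps the tree around so it can be queried repeatedly.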



     
     