Python
  Home arrow Python arrow Page 4 - Working with XML Documents and Python
Dev Shed Forums 
Administration  
AJAX  
Apache  
BrainDump  
DHTML  
Flash  
Java  
JavaScript  
Multimedia  
MySQL  
Oracle  
Perl  
PHP  
Practices  
Python  
Reviews  
Security  
Style-Sheets  
Web Services  
XML  
Zend  
Zope  
Forums Sitemap 
IBM® developerWorks 
Sun Developer Network 
Dedicated Servers 
E-Commerce Hosting 
Linux Web Hosting 
Managed Hosting 
Small Business Hosting 
Moblin 
JMSL Numerical Library 
VPS Hosting 
Weekly Newsletter

 
Developer Updates  
Free Website Content 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us Get Paid 
Request Media Kit
Contact Us 
Site Map 
Privacy Policy 
Support 
 USERNAME
 
 PASSWORD
 
 
  >>> SIGN UP!  
  Lost Password? 
PYTHON

Working with XML Documents and Python
By: Peyton McCullough
  • Search For More Articles!
  • Disclaimer
  • Author Terms
  • Rating: 4 stars4 stars4 stars4 stars4 stars / 13
    2005-11-17

    Table of Contents:
  • Working with XML Documents and Python
  • Organizing a Book Collection
  • Describing a Music Library
  • The Document Object Model

  • Rate this Article: Poor Best 
      ADD THIS ARTICLE TO:
      Del.ici.ous Digg
      Blink Simpy
      Google Spurl
      Y! MyWeb Furl
    Email Me Similar Content When Posted
    Add Developer Shed Article Feed To Your Site
    Email Article To Friend
    Print Version Of Article
    PDF Version Of Article
     
     
    ADVERTISEMENT


    Working with XML Documents and Python - The Document Object Model


    (Page 4 of 4 )

    SAX is not the only way to process XML data in Python. The Document Object Model exists, and it gives us an object-oriented interface to XML data. Python contains a library called minidom that provides a simple interface to DOM without a whole lot of bells and whistles. Let's recreate our music library parser, making use of DOM this time:

    import xml.dom.minidom

    # Load the music library
    library = xml.dom.minidom.parse ( 'music.xml' )

    # Get a list of the tracks
    tracks = library.documentElement.getElementsByTagName ( 'track' )

    # Go through each track
    for track in tracks:

       # Print each track's information
       print
       print 'Track:  ' + track.childNodes [ 0 ].nodeValue
       print 'Artist: ' + track.attributes [ 'artist' ].nodeValue
       print 'Album:  ' + track.attributes [ 'album' ].nodeValue
       print 'Length: ' + track.attributes [ 'time' ].nodeValue

    We end up with a very short script in the above example. We start by pointing Python to the file we wish to parse. Then we get all tags by the name of “track” that belong to the main tag. We loop through the list provided, printing out the information contained within. To access the text inside the element, we access the track's list of childNodes. The text is stored in a node, and we print out the value of it. The attributes are stored in attributes, and we reference them by name, printing out nodeValue.

    Again, though, what if we don't know everything we are going to parse? Let's recreate the second example of the previous section using DOM:

    import xml.dom.minidom

    # Load the XML file
    library = xml.dom.minidom.parse ( 'music.xml' )

    # Get a list of tracks
    tracks = library.documentElement.getElementsByTagName ( 'track' )

    # Loop through the tracks
    for track in tracks:

       # Print the track name
       print
       print 'Track: ' + track.childNodes [ 0 ].nodeValue

       # Loop through the attributes
       for attribute in track.attributes.keys():
          print attribute [ 0 ].upper() + attribute [ 1: ] + ': ' +
    track.attributes [ attribute ].nodeValue

    We loop through the names of the attributes returned in the keys method, printing the name of the attribute out with the first letter capitalized. We then print the value of the attribute out by referencing the attribute by its key and then accessing nodeValue.

    Let's say we have tags nested in each other. Consider our very first script that parsed a book collection. Let's rebuild it with DOM:

    import xml.dom.minidom

    # Load the book collection
    collection = xml.dom.minidom.parse ( 'collection.xml' )

    # Get a list of books
    books = collection.documentElement.getElementsByTagName
    ( 'book' )

    # Loop through the books
    for book in books:

       # Print out the book's information
       print
       print 'Title:  ' + book.getElementsByTagName ( 'title' )
    [ 0 ].childNodes [ 0 ].nodeValue
       print 'Author: ' + book.getElementsByTagName ( 'author' )
    [ 0 ].childNodes [ 0 ].nodeValue
       print 'Genre:  ' + book.getElementsByTagName ( 'genre' )
    [ 0 ].childNodes [ 0 ].nodeValue

    We load the collection file and then get a list of books. Then, we loop through the list of books and print out what's wrapped inside of each tag, which is in the form of a child node that we must get the value of. It's all very simple.

    DOM includes many more features, but all you need to simply read a document is contained within the minidom module.

    Conclusion

    XML is a useful tool for describing data. It can be used to describe just about anything –- from book collections and music libraries to user settings for an application. Python contains a few utilities that can be used to read and process XML data, namely SAX and DOM. SAX allows you to create handler classes that can process individual items within an XML document, and DOM abstracts the entire document, allowing you to easily navigate through a tree of XML data. Both tools are extremely simple to use and contribute to the phrase “batteries included” that is used to describe the Python language.


    DISCLAIMER: The content provided in this article is not warranted or guaranteed by Developer Shed, Inc. The content provided is intended for entertainment and/or educational purposes in order to introduce to the reader key ideas, concepts, and/or product reviews. As such it is incumbent upon the reader to employ real-world tactics for security and implementation of best practices. We are not liable for any negative consequences that may result from implementing any information covered in our articles or tutorials. If this is a hardware review, it is not recommended to open and/or modify your hardware.

       · Hopefully this XML and Python article can help those of you in need of some coding...
     

       

    PYTHON ARTICLES

    - SSH with Twisted
    - Mobile Programming in Python using PyS60: UI...
    - Python: Count on It
    - Python Strings: Spinning Yarns
    - Python: More Fun with Strings
    - Python: Stringing You Along
    - Python Operators
    - Bluetooth Programming in Python: Network Pro...
    - Python Sets
    - Python Conditionals, Lists, Dictionaries, an...
    - Python: Input and Variables
    - Introduction to Python Programming
    - Mobile Programming in Python using PyS60: Ge...
    - Bluetooth Programming using Python
    - Finishing the PyMailGUI Client: User Help To...





    © 2003-2008 by Developer Shed. All rights reserved. DS Cluster 5 hosted by Hostway