Working with XML Documents and Python
By: Peyton McCullough
    2005-11-17

    Table of Contents:
  • Working with XML Documents and Python
  • Organizing a Book Collection
  • Describing a Music Library
  • The Document Object Model


    Working with XML Documents and Python - Describing a Music Library


    (Page 3 of 4 )

    While the above XML structure is fine for many things, compacting simple properties into attributes is often more convenient. Let's consider a music library made up of individual songs rather than whole albums. Instead of creating individual tags for each song's album, artist, name, and length, which would get tiresome to type, we can create an attribute for each of these properties and keep the song name as the text of the tag. Here's an XML file that does this:

    <?xml version="1.0" encoding="UTF-8"?>
    <library>
       <track artist='Boston' album='Greatest Hits' time='5:04'>Peace of Mind</track>
       <track artist='Dire Straits' album='Sultans Of Swing-Best of Dire Straits' time='5:50'>Sultans of Swing</track>
       <track artist='Dire Straits' album='Sultans Of Swing-Best of Dire Straits' time='4:12'>Walk of Life</track>
       <track artist='The Eagles' album='The Very Best of The Eagles' time='3:33'>Take It Easy</track>
       <track artist='Gary Allan' album='Best I Ever Had' time='4:18'>Best I Ever Had</track>
       <track artist='Goo Goo Dolls' album='Dizzy Up the Girl' time='4:50'>Iris</track>
       <track artist='Kansas' album='The Ultimate Kansas' time='3:24'>Dust in the Wind</track>
       <track artist='Toby Keith' album='Greatest Hits 2' time='3:27'>How Do You Like Me Now?!</track>
       <track artist='Toby Keith' album='Greatest Hits 2' time='3:15'>Courtesy of the Red, White and Blue</track>
       <track artist='ZZ Top' album='ZZ Top-Greatest Hits' time='4:03'>Got Me Under Pressure</track>
    </library>

    Had we given every property of a song its own tag, the file would have been considerably longer. Putting the properties in attributes keeps it short.
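To make the size difference concrete, here is a small sketch that serializes one track both ways with the standard library's xml.etree.ElementTree module and compares the lengths. The child-element layout (separate artist, album, time and title tags) is only an assumption about what the longer alternative might look like; it does not come from a file in this article.

```python
import xml.etree.ElementTree as ET

# One track from the library above
properties = {
    'artist': 'Kansas',
    'album': 'The Ultimate Kansas',
    'time': '3:24',
}
title = 'Dust in the Wind'

# Attribute style: the properties live on the <track> tag itself
compact = ET.Element('track', properties)
compact.text = title

# Element style (hypothetical layout): one child tag per property
verbose = ET.Element('track')
for name, value in properties.items():
    ET.SubElement(verbose, name).text = value
ET.SubElement(verbose, 'title').text = title

compact_xml = ET.tostring(compact, encoding='unicode')
verbose_xml = ET.tostring(verbose, encoding='unicode')

print(compact_xml)
print(verbose_xml)
print(len(compact_xml), 'characters versus', len(verbose_xml))
```

Every child element pays for an opening and a closing tag, which is where the extra length comes from.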

    Now, however, we are left with the task of parsing the data. The process is similar to what we did above, but there are some differences since we are dealing with attributes this time around. Here's how it all works:

    import xml.sax

    # Create a class to handle the contents of the XML file
    class HandleLibrary ( xml.sax.ContentHandler ):

       def __init__ ( self ):

          self.artist = ''
          self.album = ''
          self.time = ''

       # Handle the start of an element
       def startElement ( self, name, attributes ):

          # Check to see if it is a "track" element
          # If so, store the attributes
          if name == 'track':
              self.artist = attributes.getValue ( 'artist' )
              self.album = attributes.getValue ( 'album' )
              self.time = attributes.getValue ( 'time' )

       # Handle content
       def characters ( self, content ):

          # If the content isn't just whitespace, then we have the track name
          # Print out all the track info
          if content.strip():
             print()
             print ( 'Track:  ' + content )
             print ( 'Artist: ' + self.artist )
             print ( 'Album:  ' + self.album )
             print ( 'Length: ' + self.time )

    # Parse the file
    parser = xml.sax.make_parser()
    parser.setContentHandler ( HandleLibrary() )
    parser.parse ( 'music.xml' )

    The script is short and simple. It strongly resembles the previous script, but note that we do not store anything in a dictionary. Rather, we print each track as we receive it. The attributes argument of the startElement method is an object representing all the attributes of that tag. We read each attribute by name with the getValue method and save the values in instance variables, which the characters method then prints along with the track name. That's all there is to parsing attributes.
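One caveat: a SAX parser is allowed to deliver the text of an element in several chunks, so characters may be called more than once for a single track. A more defensive variation, sketched below, collects the chunks and reports the track in endElement instead. The class name BufferedLibraryHandler, its tracks list and the inline two-track sample are assumptions made for illustration, not part of the original script.

```python
import xml.sax

# Two tracks from the library above, inlined so the sketch runs on its own
SAMPLE = b"""<library>
   <track artist='Kansas' album='The Ultimate Kansas' time='3:24'>Dust in the Wind</track>
   <track artist='Dire Straits' album='Sultans Of Swing-Best of Dire Straits' time='4:12'>Walk of Life</track>
</library>
"""

class BufferedLibraryHandler(xml.sax.ContentHandler):

    def __init__(self):
        super().__init__()
        self.tracks = []      # one dict per finished <track>
        self.current = None   # attributes of the <track> being read
        self.chunks = []      # text pieces received so far

    def startElement(self, name, attributes):
        if name == 'track':
            # Copy the values out; the parser may reuse the attributes object
            self.current = dict(attributes.items())
            self.chunks = []

    def characters(self, content):
        # May fire several times for one element, so only collect here
        if self.current is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == 'track':
            # The text is complete only once the closing tag is seen
            self.current['title'] = ''.join(self.chunks).strip()
            self.tracks.append(self.current)
            self.current = None

handler = BufferedLibraryHandler()
xml.sax.parseString(SAMPLE, handler)

for track in handler.tracks:
    print(track['title'], '-', track['artist'], '(' + track['time'] + ')')
```

Collecting the results in a list also separates parsing from printing, which makes the handler easier to reuse.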

    What if, however, we do not know the names of the attributes? It isn't too much of a problem, since we can loop through all the attributes and get their values:

    import xml.sax

    # Create a class to handle the XML
    class HandleLibrary ( xml.sax.ContentHandler ):

       def __init__ ( self ):

          self.attributes = None

       # Handle the beginning part of each tag
       def startElement ( self, name, attributes ):

          # Check to see if we're dealing with a track tag
          # If we are, store a copy of the attributes
          # (the parser may reuse the object it passes in)
          if name == 'track':
             self.attributes = attributes.copy()

       # Handle content
       def characters ( self, content ):

          if content.strip():

             # Loop through each attribute and print the name and value
             print()
             print ( 'Track: ' + content )
             for attribute in self.attributes.getNames():
                print ( attribute [ 0 ].upper() + attribute [ 1: ] + ': ' + self.attributes.getValue ( attribute ) )
             print()

    # Parse it all
    parser = xml.sax.make_parser()
    parser.setContentHandler ( HandleLibrary() )
    parser.parse ( 'music.xml' )

    In the above script, we use the getNames method to retrieve a list of attribute names. We loop through the list and print each attribute's name (with the first letter capitalized) and value.
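The attributes object also supports part of Python's mapping interface, which can shorten this kind of loop. The sketch below uses items() to get names and values together, and get() to supply a fallback when an attribute is absent (getValue raises KeyError in that case). The class name DescribeAttributes and the inline sample are assumptions for illustration; the second track deliberately omits its time attribute, which is not how it appears in the library file.

```python
import xml.sax

# Inline sample; the second track leaves out "time" on purpose (hypothetical)
SAMPLE = b"""<library>
   <track artist='Goo Goo Dolls' album='Dizzy Up the Girl' time='4:50'>Iris</track>
   <track artist='ZZ Top' album='ZZ Top-Greatest Hits'>Got Me Under Pressure</track>
</library>
"""

class DescribeAttributes(xml.sax.ContentHandler):

    def __init__(self):
        super().__init__()
        self.lines = []     # "Name: value" for every attribute seen
        self.lengths = []   # track length, or a fallback when missing

    def startElement(self, name, attributes):
        if name != 'track':
            return
        # items() yields (name, value) pairs, so no second lookup is needed
        for key, value in attributes.items():
            self.lines.append(key.capitalize() + ': ' + value)
        # get() returns the fallback instead of raising KeyError
        self.lengths.append(attributes.get('time', 'unknown'))

handler = DescribeAttributes()
xml.sax.parseString(SAMPLE, handler)

print('\n'.join(handler.lines))
print(handler.lengths)
```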

    © 2003-2008 by Developer Shed. All rights reserved.