
Describing a Music Library - Python

XML can be used for describing data without needing a database. However, this leaves us with the problem of interpreting the data embedded within the XML. This is where Python comes to the rescue, as Peyton explains.

By: Peyton McCullough
November 17, 2005


While the XML structure above works for many things, moving properties into attributes is often more compact and more convenient. Consider a music library made up of individual songs rather than whole albums. Instead of creating individual tags for each song's album, artist and length, which would get tiresome to type, we can store these properties as attributes of a single track tag and keep the song's name as the tag's content. Here's an XML file that does this:

<?xml version="1.0" encoding="UTF-8"?>
<library>
   <track artist='Boston' album='Greatest Hits' time='5:04'>Peace of Mind</track>
   <track artist='Dire Straits' album='Sultans Of Swing-Best of Dire Straits' time='5:50'>Sultans of Swing</track>
   <track artist='Dire Straits' album='Sultans Of Swing-Best of Dire Straits' time='4:12'>Walk of Life</track>
   <track artist='The Eagles' album='The Very Best of The Eagles' time='3:33'>Take It Easy</track>
   <track artist='Gary Allan' album='Best I Ever Had' time='4:18'>Best I Ever Had</track>
   <track artist='Goo Goo Dolls' album='Dizzy Up the Girl' time='4:50'>Iris</track>
   <track artist='Kansas' album='The Ultimate Kansas' time='3:24'>Dust in the Wind</track>
   <track artist='Toby Keith' album='Greatest Hits 2' time='3:27'>How Do You Like Me Now?!</track>
   <track artist='Toby Keith' album='Greatest Hits 2' time='3:15'>Courtesy of the Red, White and Blue</track>
   <track artist='ZZ Top' album='ZZ Top-Greatest Hits' time='4:03'>Got Me Under Pressure</track>
</library>

Had we given every property of a song its own tag, the file would have been much longer. Putting the properties in attributes keeps it short.

Now, however, we are left with the task of parsing the data. The process is similar to what we did above, but there are some differences since we are dealing with attributes this time around. Here's how it all works:

import xml.sax

# Create a class to handle the contents of the XML file
class HandleLibrary(xml.sax.ContentHandler):

    def __init__(self):
        super().__init__()
        self.artist = ''
        self.album = ''
        self.time = ''

    # Handle the start of an element
    def startElement(self, name, attributes):

        # Check to see if it is a "track" element
        # If so, store the attributes
        if name == 'track':
            self.artist = attributes.getValue('artist')
            self.album = attributes.getValue('album')
            self.time = attributes.getValue('time')

    # Handle content
    def characters(self, content):

        # If the content isn't just whitespace, then we have the track name
        # Print out all the track info
        if content.strip():
            print()
            print('Track:  ' + content)
            print('Artist: ' + self.artist)
            print('Album:  ' + self.album)
            print('Length: ' + self.time)

# Parse the file
parser = xml.sax.make_parser()
parser.setContentHandler(HandleLibrary())
parser.parse('music.xml')

The script is neither complex nor long. It strongly resembles the previous script, but note that we do not store anything in a dictionary; we simply print everything as we receive it. The attributes argument of the startElement method is an object representing all of the attributes of that tag. We access the attributes by name with the getValue method and save the values on the handler object, then print them in the characters method. That's all there is to parsing attributes.
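One caveat worth noting: a SAX parser is allowed to deliver a single run of text through several calls to characters, so printing from inside that method can split a long title across more than one record. Below is a minimal sketch of one way around this, assuming Python 3 and assuming we buffer the text and only emit the track when its closing tag arrives. The names CollectTracks and parse_tracks and the inline sample document are made up for illustration and are not part of the original script:

```python
import xml.sax


class CollectTracks(xml.sax.ContentHandler):
    """Collect each track's attributes and title into a list of dicts."""

    def __init__(self):
        super().__init__()
        self.tracks = []
        self._current = None   # attributes of the track being read, if any
        self._text = []        # pieces of the title seen so far

    def startElement(self, name, attributes):
        if name == 'track':
            # Copy the attributes into a plain dict so they outlive this call
            self._current = dict(attributes.items())
            self._text = []

    def characters(self, content):
        # May be called several times for one title, so only accumulate here
        if self._current is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == 'track' and self._current is not None:
            # Join the pieces and collapse any internal line breaks
            self._current['title'] = ' '.join(''.join(self._text).split())
            self.tracks.append(self._current)
            self._current = None


def parse_tracks(xml_text):
    """Hypothetical helper: parse an XML string and return the tracks."""
    handler = CollectTracks()
    xml.sax.parseString(xml_text.encode('utf-8'), handler)
    return handler.tracks


SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<library>
   <track artist='Boston' album='Greatest Hits' time='5:04'>Peace
of Mind</track>
   <track artist='Kansas' album='The Ultimate Kansas' time='3:24'>Dust in the Wind</track>
</library>"""

for track in parse_tracks(SAMPLE):
    print(track['title'], '-', track['artist'], '(' + track['time'] + ')')
```

Returning a list rather than printing directly is a design choice made here so the result can be reused or checked; the printing loop at the end plays the role of the print calls in the script above.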

What if, however, we do not know the names of the attributes? It isn't too much of a problem, since we can loop through all the attributes and get their values:

import xml.sax

# Create a class to handle the XML
class HandleLibrary(xml.sax.ContentHandler):

    def __init__(self):
        super().__init__()
        self.attributes = None

    # Handle the beginning part of each tag
    def startElement(self, name, attributes):

        # Check to see if we're dealing with a track tag
        # If we are, store the attributes
        if name == 'track':
            self.attributes = attributes

    # Handle content
    def characters(self, content):

        # Skip whitespace-only content and any text seen before the first track
        if self.attributes is not None and content.strip():

            # Loop through each attribute and print the name and value
            print()
            print('Track: ' + content)
            for attribute in self.attributes.getNames():
                print(attribute[0].upper() + attribute[1:] + ': ' + self.attributes.getValue(attribute))
            print()

# Parse it all
parser = xml.sax.make_parser()
parser.setContentHandler(HandleLibrary())
parser.parse('music.xml')

In the above script, we use the getNames method to retrieve a list of attribute names. We loop through the list and print each attribute's name (with the first letter capitalized) and value.
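In the standard library's implementation, the attributes object also behaves much like a read-only dictionary, so the same loop can be sketched with its items method instead of getNames and getValue. This is only an illustration, assuming Python 3; the names DescribeTracks and describe_tracks and the one-line sample document are hypothetical, and the attributes are sorted here simply to give a predictable order:

```python
import xml.sax


class DescribeTracks(xml.sax.ContentHandler):
    """Record 'Name: value' lines for every attribute of each track."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attributes):
        if name == 'track':
            # items() yields (name, value) pairs, whatever the names are
            for attr_name, value in sorted(attributes.items()):
                label = attr_name[0].upper() + attr_name[1:]
                self.lines.append(label + ': ' + value)


def describe_tracks(xml_text):
    """Hypothetical helper: parse an XML string and return the lines."""
    handler = DescribeTracks()
    xml.sax.parseString(xml_text.encode('utf-8'), handler)
    return handler.lines


SAMPLE = "<library><track artist='ZZ Top' time='4:03'>Got Me Under Pressure</track></library>"

for line in describe_tracks(SAMPLE):
    print(line)
```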



 
 