XML Parsing With SAX and Xerces (part 1)
So you've already seen how Perl and PHP handle XML data. But you're a Real Programmer,
and Real Programmers don't waste time with scripting languages. Nope, you need
something a little more powerful, something with more horsepower under the hood.
Something written in Java. Something like Xerces.

If you've been paying attention over the last few weeks, you'll already know
a little bit about XML, and how it hopes to alter the way data is classified and
used on the Web. By marking up data fragments with HTML-like tags and attributes,
XML provides the content author with an efficient and simple method of describing
data...and the Web developer with a powerful new weapon to add to his or her arsenal.
Now, XML data is physically stored in plain text files (Unicode by default, usually
UTF-8, of which ASCII is a subset). As a format, this is as close to universal as
you can get - virtually every computer system on planet Earth can read and process
plain text, making XML extremely portable between platforms
and systems. Tie this in with that other platform-independent language, Java,
and you have a marriage made in cross-platform heaven.
Over the course of this two-part article, I'll be examining the union of Java
and XML, illustrating how the two technologies can be combined to easily parse
XML data and convert it into browser-friendly HTML. My tool in this endeavour
will be the Xerces XML parser, a validating Java-based parser which supports the
XML 1.0, DOM Level 2, SAX 1.0 and 2.0 and XML Schema standards. Highly configurable,
and with a rich feature set, Xerces is a part of the Apache XML Project, and is
designed to meet the twin standards of performance and compatibility when parsing
XML documents.
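
To give you a rough feel for what "parsing XML with SAX and turning it into HTML" looks like in practice, here's a minimal sketch. Note the assumptions: it goes through the standard JAXP entry point (javax.xml.parsers.SAXParserFactory) rather than instantiating a Xerces parser class directly, so it runs on a stock JDK, and the class name SaxSketch, the method toHtml() and the sample <note> document are all hypothetical names invented for this illustration. It is not necessarily how the rest of this article does things.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class - not part of Xerces or of this article's code.
public class SaxSketch {

    // Parses an XML string and returns one HTML paragraph per element
    // that directly contains text. Assumes simple input: no namespaces,
    // no mixed content. Text is NOT HTML-escaped here, which real code
    // would need to do.
    static String toHtml(String xml) throws Exception {
        final StringBuilder html = new StringBuilder();
        final StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // A new element begins: discard any text seen so far.
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver text in several chunks, so buffer it.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    html.append("<p><b>").append(qName).append(":</b> ")
                        .append(value).append("</p>\n");
                }
                text.setLength(0);
            }
        };

        // JAXP picks the parser implementation; the handler receives the events.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(toHtml("<note><to>Ann</to><body>Hello</body></note>"));
    }
}
```

The point to notice is the shape of the thing: the parser reads the document from start to finish and calls your handler as it meets each opening tag, chunk of text and closing tag. Your code never asks for data - it just reacts to events.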
I'll try to keep things simple - I'm going to use very basic XML sources, so you
don't have to worry about namespaces, DTDs and processing instructions (PIs) -
although I will assume that
you know the basic rules of XML markup, and of Java programming. So let's get
this show on the road.