The Simple API for XML (SAX) is just one approach to parsing XML. An alternative approach is the Document Object Model (DOM), which builds a data tree in memory for easier, non-sequential access to XML data fragments. In this article, find out how to combine the Java-based Xerces parser with the DOM to create simple Java/XML applications.
If you're at all familiar with XML programming, you'll know that there are two basic approaches to parsing an XML document. The Simple API for XML (SAX) is one; it reads an XML document sequentially, from start to finish, firing events for the application layer to handle as it encounters different XML structures. Because the parser never needs to hold the entire document in memory, this approach makes for rapid parsing, especially of long or complex XML documents. The downside is that a SAX parser cannot be used to access XML document nodes in a random or non-sequential manner.
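To make the event-driven model concrete, here is a minimal sketch of a SAX parse. It assumes the standard JAXP API that ships with the JDK rather than any Xerces-specific classes; the class name `SaxSketch`, the helper `elementNames` and the sample XML are hypothetical and used purely for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class, not part of Xerces or the original article.
class SaxSketch {

    // Collects element names in the order the parser reports them.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();

        // The handler is called back once per opening tag, in document order.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                names.add(qName);
            }
        };

        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        String xml = "<library><book><title>XML Basics</title></book></library>";
        System.out.println(elementNames(xml)); // prints [library, book, title]
    }
}
```

Note that the handler only ever sees one event at a time; once the parser has moved past an element, there is no way to go back to it without parsing the document again.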
Hence the Document Object Model (DOM). This alternative approach involves building a tree representation of the XML document in memory, and then using built-in methods to navigate through this tree. Once a particular node has been reached, further built-in methods can be used to obtain the value of the node for use within the application. Once the tree has been built, this paradigm does away with the problems inherent in SAX's sequential approach, allowing for immediate random access to any node or collection of nodes in the tree.
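The tree-based approach described above can be sketched in a few lines. Again, this is only an illustration under stated assumptions: it uses the standard JAXP and `org.w3c.dom` interfaces bundled with the JDK instead of Xerces-specific classes, and the class `DomSketch`, the helpers `parse` and `titleAt`, and the sample XML are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of Xerces or the original article.
class DomSketch {

    // Parses the whole document into an in-memory tree.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the <title> element at the given position.
    static String titleAt(Document doc, int index) {
        NodeList titles = doc.getElementsByTagName("title");
        return titles.item(index).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        String xml = "<library>"
                + "<book><title>First</title></book>"
                + "<book><title>Second</title></book>"
                + "</library>";
        Document doc = parse(xml);

        // Nodes can be visited in any order once the tree exists.
        System.out.println(titleAt(doc, 1)); // prints Second
        System.out.println(titleAt(doc, 0)); // prints First
        System.out.println(doc.getDocumentElement().getNodeName()); // prints library
    }
}
```

Reading the second title before the first is exactly the kind of non-sequential access that a SAX handler cannot offer.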
DOM parsers are available for a variety of platforms; you can get them for Perl, PHP, Python and C, among others. However, the one that's going to occupy us for the next ten minutes is the Java-based Xerces parser, which includes a very capable DOM parser. Over the next few pages, I'll demonstrate how it works, using some real-world examples to bring home its capabilities and to illustrate how the combination of Java, XML and JSP can be used to develop XML-based Web applications.