Integrating XML with J2EE - Parsing XML
So far, you have used Internet Explorer or other third-party tools to parse
your XML documents. Now you will look at three APIs that provide a way to access
and manipulate the information stored in an XML document so you can build your
own XML applications.
The Simple API for XML (SAX) defines parsing methods, and the Document Object
Model (DOM) defines a mechanism for accessing and manipulating well-formed XML.
The third API, the Java API for XML Processing (JAXP), is the one you will use
to build a simple SAX parser and a simple DOM parser. The two parsers you will
develop do little more than echo the structure of the input XML. Usually, you
will want to parse XML to perform some useful function, but simply echoing the
XML is a good way to learn the APIs. The benefit of JAXP is that it provides a
common interface for creating and using SAX and DOM parsers in Java.
SAX and DOM define different approaches to parsing and handling an XML
document. SAX is an event-based API, whereas DOM is tree-based.
With event-based parsers, the parsing events (such as the start and end tags)
are reported directly to the application through callback methods. The
application implements these callback methods to handle the different components
in the document, much like handling events in a graphical user interface
(GUI).
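The callback style described above can be sketched roughly as follows. This is a minimal illustration rather than the chapter's own listing: the class name SaxEventSketch, the method traceEvents, and the sample XML are all assumptions made for the example, and it uses current Java syntax rather than J2SDK 1.4. The handler overrides three callbacks of org.xml.sax.helpers.DefaultHandler and records one line for each event the parser reports.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of the original chapter.
class SaxEventSketch {

    /** Parses the XML text and returns one line per SAX event received. */
    static String traceEvents(String xml) throws Exception {
        final StringBuilder trace = new StringBuilder();

        // The parser calls these methods as it meets each part of the document.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                trace.append("start:").append(qName).append('\n');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                trace.append("end:").append(qName).append('\n');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Ignore whitespace-only text such as indentation between tags.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    trace.append("text:").append(text).append('\n');
                }
            }
        };

        // JAXP factory -> parser; the parser drives the handler callbacks.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return trace.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(traceEvents("<jobs><job>Cleaner</job></jobs>"));
    }
}
```

Note that the application never asks the parser for the next element. It only reacts to events in the order the parser reports them, which is why no in-memory copy of the document is needed.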
With the DOM API, the parser transforms the XML document into a tree structure
in memory. The application then navigates the tree to read or change the
document's contents.
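As a rough sketch of the tree-based approach, the following builds a DOM tree through JAXP and then walks it recursively. The class name DomTreeSketch, its methods, and the sample XML are assumptions made for illustration, not code from the chapter, and the syntax is current Java rather than J2SDK 1.4.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of the original chapter.
class DomTreeSketch {

    /** Builds the complete in-memory tree for the given XML text. */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Appends element names and non-blank text, indented by tree depth. */
    static void walk(Node node, int depth, StringBuilder out) {
        String label = null;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            label = node.getNodeName();
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                label = text;
            }
        }
        if (label != null) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(label).append('\n');
        }
        // Visit the children in document order.
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    /** Returns an indented outline of the document's structure. */
    static String outline(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        walk(parse(xml).getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(outline("<jobs><job>Cleaner</job></jobs>"));
    }
}
```

Unlike the event-based style, the whole document is available at once, so the application can visit nodes in any order and as many times as it likes.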
Each method has its advantages and disadvantages. Using DOM
- Simplifies the mapping of the structure of the XML.
- Is a good choice when the document is not too large. If the document is
large, it can place a strain on system resources.
- Is a good choice when most, or all, of the document needs to be parsed.
- Is a good choice when the document is to be altered or written out in a
structure that is very different from the original.
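To illustrate the last point, here is a minimal sketch of altering a document through DOM and writing it back out. It assumes the javax.xml.transform classes that ship with JAXP are used for serialization, which this page does not itself cover. The class name DomEditSketch, the method appendJob, and the job element are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of the original chapter.
class DomEditSketch {

    /** Adds a new job element under the root and returns the altered XML. */
    static String appendJob(String xml, String jobName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Alter the in-memory tree: create a new element with a text child.
        Element job = doc.createElement("job");
        job.appendChild(doc.createTextNode(jobName));
        doc.getDocumentElement().appendChild(job);

        // Assumption: an identity Transformer is used to serialize the tree.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(appendJob("<jobs><job>Cleaner</job></jobs>", "Gardener"));
    }
}
```

Because the tree is held in memory, the change is a matter of a few node operations. Doing the same with an event-based parser would mean re-emitting every event and splicing the new element in at the right moment.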
Using SAX is a good choice if
- You are searching through an XML document for a small number of tags
- The document is large
- Processing speed is important
- The document does not need to be written out in a structure that is
different from the original
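The first case, searching for a small number of tags, might look something like the sketch below. It counts occurrences of a single element name without building a tree. The class name SaxTagCounter, the method countElements, and the sample element names are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of the original chapter.
class SaxTagCounter {

    /** Counts how many elements with the given name appear in the XML text. */
    static int countElements(String xml, final String tagName) throws Exception {
        final int[] count = {0};

        // Only startElement is overridden; all other events are ignored
        // by the empty default implementations in DefaultHandler.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals(tagName)) {
                    count[0]++;
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<jobs><job/><job/><location/></jobs>", "job"));
    }
}
```

Memory use stays roughly constant regardless of document size, because nothing is retained except the counter.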
SAX is a public domain API developed cooperatively by the members of the
XML-DEV (XML DEVelopment) Internet discussion group (http://www.xml.org/).
The DOM is a set of interfaces defined by the W3C DOM Working Group. The
latest DOM recommendation can be obtained from the W3C Web site (http://www.w3.org).
The JAXP Packages
The JAXP APIs are defined in the J2SDK 1.4 javax.xml.parsers
package, which contains two factory classes—SAXParserFactory and
DocumentBuilderFactory.
The packages that define the SAX and DOM APIs are
- org.xml.sax, which defines the SAX interfaces
- org.w3c.dom, which defines the DOM interfaces
You will now build two applications—one that uses the SAX API and one that
uses DOM.
This chapter is from Teach Yourself J2EE in 21 Days, second edition, by Martin
Bond et al. (Sams, 2004, ISBN: 0-672-32558-6).