
Nailguns, Going Cheap

The Simple API for XML (SAX) is just one approach to parsing XML. An alternative approach is the Document Object Model (DOM), which builds a data tree in memory for easier, non-sequential access to XML data fragments. In this article, find out how to combine the Java-based Xerces parser with the DOM to create simple Java/XML applications.

TABLE OF CONTENTS:
  1. XML Parsing With DOM and Xerces (part 1)
  2. Float Like A Butterfly...
  3. Nailguns, Going Cheap
  4. Delving Deeper
  5. When Laziness Is A Virtue
By: icarus, (c) Melonfire
February 19, 2002

I'll begin with something simple. Consider the following XML file, an XML-encoded inventory statement for a business selling equipment to Quake enthusiasts.

<?xml version="1.0"?>
<inventory>
    <item>
        <id>758</id>
        <name>Rusty, jagged nails for nailgun</name>
        <supplier>NailBarn, Inc.</supplier>
        <cost>2.99</cost>
        <quantity>10000</quantity>
    </item>
    <item>
        <id>6273</id>
        <name>Power pack for death ray</name>
        <supplier>QuakePower.domain.com</supplier>
        <cost>9.99</cost>
        <quantity>10</quantity>
    </item>
</inventory>

The Xerces DOM parser is designed to read an XML file, build a tree to represent the structures found within it, and expose object methods and properties to manipulate them. This next example demonstrates how, building a simple Java application that initializes the parser and reads the XML file.

import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import java.io.*;

public class MyFirstDomApp {

    // constructor
    public MyFirstDomApp (String xmlFile) {

        // create a DOM parser
        DOMParser parser = new DOMParser();

        // parse the document
        try {
            parser.parse(xmlFile);
            Document document = parser.getDocument();
            NodeDetails(document);
        } catch (SAXException e) {
            // the XML could not be parsed
            System.err.println (e);
        } catch (IOException e) {
            // the file could not be read
            System.err.println (e);
        }
    }

    // this function prints out information on a specific node
    // in this example, the "#document" node
    // it then goes to the first child node
    // and does the same for that
    private void NodeDetails (Node node) {
        System.out.println ("Node Type:" + node.getNodeType() + "\nNode Name:" + node.getNodeName());
        if(node.hasChildNodes()) {
            System.out.println ("Child Node Type:" + node.getFirstChild().getNodeType() + "\nNode Name:" + node.getFirstChild().getNodeName());
        }
    }

    // the main method to create an instance of our DOM application
    public static void main (String[] args) {
        MyFirstDomApp app = new MyFirstDomApp (args[0]);
    }
}

I'll explain what all this gobbledygook means shortly - but first, let's compile and run the code.

$ javac MyFirstDomApp.java
Assuming that all goes well, you should now have a class file named "MyFirstDomApp.class". Copy this class file to a directory in your Java CLASSPATH, and then execute it, with the name of the XML file as argument.

$ java MyFirstDomApp /home/me/dom/inventory.xml
Here's what the output looks like:

Node Type:9
Node Name:#document
Child Node Type:1
Node Name:inventory

Now, this might not look like much, but it demonstrates the basic concept of the DOM, and builds the foundation for more complex code. Let's look at the code in detail:

1. The first step is to import all the classes required to execute the application. First come the classes for the Xerces DOM parser, followed by the classes for exception handling and file I/O.

import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import java.io.*;

2. Next, a constructor is defined for the class (in case you didn't already know, a constructor is a method that is invoked automatically when you create an instance of the class).

// constructor
public MyFirstDomApp (String xmlFile) {

    // create a DOM parser
    DOMParser parser = new DOMParser();

    // parse the document
    try {
        parser.parse(xmlFile);
        Document document = parser.getDocument();
        NodeDetails(document);
    } catch (SAXException e) {
        // the XML could not be parsed
        System.err.println (e);
    } catch (IOException e) {
        // the file could not be read
        System.err.println (e);
    }
}

As you can see, the constructor uses the parse() method to perform the actual parsing of the XML document; it accepts the XML file name as method argument. This method call is enclosed within a "try-catch" error handling block, in order to gracefully recover from errors.
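
To see that error handling at work, here's a minimal sketch that feeds the parser one well-formed and one broken document. It is not part of the application above: so that it runs without the Xerces JAR, I've assumed the JDK's built-in JAXP DocumentBuilder in place of DOMParser, and the names ParseErrorDemo and rootNameOrError are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ParseErrorDemo {

    // Hypothetical helper: returns the name of the root element,
    // or a short error description if the XML cannot be parsed.
    // Assumption: JAXP's DocumentBuilder stands in for the Xerces DOMParser.
    static String rootNameOrError(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getNodeName();
        } catch (SAXException e) {
            // thrown when the document is not well-formed
            return "XML error: " + e.getMessage();
        } catch (IOException e) {
            // thrown when the input cannot be read
            return "I/O error: " + e.getMessage();
        } catch (ParserConfigurationException e) {
            // thrown when no parser can be created
            return "Parser setup error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // well-formed: the root element name comes back
        System.out.println(rootNameOrError("<inventory><item/></inventory>"));
        // broken: <item> is never closed, so the parser reports an error
        System.out.println(rootNameOrError("<inventory><item></inventory>"));
    }
}
```

The parser may also print its own diagnostic to the error stream for the second document; the point is that the program keeps running instead of dying with a stack trace.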

The end result of this parsing is a DOM tree consisting of a single root and its child nodes, each of which exposes methods that describe the object in greater detail.
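
If you'd like to see that tree for yourself, the sketch below walks it from the root down and prints one line per node, indented by depth. Again, this is only an illustration under my own assumptions: it uses the JDK's JAXP DocumentBuilder rather than the Xerces DOMParser, and TreeOutlineDemo, outline and appendNode are invented names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeOutlineDemo {

    // Hypothetical helper: parses XML text and returns an indented
    // outline of every node in the resulting DOM tree.
    static String outline(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            appendNode(document, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Writes the name of this node, then recurses into each child in order.
    private static void appendNode(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            appendNode(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // a cut-down inventory, written without whitespace between tags
        System.out.print(outline("<inventory><item><id>758</id></item></inventory>"));
    }
}
```

For the cut-down document in main(), the outline starts at "#document", then steps through "inventory", "item" and "id", and ends at the "#text" node holding the value 758.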

3. The getDocument() method returns an object representing the entire XML document; this object reference is then passed on to the NodeDetails() method to display information about itself, and its children.

// this function prints out information on a specific node
// in this example, the "#document" node
// it then goes to the first child node
// and does the same for that
private void NodeDetails (Node node) {
    System.out.println ("Node Type:" + node.getNodeType() + "\nNode Name:" + node.getNodeName());
    if(node.hasChildNodes()) {
        System.out.println ("Child Node Type:" + node.getFirstChild().getNodeType() + "\nNode Name:" + node.getFirstChild().getNodeName());
    }
}

4. Once a reference to a node has been obtained, a number of other methods and properties become available to obtain the name and value of that node, as well as references to parent and child nodes. In the code snippet above, I've used the getNodeType() and getNodeName() methods of the Node object to obtain the node type and name respectively. Similarly, the hasChildNodes() method can be used to find out if a node has child nodes under it, while the getFirstChild() method can be used to get a reference to the first child node.
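
Here's a small sketch of a few of those methods working together: getFirstChild() to step down the tree, getNodeValue() to read a value, and getParentNode() to step back up. It assumes the JDK's JAXP DocumentBuilder instead of the Xerces DOMParser so that it runs on its own, and it assumes the XML has no whitespace between tags (otherwise the first child would be a whitespace text node). NodeNavigationDemo and describe are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeNavigationDemo {

    // Hypothetical helper: describes the first child element of the root.
    // Assumption: the XML is written without whitespace between tags.
    static String describe(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Node root = document.getDocumentElement(); // e.g. <item>
            Node element = root.getFirstChild();       // e.g. <name>
            Node text = element.getFirstChild();       // the text inside <name>

            // An element has no value of its own (getNodeValue() is null);
            // the character data lives in the text node underneath it.
            return element.getNodeName() + "=" + text.getNodeValue()
                    + " (element value: " + element.getNodeValue()
                    + ", parent: " + element.getParentNode().getNodeName() + ")";
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("<item><name>Power pack for death ray</name></item>"));
    }
}
```

Note that the text "Power pack for death ray" belongs to a separate text node, not to the "name" element itself, which is why the sketch goes one level further down before calling getNodeValue().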

In case you're wondering about the getNodeType() method - every node is of a specific type, and this method returns a numeric code corresponding to that type. Each code also has a named constant defined in the Node interface. Here's the list of available types:

Type   Type                         Description       Name
(num)  (str)
---------------------------------------------------------------------------
1      ELEMENT_NODE                 Element           The element name
2      ATTRIBUTE_NODE               Attribute         The attribute name
3      TEXT_NODE                    Text              #text
4      CDATA_SECTION_NODE           CDATA             #cdata-section
5      ENTITY_REFERENCE_NODE        Entity reference  The entity reference name
6      ENTITY_NODE                  Entity            The entity name
7      PROCESSING_INSTRUCTION_NODE  PI                The PI target
8      COMMENT_NODE                 Comment           #comment
9      DOCUMENT_NODE                Document          #document
10     DOCUMENT_TYPE_NODE           DocType           Root element
11     DOCUMENT_FRAGMENT_NODE       DocumentFragment  #document-fragment
12     NOTATION_NODE                Notation          The notation name
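
In code, it's usually clearer to compare against the named constants in the Node interface than against the raw numbers. The sketch below does that for a handful of the types in the table. As before, it's an illustration under my own assumptions: the JDK's JAXP DocumentBuilder stands in for the Xerces DOMParser, and NodeTypeDemo, typeName and childTypes are invented names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeTypeDemo {

    // Maps a few of the numeric codes returned by getNodeType()
    // to the names of the matching constants in the Node interface.
    static String typeName(short type) {
        switch (type) {
            case Node.ELEMENT_NODE:  return "ELEMENT_NODE";
            case Node.TEXT_NODE:     return "TEXT_NODE";
            case Node.COMMENT_NODE:  return "COMMENT_NODE";
            case Node.DOCUMENT_NODE: return "DOCUMENT_NODE";
            default:                 return "other (" + type + ")";
        }
    }

    // Hypothetical helper: lists the types of the root element's children, in order.
    static String childTypes(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            Node root = document.getDocumentElement();
            for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (out.length() > 0) {
                    out.append(", ");
                }
                out.append(typeName(child.getNodeType()));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // the root here has a comment, an element and some text as children
        System.out.println(childTypes("<inventory><!-- stock --><item/>10 left</inventory>"));
    }
}
```

The three children of "inventory" in main() come back as COMMENT_NODE, ELEMENT_NODE and TEXT_NODE, which correspond to codes 8, 1 and 3 in the table.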

 
 