Let's take a closer look at the code from the previous example.

First, I've imported all the classes required to execute the application. The classes for the Xerces SAX parser come first, followed by the other classes required for SAX processing and the core Java classes for file I/O and error handling.

Along with the set of classes that define the parser, the SAX API also comes equipped with a set of useful interfaces. The one used here is the ContentHandler interface, which defines the callback methods needed for SAX processing.

Next, a constructor is defined for the class (in case you don't know, a constructor is a method that is invoked automatically when you create an instance of the class). Once an instance of the parser has been created, the content handler for the parser needs to be defined with the setContentHandler() method. Since the class being defined here itself implements the ContentHandler interface, it can simply pass itself in as the handler.

Finally, the parse() method handles the actual parsing of the XML document; it accepts the file name as its argument. This method call is enclosed within a "try-catch" error handling block, in order to recover gracefully from errors. In this example, two types of errors have been accounted for: the SAXException, which is raised when the SAX parser encounters a problem in the XML document (for example, badly-nested tags), and the IOException, which is raised when a file I/O error occurs.

That takes care of the main infrastructure code, but what about the callback functions themselves? In this case, the only callback that does any real work is the one for opening elements. This callback function must be named startElement(); it's invoked whenever the parser encounters an opening XML tag, and automatically receives the namespace URI, local name, fully qualified name and attributes of the element that triggers it.
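Since it helps to see these pieces side by side, here's a minimal sketch of a class along these lines. Treat it as an illustration rather than the listing from the previous page: the class name ElementPrinter, the getElementNames() helper and the output format are my own inventions, and the constructor takes an InputSource rather than a bare file name so that it can also be fed XML from a string. To keep it runnable without the Xerces JAR on the classpath, I'm also assuming the JDK's bundled SAXParserFactory in place of the Xerces SAXParser class, which is why there's an extra catch block for ParserConfigurationException.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical class name; it implements ContentHandler so it can act as its own handler.
class ElementPrinter implements ContentHandler {

    // Qualified names of the opening elements seen so far (illustrative helper state).
    private final List<String> elementNames = new ArrayList<>();

    ElementPrinter(InputSource source) {
        try {
            // Assumption: the JDK's JAXP factory stands in for "new SAXParser()" from Xerces.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so namespace URI and local name are filled in
            XMLReader parser = factory.newSAXParser().getXMLReader();

            parser.setContentHandler(this); // this object receives the callbacks
            parser.parse(source);
        } catch (SAXException e) {
            // Problems in the XML itself, such as badly-nested tags
            System.err.println("XML error: " + e.getMessage());
        } catch (IOException e) {
            // Problems reading the input
            System.err.println("I/O error: " + e.getMessage());
        } catch (ParserConfigurationException e) {
            // Only needed because of the JAXP factory used in this sketch
            System.err.println("Parser setup error: " + e.getMessage());
        }
    }

    List<String> getElementNames() {
        return elementNames;
    }

    // The one callback that does any work: called for every opening element.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elementNames.add(qName);
        System.out.println("Element: " + qName + " (namespace URI: \"" + uri
                + "\", local name: " + localName
                + ", attributes: " + atts.getLength() + ")");
    }

    // The remaining ContentHandler callbacks are required by the interface but left empty.
    @Override public void setDocumentLocator(Locator locator) { }
    @Override public void startDocument() { }
    @Override public void endDocument() { }
    @Override public void startPrefixMapping(String prefix, String uri) { }
    @Override public void endPrefixMapping(String prefix) { }
    @Override public void endElement(String uri, String localName, String qName) { }
    @Override public void characters(char[] ch, int start, int length) { }
    @Override public void ignorableWhitespace(char[] ch, int start, int length) { }
    @Override public void processingInstruction(String target, String data) { }
    @Override public void skippedEntity(String name) { }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java ElementPrinter <xml-file>");
            return;
        }
        // Turn the file location entered by the user into a system identifier (URI).
        new ElementPrinter(new InputSource(new File(args[0]).toURI().toString()));
    }
}
```

To try it out, compile the class and run it with the location of an XML file as its only argument; each opening element in the file should produce one line of output.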
This data can then be processed and used in whatever manner you desire; over here, I'm simply printing it to the standard output device.

A number of other callbacks are also available; however, I've left them empty here. These callbacks handle all the events that the SAX parser generates, providing hooks for processing XML documents, elements, character data, PIs and entities. You may be wondering if you really need to define these, since their sum contribution to the functionality of this program is zero. The short answer is, yes, you do; since you're implementing an interface, you must include all the methods within it. If you don't, the Java compiler will barf all over your screen and refuse to compile the class. Try it and see for yourself!

Finally, the main() method sets the ball rolling, creating an instance of my user-defined class, with the argument entered by the user (the XML file location) as an input parameter.

Next, let's look at streamlining this a little, with a slightly different technique.