By separating content from presentation, XML offers Web developers a powerful alternative to traditional HTML technology...and when you combine that with PHP, you have a truly compelling new set of tools. In this article, find out how PHP's SAX parser can be used to parse XML data and generate HTML Web pages.
The first order of business is to initialize the XML parser, and set up the callback functions.
<?
// data file
$file = "library.xml";
// initialize parser
$xml_parser
= xml_parser_create();
// set callback functions
xml_set_element_handler($xml_parser,
"startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
//
open XML file
if (!($fp = fopen($file, "r")))
{
die("Cannot locate XML data
file: $file");
}
// read and parse data
while ($data = fread($fp, 4096))
{
// error handler
if (!xml_parse($xml_parser, $data, feof($fp)))
{
die(sprintf("XML
error: %s at line %d",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));
}
}
// clean up
xml_parser_free($xml_parser);
?>
The xml_parser_create() function is used to initialize the parser, and assign
it to a handle, which may be used by subsequent function calls. Once the parser is initialized, the next step is to specify callback functions for the different types of tags.
The xml_set_element_handler() function is used to specify the functions to be executed when the parser encounters the opening and closing tags of an element - in this case, the functions startElement() and endElement() respectively. The xml_set_character_data_handler() function specifies the function to be called when the parser encounters character data - in this case, characterData().
In addition, PHP also offers the xml_set_processing_instruction_handler(), for processing instructions; xml_set_unparsed_entity_decl_handler(), for unparsed entities; xml_set_external_entity_ref_handler(), for external entities; xml_set_notation_decl_handler(), for notation declarations; and xml_set_default_handler(), for all other entities within a document. In case you don't know what any of these are, don't worry about it - I'm not planning to use any of them here.
The next step in the script above is to open the XML file (as defined in the $file variable), read it and parse it via the xml_parse() function. The xml_parse() function will call the appropriate handling function each time it encounters a specific tag type. Once the document has been completely parsed, the xml_parser_free() function is called to free used memory and clean things up.
Errors can be displayed by means of the xml_error_string() function, which returns a description of the error encountered by the parser. Additional functions like xml_get_error_code(), xml_get_current_line_number(), xml_get_current_column_number(), and xml_get_current_byte_index() provide some additional information on the error.