Page 5 - Plugging RDF Content Into Your Web Site With PHP
Capture The Flag - PHP
A Web site which dynamically updates itself with the latest news and information? Nope, it's not as far-fetched as it sounds. As this article demonstrates, all you need is a little imagination, a couple of free RDF files and some PHP glue.
The first step in this script is to set up some global variables:
// XML file
$file = "fm-releases.rdf";
// set up some variables for use by the parser
$currentTag = "";
$flag = "";
The $currentTag variable will hold the name of the element that the parser is currently processing - you'll see why this is needed shortly.
Since my ultimate goal is to display the individual items in the channel, with links, I also need to know when the parser has exited the <channel>...</channel> block and entered the <item>...</item> sections of the document. Since I'm using a SAX parser, which operates in a sequential manner, there is no parser API available to discover depth or location in the tree. So I have to invent my own mechanism to do this - which is where the $flag variable comes in.
The $flag variable will be used to find out if the parser is within the <channel> block or the <item> block.
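A single flag is enough for this script, but the same idea generalizes. As a rough sketch (not part of the original script), the handlers can maintain a stack of open element names, which gives the full path to the parser's current position. The function name collectElementPaths() and the tiny sample document are made up for illustration, and the tag names appear in uppercase because case folding is on by default.

```php
<?php
// Hypothetical helper: records the path of every element the parser opens.
function collectElementPaths(string $xml): array
{
    $stack = [];   // names of the elements that are currently open
    $paths = [];   // one entry per opening tag, e.g. "RDF/ITEM/TITLE"

    $xp = xml_parser_create();
    xml_set_element_handler(
        $xp,
        // opening tag: push the name and record the current path
        function ($parser, $name, $attributes) use (&$stack, &$paths) {
            $stack[] = $name;
            $paths[] = implode("/", $stack);
        },
        // closing tag: pop the name again
        function ($parser, $name) use (&$stack) {
            array_pop($stack);
        }
    );

    if (!xml_parse($xp, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($xp)));
    }
    return $paths;
}

// made-up sample document, much simpler than a real RDF feed
$sample = '<rdf><channel><title>Feed</title></channel>'
        . '<item><title>One</title></item></rdf>';
print_r(collectElementPaths($sample));
```

With a stack like this, "am I inside an item?" becomes a check on the stack contents rather than a separate flag.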
The next step is to initialize the SAX parser and begin parsing the RDF document.
// create parser
$xp = xml_parser_create();
// register handlers for opening and closing tags
xml_set_element_handler($xp, "elementBegin", "elementEnd");
// register handler for the text between tags
xml_set_character_data_handler($xp, "characterData");
// report tag names in uppercase (this is also the default)
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, TRUE);
// open XML file
if (!($fp = fopen($file, "r")))
{
    die("Could not read $file");
}
// parse data in 4096-byte chunks; feof() marks the final chunk
while ($xml = fread($fp, 4096))
{
    if (!xml_parse($xp, $xml, feof($fp)))
    {
        die("XML parser error: " .
            xml_error_string(xml_get_error_code($xp)));
    }
}
// close the file and destroy the parser
fclose($fp);
xml_parser_free($xp);
This is a fairly standard sequence of commands, and the comments should explain it sufficiently. The xml_parser_create() function is used to instantiate the parser and assign it to the handle $xp. Next, callback functions are set up to handle opening and closing tags, and the character data within them. Finally, the xml_parse() function is called once for every chunk returned by fread(), so that the RDF file is read and parsed piece by piece; its third argument tells the parser when the last chunk has arrived.
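To make the chunked parsing a little more concrete, here is a minimal sketch that feeds a string to the parser a few bytes at a time instead of reading a file. The helper name countElementsInChunks() and the sample document are assumptions made for this example; the xml_* calls are the same ones used in the script.

```php
<?php
// Hypothetical helper: feeds $xml to the parser $chunkSize bytes at a time
// and returns the number of opening tags that were seen.
function countElementsInChunks(string $xml, int $chunkSize): int
{
    $count = 0;

    $xp = xml_parser_create();
    xml_set_element_handler(
        $xp,
        function ($parser, $name, $attributes) use (&$count) {
            $count++;
        },
        function ($parser, $name) {
            // nothing to do on closing tags in this sketch
        }
    );

    $chunks = str_split($xml, $chunkSize);
    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        // the third argument is true only for the final chunk
        if (!xml_parse($xp, $chunk, $i === $last)) {
            throw new RuntimeException(xml_error_string(xml_get_error_code($xp)));
        }
    }
    return $count;
}

// made-up sample document
$sample = '<rdf><channel><title>Feed</title></channel>'
        . '<item><title>One</title></item></rdf>';
echo countElementsInChunks($sample, 16), "\n";
echo countElementsInChunks($sample, 4096), "\n";
```

The parser keeps its state between calls, so a tag that is split across two chunks is still reported once, and the chunk size affects only how much memory is used at a time.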
Each time an opening tag is encountered in the document, the opening tag handler elementBegin() is called.
// opening tag handler
function elementBegin($parser, $name, $attributes)
{
    global $currentTag, $flag;
    // export the name of the current tag to the global scope
    $currentTag = $name;
    // if within an item block, set a flag
    if ($name == "ITEM")
    {
        $flag = 1;
    }
}
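This function receives, as function arguments, the parser handle, the name of the current tag and its attributes (if any). Because case folding is enabled, the tag name arrives in uppercase, which is why it is compared against "ITEM" rather than "item". The tag name is assigned to the global $currentTag variable, and - if the tag is an opening <item> tag - the $flag variable is set to 1.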
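The following sketch shows what an opening tag handler actually receives, with case folding switched on and off. The helper name recordOpeningTags() and the sample markup (including its id attribute) are invented for this illustration and do not come from the feed used in the article.

```php
<?php
// Hypothetical helper: returns a [name, attributes] pair for every opening tag.
function recordOpeningTags(string $xml, bool $caseFolding): array
{
    $seen = [];

    $xp = xml_parser_create();
    xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_set_element_handler(
        $xp,
        function ($parser, $name, $attributes) use (&$seen) {
            $seen[] = [$name, $attributes];
        },
        function ($parser, $name) {
            // closing tags are ignored in this sketch
        }
    );

    if (!xml_parse($xp, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($xp)));
    }
    return $seen;
}

// made-up sample markup
$sample = '<item id="42"><title>One</title></item>';
print_r(recordOpeningTags($sample, true));   // names and attribute keys folded to uppercase
print_r(recordOpeningTags($sample, false));  // names and attribute keys as written
```

Note that case folding applies to attribute names as well as element names, so a handler that reads $attributes has to use the same case as the comparison on $name.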
Conversely, when a closing tag is found, the closing tag handler elementEnd() is invoked.
// closing tag handler
function elementEnd($parser, $name)
{
    global $currentTag, $flag;
    $currentTag = "";
    // if exiting an item block, print a line and reset the flag
    if ($name == "ITEM")
    {
        echo "<hr>";
        $flag = 0;
    }
}
This closing tag handler also receives the tag name as a parameter. The value of $currentTag is cleared on every closing tag; if the tag is a closing </item> tag, a horizontal rule is printed and the value of $flag is reset to 0.
Now, what about the data between the tags, which is what we're really interested in? Say hello to the character data handler, characterData().
// character data handler
function characterData($parser, $data)
{
    global $currentTag, $flag;
    // if within an item block, print item data
    if (($currentTag == "TITLE" || $currentTag == "LINK" ||
        $currentTag == "DESCRIPTION") && $flag == 1)
    {
        echo "$currentTag: $data <br>";
    }
}
Now, if you look at the arguments passed to this function, you'll see that characterData() only receives the data between the opening and closing tag - it has no idea which particular tag the parser is currently processing. That is why we needed the global $currentTag variable in the first place (told you this would make sense eventually!)
If the value of $flag is 1 - in other words, if the parser is currently within an <item>...</item> block - and if the element currently being processed is either a <title>, <link> or <description> element, then the data is printed to the output device (in this case, the Web browser), followed by a line break.
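One thing to keep in mind is that the parser may deliver the text of a single element in several pieces (for example, around an entity such as &amp;amp;), in which case the handler above would print the tag name more than once. The sketch below is one possible variation, not the article's script: it uses the same $currentTag and flag idea, but appends the pieces to a buffer and collects each item into an array, so that printing (and escaping with htmlspecialchars()) happens afterwards. The function name collectItems() and the sample feed are assumptions made for the example.

```php
<?php
// Hypothetical helper: returns one array per <item>, keyed by tag name.
function collectItems(string $xml): array
{
    $items = [];
    $current = null;   // fields of the item being read; null outside <item>
    $tag = "";         // name of the element currently being processed
    $wanted = ["TITLE", "LINK", "DESCRIPTION"];

    $xp = xml_parser_create();
    xml_set_element_handler(
        $xp,
        function ($parser, $name, $attributes) use (&$current, &$tag) {
            $tag = $name;
            if ($name === "ITEM") {
                $current = [];          // entering an item block
            }
        },
        function ($parser, $name) use (&$items, &$current, &$tag) {
            $tag = "";
            if ($name === "ITEM") {
                $items[] = $current;    // leaving an item block
                $current = null;
            }
        }
    );
    xml_set_character_data_handler(
        $xp,
        function ($parser, $data) use (&$current, &$tag, $wanted) {
            // append, because one element's text may arrive in several calls
            if ($current !== null && in_array($tag, $wanted, true)) {
                $current[$tag] = ($current[$tag] ?? "") . $data;
            }
        }
    );

    if (!xml_parse($xp, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($xp)));
    }
    return $items;
}

// made-up sample feed
$sample = '<rdf><channel><title>Feed</title></channel>'
        . '<item><title>Fish &amp; Chips 1.0</title>'
        . '<link>http://example.com/1</link></item></rdf>';

foreach (collectItems($sample) as $item) {
    foreach ($item as $field => $value) {
        echo $field . ": " . htmlspecialchars(trim($value)) . " <br>\n";
    }
    echo "<hr>\n";
}
```

Separating collection from output also makes it easier to sort, limit or cache the items before they are displayed.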
The entire RDF document will be processed in this sequential manner, with output appearing every time an item is found. Here's what you'll see when you run the script: