Page 4 - Plugging RDF Content Into Your Web Site With PHP
Fresh Meat - PHP
A Web site which dynamically updates itself with the latest news and information? Nope, it's not as far-fetched as it sounds. As this article demonstrates, all you need is a little imagination, a couple of free RDF files and some PHP glue.
Since RSS is, technically, well-formed XML, it can be processed using standard XML programming techniques. There are two primary techniques here: SAX (the Simple API for XML) and DOM (the Document Object Model).
A SAX parser works by traversing an XML document and calling specific functions as it encounters different types of tags. So, for example, I might call a specific function to process a starting tag, another function to process an ending tag, and a third function to process the data between them. The parser's responsibility is simply to sequentially traverse the document; the functions it calls are responsible for processing the tags found. Once the tag is processed, the parser moves on to the next element in the document, and the process repeats itself.
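To make the event-driven flow a little more concrete, here's a minimal sketch that simply logs each event the parser reports. The sample XML string and the $events array are my own inventions for illustration; only the xml_* functions come from PHP itself.

```php
<?php
// Hypothetical sample document, not taken from any real feed.
$xml = '<item><title>Example release</title></item>';

// Log of the events the parser reports, in document order.
$events = [];

$parser = xml_parser_create();
// Keep tag names as written instead of folding them to upper case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// One callback for opening tags, one for closing tags.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attributes) use (&$events) {
        $events[] = "start:$name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "end:$name";
    }
);

// A third callback for the text between tags.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        $events[] = "data:$data";
    }
);

if (!xml_parse($parser, $xml, true)) {
    die("XML parser error: " . xml_error_string(xml_get_error_code($parser)));
}

echo implode("\n", $events), "\n";
```

The log begins with the start events for item and title, then the text, and finishes with the matching end events in reverse order. Note that the parser never builds a picture of the whole document; each callback only sees the piece it is handed.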
A DOM parser, on the other hand, works by reading the entire XML document into memory, converting it into a hierarchical tree structure, and then providing an API to access different tree nodes (and the content attached to each). Recursive processing, together with API functions that allow developers to distinguish between different types of nodes (element, attribute, character data, comment and so on), makes it possible to perform different actions depending on both node type and node depth within the tree.
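As a rough illustration of the tree-based approach, the sketch below loads a small document with PHP's DOMDocument class and walks it recursively, reporting each node's type and depth. It assumes the DOM extension is available; the sample XML and the describeNode() function are hypothetical names of my own.

```php
<?php
// Hypothetical sample document, not taken from any real feed.
$xml = '<item id="1"><title>Example release</title><!--note--></item>';

// Read the whole document into memory as a tree.
$doc = new DOMDocument();
if (!$doc->loadXML($xml)) {
    die("Could not parse XML");
}

// Recursively visit a node and its children, recording type and depth.
function describeNode(DOMNode $node, int $depth, array &$lines): void
{
    $indent = str_repeat("  ", $depth);

    switch ($node->nodeType) {
        case XML_ELEMENT_NODE:
            $lines[] = "{$indent}element: {$node->nodeName}";
            foreach ($node->attributes as $attr) {
                $lines[] = "{$indent}  attribute: {$attr->nodeName}={$attr->nodeValue}";
            }
            break;
        case XML_TEXT_NODE:
            $lines[] = "{$indent}text: {$node->nodeValue}";
            break;
        case XML_COMMENT_NODE:
            $lines[] = "{$indent}comment: {$node->nodeValue}";
            break;
    }

    // Descend one level for every child node.
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            describeNode($child, $depth + 1, $lines);
        }
    }
}

$lines = [];
describeNode($doc->documentElement, 0, $lines);
echo implode("\n", $lines), "\n";
```

Because the whole tree is in memory, the function can branch on node type and indent by depth, which is exactly the kind of processing a SAX parser leaves up to you.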
SAX and DOM parsers are available for almost every language, including your favourite and mine, PHP. I'll be using PHP's SAX parser to process the RDF examples in this article; however, it's just as easy to perform equivalent functions using the DOM parser.
Let's look at a simple example to put all this in perspective. Here's the RDF file I'll be using, culled directly from http://www.freshmeat.net/ :
<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
>
  <channel rdf:about="http://freshmeat.net/">
    <title>freshmeat.net</title>
    <link>http://freshmeat.net/</link>
    <description>freshmeat.net maintains the Web's largest index of Unix
    and cross-platform open source software. Thousands of applications are
    meticulously cataloged in the freshmeat.net database, and links to new
    code are added daily.</description>
    <dc:language>en-us</dc:language>
    <dc:subject>Technology</dc:subject>
    <dc:publisher>freshmeat.net</dc:publisher>
    <dc:creator>freshmeat.net contributors</dc:creator>
    <dc:rights>Copyright (c) 1997-2002 OSDN</dc:rights>
    <dc:date>2002-02-11T10:20+00:00</dc:date>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://freshmeat.net/releases/69583/" />
        <rdf:li rdf:resource="http://freshmeat.net/releases/69581/" />
        <!-- and so on -->
      </rdf:Seq>
    </items>
    <image rdf:resource="http://freshmeat.net/img/fmII-button.gif" />
    <textinput rdf:resource="http://freshmeat.net/search/" />
  </channel>
  <image rdf:about="http://freshmeat.net/img/fmII-button.gif">
    <title>freshmeat.net</title>
    <url>http://freshmeat.net/img/fmII-button.gif</url>
    <link>http://freshmeat.net/</link>
  </image>
  <item rdf:about="http://freshmeat.net/releases/69583/">
    <title>sloop.splitter 0.2.1</title>
    <link>http://freshmeat.net/releases/69583/</link>
    <description>A real time sound effects program.</description>
    <dc:date>2002-02-11T04:52-06:00</dc:date>
  </item>
  <item rdf:about="http://freshmeat.net/releases/69581/">
    <title>apacompile 1.9.9</title>
    <link>http://freshmeat.net/releases/69581/</link>
    <description>A full-featured Apache compilation HOWTO.</description>
    <dc:date>2002-02-11T04:52-06:00</dc:date>
  </item>
  <!-- and so on -->
</rdf:RDF>
And here's the PHP script to parse it and display the data within it:
<?php
// XML file
$file = "fm-releases.rdf";

// set up some variables for use by the parser
$currentTag = "";
$flag = "";

// create parser
$xp = xml_parser_create();

// set element handler
xml_set_element_handler($xp, "elementBegin", "elementEnd");
xml_set_character_data_handler($xp, "characterData");
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, TRUE);

// read XML file
if (!($fp = fopen($file, "r")))
{
    die("Could not read $file");
}

// parse data
while ($xml = fread($fp, 4096))
{
    if (!xml_parse($xp, $xml, feof($fp)))
    {
        die("XML parser error: " .
            xml_error_string(xml_get_error_code($xp)));
    }
}

// destroy parser
xml_parser_free($xp);

// opening tag handler
function elementBegin($parser, $name, $attributes)
{
    global $currentTag, $flag;
    // export the name of the current tag to the global scope
    $currentTag = $name;
    // if within an item block, set a flag
    if ($name == "ITEM")
    {
        $flag = 1;
    }
}

// closing tag handler
function elementEnd($parser, $name)
{
    global $currentTag, $flag;
    $currentTag = "";
    // if exiting an item block, print a line and reset the flag
    if ($name == "ITEM")
    {
        echo "<hr>";
        $flag = 0;
    }
}

// character data handler
function characterData($parser, $data)
{
    global $currentTag, $flag;
    // if within an item block, print item data
    if (($currentTag == "TITLE" || $currentTag == "LINK" ||
         $currentTag == "DESCRIPTION") && $flag == 1)
    {
        echo "$currentTag: $data <br>";
    }
}
?>
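One caveat worth keeping in mind: the parser may hand the text of a single element to the character data handler in several pieces (an entity such as &amp; is a common trigger), so printing inside that handler can split one title across multiple lines. Here's a hedged sketch of one way around this, which buffers the text and stores it when the closing tag arrives. It is a variation of my own rather than the script above; the inline sample XML, the $items, $current and $buffer variables, and the htmlspecialchars() escaping are all assumptions added for illustration.

```php
<?php
// Hypothetical inline sample; the title contains an entity on purpose.
$xml = '<rdf><item><title>Tom &amp; Jerry 1.0</title>'
     . '<link>http://example.com/1</link></item></rdf>';

$items = [];      // finished items
$current = null;  // item being built, or null outside an ITEM block
$buffer = "";     // text collected for the element being read

$xp = xml_parser_create();
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, 1);

xml_set_element_handler(
    $xp,
    // opening tag: start a new item if needed, and clear the buffer
    function ($parser, $name, $attributes) use (&$current, &$buffer) {
        if ($name == "ITEM") {
            $current = [];
        }
        $buffer = "";
    },
    // closing tag: store the buffered text, or finish the item
    function ($parser, $name) use (&$items, &$current, &$buffer) {
        if ($name == "ITEM") {
            $items[] = $current;
            $current = null;
        } elseif ($current !== null) {
            $current[$name] = trim($buffer);
        }
        $buffer = "";
    }
);

xml_set_character_data_handler(
    $xp,
    // may be called several times per element, so append rather than print
    function ($parser, $data) use (&$buffer) {
        $buffer .= $data;
    }
);

if (!xml_parse($xp, $xml, true)) {
    die("XML parser error: " . xml_error_string(xml_get_error_code($xp)));
}

foreach ($items as $item) {
    echo htmlspecialchars($item["TITLE"]), " (",
         htmlspecialchars($item["LINK"]), ")<br>\n";
}
```

Collecting the items into an array first also keeps parsing and display separate, which makes it easier to change the HTML later without touching the handlers.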
Don't get it yet? The next page has an explanation.