Web Services: SimpleXML

In this third part of a five-part series on Web Services, we’ll wrap up our discussion of MagpieRSS and move on to SimpleXML, an intuitive methodology for processing XML structures. This article is excerpted from chapter 20 of the book Beginning PHP and Oracle: From Novice to Professional, written by W. Jason Gilmore and Bob Bryla (Apress; ISBN: 1590597702).

Limiting the Number of Displayed Headlines

Some Web site developers are so keen on RSS that they wind up dumping quite a bit of information into their published feeds. However, you might be interested in viewing only the most recent items and ignoring the rest. Because Magpie manages RSS data with standard PHP arrays and objects, limiting the number of headlines is trivial: one of PHP’s native array functions, array_slice(), does the job quite nicely. For example, suppose you want to limit the headlines displayed for a given feed to three. You can use array_slice() to truncate the items array prior to iteration, like so:

$rss->items = array_slice($rss->items, 0, 3);
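The following is a minimal, self-contained sketch of that truncation. It does not load Magpie at all: the $rss object and its five invented items are stand-ins for what a real fetch would return, and it assumes the feed lists its newest items first.

```php
<?php
// Stand-in for the object a Magpie fetch would return; all item data is invented.
$rss = new stdClass();
$rss->items = array(
    array('title' => 'Headline 1', 'link' => 'http://example.com/1'),
    array('title' => 'Headline 2', 'link' => 'http://example.com/2'),
    array('title' => 'Headline 3', 'link' => 'http://example.com/3'),
    array('title' => 'Headline 4', 'link' => 'http://example.com/4'),
    array('title' => 'Headline 5', 'link' => 'http://example.com/5'),
);

// Keep only the first three items (offset 0, length 3).
$rss->items = array_slice($rss->items, 0, 3);

// Iterate over the truncated list exactly as you would over the full one.
foreach ($rss->items as $item) {
    printf("%s (%s)\n", $item['title'], $item['link']);
}
```

Because array_slice() returns a new array rather than modifying its argument, the result has to be assigned back to $rss->items, as the one-line example does.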

Caching Feeds

One final topic to discuss regarding Magpie is its caching feature. By default, Magpie caches feeds for 60 minutes, on the premise that the typical feed will likely not be updated more than once per hour. Therefore, even if you constantly attempt to retrieve the same feeds, say once every 5 minutes, any updates will not appear until the cached feed is at least 60 minutes old. However, some feeds are published more than once an hour, or the feed might be used to publish somewhat more pressing information. (RSS feeds don’t necessarily have to be used for browsing news headlines; you could use them to publish information about system health, logs, or any other data that could be adapted to its structure. It’s also possible to extend RSS as of version 2.0, but this matter is beyond the scope of this book.) In such cases, you may want to consider modifying the default behavior.

To completely disable caching, set the constant MAGPIE_CACHE_ON to 0, like so:

define('MAGPIE_CACHE_ON', 0);

To change the default cache time (measured in seconds), you can modify the constant MAGPIE_CACHE_AGE, like so:

define('MAGPIE_CACHE_AGE', 1800);

Finally, you can opt to display an error instead of a cached feed when the fetch fails by enabling the constant MAGPIE_CACHE_FRESH_ONLY:

define('MAGPIE_CACHE_FRESH_ONLY', 1);

You can also change the default cache location (by default, a directory named cache relative to the executing script) by modifying the MAGPIE_CACHE_DIR constant:

define('MAGPIE_CACHE_DIR', '/tmp/magpiecache/');
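Because a PHP constant cannot be redefined once it is set, overrides like these have to run before Magpie assigns its own defaults. The sketch below groups them at the top of a script. The include path and feed URL are hypothetical and left commented out, and the ordering assumes Magpie only applies a default to a constant that is still undefined.

```php
<?php
// Overrides go first: PHP constants cannot be changed after they are defined.
define('MAGPIE_CACHE_DIR', '/tmp/magpiecache/'); // where cached feeds are written
define('MAGPIE_CACHE_AGE', 1800);                // 30 minutes, expressed in seconds
define('MAGPIE_CACHE_FRESH_ONLY', 1);            // report an error rather than serve a stale feed

// Hypothetical include path and feed URL; adjust both to the local installation.
// require_once 'magpierss/rss_fetch.inc';
// $rss = fetch_rss('http://example.com/feed.xml');

printf("Feeds are cached in %s for %d seconds\n", MAGPIE_CACHE_DIR, MAGPIE_CACHE_AGE);
```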

SimpleXML

Everyone agrees that XML signifies an enormous leap forward in data management and application interoperability. Yet how come it’s so darned hard to parse? Although powerful parsing solutions are readily available (DOM, SAX, and XSLT, to name a few), each presents a learning curve that is just steep enough to cause considerable gnashing of teeth among users who want to take advantage of XML’s practicalities without an impractical time investment. Leave it to an enterprising PHP developer (namely, Sterling Hughes) to devise a graceful solution. SimpleXML offers users a very practical and intuitive methodology for processing XML structures and is enabled by default as of PHP 5. Parsing even complex structures becomes a trivial task, accomplished by loading the document into an object and then accessing the nodes using field references, as you would in typical object-oriented fashion.

The XML document displayed in Listing 20-4 is used to illustrate the examples offered in this section.

Listing 20-4. A Simple XML Document

<?xml version="1.0" standalone="yes"?>
<library>
   <book>
      <title>Pride and Prejudice</title>
      <author gender="female">Jane Austen</author>
      <description>Jane Austen's most popular work.</description>
   </book>
   <book>
      <title>The Conformist</title>
      <author gender="male">Alberto Moravia</author>
      <description>Alberto Moravia's classic psychological novel.</description>
   </book>
   <book>
      <title>The Sun Also Rises</title>
      <author gender="male">Ernest Hemingway</author>
      <description>The masterpiece that launched Hemingway's
      career.</description>
   </book>
</library>

Loading XML

A number of SimpleXML functions are available for loading and parsing the XML document. These functions are introduced in this section, along with several accompanying examples.


Note: To take advantage of SimpleXML on PHP 5 releases that still provide the zend.ze1_compatibility_mode directive (it was removed in PHP 5.3), you need to make sure that directive is disabled.


Loading XML from a File

The simplexml_load_file() function loads an XML file into an object. Its prototype follows:

object simplexml_load_file(string filename [, string class_name])

If a problem is encountered loading the file, FALSE is returned. If the optional class_name parameter is included, an object of that class will be returned. Of course, class_name should extend the SimpleXMLElement class. Consider an example:

<?php
   $xml = simplexml_load_file("books.xml");
   var_dump($xml);
?>

This code returns the following:

object(SimpleXMLElement)#1 (1) {
  ["book"]=>
  array(3) {
    [0]=>
    object(SimpleXMLElement)#2 (3) {
      ["title"]=>
      string(19) "Pride and Prejudice"
      ["author"]=>
      string(11) "Jane Austen"
      ["description"]=>
      string(32) "Jane Austen's most popular work."
    }
    [1]=>
    object(SimpleXMLElement)#3 (3) {
      ["title"]=>
      string(14) "The Conformist"
      ["author"]=>
      string(15) "Alberto Moravia"
      ["description"]=>
      string(46) "Alberto Moravia's classic psychological novel."
    }
    [2]=>
    object(SimpleXMLElement)#4 (3) {
      ["title"]=>
      string(18) "The Sun Also Rises"
      ["author"]=>
      string(16) "Ernest Hemingway"
      ["description"]=>
      string(55) "The masterpiece that launched Hemingway's
      career."
    }
  }
}

Note that dumping the XML will not cause the attributes to show. To view attributes, you need to use the attributes() method, introduced later in this section.
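Since simplexml_load_file() returns FALSE on failure, it is worth checking the result before touching any nodes. The sketch below is one way to do that. To stay runnable anywhere, it writes a one-book document to a temporary file instead of assuming books.xml exists, and the use of libxml_use_internal_errors() to collect parse errors is a choice made here, not something the examples above depend on.

```php
<?php
// Collect libxml errors for later inspection instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Write a one-book document to a temporary file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'books');
file_put_contents(
    $path,
    '<library><book><title>Pride and Prejudice</title>'
    . '<author gender="female">Jane Austen</author></book></library>'
);

$xml = simplexml_load_file($path);

if ($xml === false) {
    // Loading failed: report whatever libxml recorded.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
} else {
    // Child elements are reached as object properties; cast to string for the text.
    $title  = (string) $xml->book[0]->title;
    $author = (string) $xml->book[0]->author;
    printf("%s by %s\n", $title, $author);
}

// Remove the temporary file.
unlink($path);
```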

Loading XML from a String

If the XML document is stored in a variable, you can use the simplexml_load_string() function to read it into the object. Its prototype follows:

object simplexml_load_string(string data)

This function is identical in purpose to simplexml_load_file(), except that the lone input parameter is expected in the form of a string rather than a file name.
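As a short sketch, the following parses a one-book document held in a heredoc string and collects its titles. The variable names and the trimmed-down document are chosen for illustration only.

```php
<?php
// A one-book document held in a variable; the content mirrors the sample library.
$data = <<<XML
<library>
   <book>
      <title>The Conformist</title>
      <author gender="male">Alberto Moravia</author>
   </book>
</library>
XML;

$xml = simplexml_load_string($data);

$titles = array();
if ($xml !== false) {
    // $xml->book can be iterated even when only one <book> element exists.
    foreach ($xml->book as $book) {
        $titles[] = (string) $book->title;
    }
}

echo implode(", ", $titles), "\n";
```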

Loading XML from a DOM

The Document Object Model (DOM) is a W3C specification that offers a standardized API for creating an XML document, and subsequently navigating, adding, modifying, and deleting its elements. PHP’s DOM extension manages XML documents using this standard. You can use the simplexml_import_dom() function to convert a node of a DOM document into a SimpleXML node, and then manipulate that node with the SimpleXML functions. Its prototype follows:

object simplexml_import_dom(DOMNode node)
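A minimal sketch of the conversion follows. It builds a small DOM document with the DOMDocument class and hands its root element to simplexml_import_dom(). The document content is invented for illustration.

```php
<?php
// Build a DOM document with PHP's DOM extension.
$dom = new DOMDocument();
$dom->loadXML(
    '<library><book><title>The Sun Also Rises</title>'
    . '<author gender="male">Ernest Hemingway</author></book></library>'
);

// Convert the root element (a DOMNode) into a SimpleXML node.
$xml = simplexml_import_dom($dom->documentElement);

// From here on, the usual SimpleXML property syntax applies.
$title = (string) $xml->book[0]->title;
echo $title, "\n";
```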

Parsing the XML

Once an XML document has been loaded into an object, several methods are at your disposal. Presently, four methods are available, each of which is introduced in this section.

Learning More About an Element

XML attributes provide additional information about an XML element. In the sample XML document in Listing 20-4, only the author node possesses an attribute, namely gender, used to offer information about the author’s gender. You can use the attributes() method to retrieve these attributes. Its prototype follows:

object simplexml_element->attributes()

For example, suppose you want to retrieve the gender of each author:

<?php
   $xml = simplexml_load_file("books.xml");
   foreach($xml->book as $book) {
      printf("%s is %s. <br />", $book->author, $book->author->attributes());
   }
?>

This example returns the following:

——————————————–
Jane Austen is female.
Alberto Moravia is male.
Ernest Hemingway is male.
——————————————–

You can also directly reference a particular book author’s gender. For example, suppose you want to determine the gender of the author of the second book in the XML document:

echo $xml->book[1]->author->attributes();

This example returns the following:

——————————————–
male
——————————————–

Often a node possesses more than one attribute. For example, suppose the author node looks like this:

<author gender="female" age="20">Jane Austen</author>

It’s easy to output the attributes with a foreach loop:

foreach($xml->book[0]->author->attributes() AS $a => $b) {
    printf("%s = %s <br />", $a, $b);
}

This example returns the following:

——————————————–
gender = female
age = 20
——————————————–
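When you already know an attribute’s name, SimpleXML also lets you read it with array-style syntax on the element, which avoids looping. The sketch below assumes the two-attribute author node shown above, and the variable names are illustrative.

```php
<?php
// The two-attribute author node, wrapped in a minimal library document.
$xml = simplexml_load_string(
    '<library><book>'
    . '<author gender="female" age="20">Jane Austen</author>'
    . '</book></library>'
);

$author = $xml->book[0]->author;

// Array-style access returns the named attribute; cast it to a native type.
$gender = (string) $author['gender'];
$age    = (int) $author['age'];

printf("%s is %s and %d years old.\n", $author, $gender, $age);
```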

Please check back next week for the continuation of this article.
