Working with the Tidy Library in PHP 5 (Page 2)

Parsing (X)HTML strings

As a PHP developer, you've probably built database-driven applications that deliver their content to the end user as (X)HTML. If so, you know that when you work directly with hard-coded (X)HTML files, it's easy to forget closing tags or DTD headers, which makes the process annoying and time-consuming. Keep reading; help is on the way.

TABLE OF CONTENTS:
  1. Working with the Tidy Library in PHP 5
  2. Parsing (X)HTML strings
  3. Implementing the tidy_clean_repair() function
  4. Using the tidy_parse_file() and tidy_repair_file() functions
By: Alejandro Gervasio
June 26, 2007


As I stated at the beginning of this article, the Tidy library is really useful when a section of (X)HTML has been badly formatted and needs to be fixed quickly.

To perform this correction, Tidy comes with a neat set of format-cleaning functions, starting with "tidy_parse_string()", whose use is demonstrated by the example below:

<?php
// example of 'tidy_parse_string()' function
ob_start();
?>
<html>
  <head>
   <title>This file will be parsed by Tidy</title>
  </head>
  <body>
   <p>This is an erroneous line
   <p>This is another erroneous line</i>
  </body>
</html>
<?php
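// grab the buffered markup and discard the buffer
$fileContents=ob_get_clean();
// Tidy options: indent the output, emit XHTML, wrap lines at 200 characters
$params=array('indent'=>TRUE,'output-xhtml'=>TRUE,'wrap'=>200);
// parse the string; the third argument is the character encoding
$tidy=tidy_parse_string($fileContents,$params,'utf8');
// clean and repair the parsed markup
$tidy->cleanRepair();
// echoing the tidy object outputs the repaired markup
echo $tidy;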

/* displays the following:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>
        This file will be parsed by Tidy
    </title>
  </head>
  <body>
    <p>This is an erroneous line</p>
    <p>This is another erroneous line</p>
  </body>
</html>
*/

There are a few important things to note about the above example. First, I used the "tidy_parse_string()" function along with a few simple configuration options ("indent", "output-xhtml" and "wrap") to parse an (X)HTML string. In this case, the string was captured with an output buffer and then parsed. This can be easily changed, however; for instance, the markup could be read from a file via a native PHP function.
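The idea of reading the markup via a native PHP function, rather than an output buffer, can be sketched as follows. This is only a minimal sketch, assuming the Tidy extension is enabled; the temporary file and the variable names ($path, $markup, $repaired) are illustrative and not part of the original example.

```php
<?php
// Hypothetical input file, created here only so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'tidy_');
file_put_contents(
    $path,
    "<html><head><title>Read from a file</title></head>"
    . "<body><p>This is an erroneous line<p>This is another erroneous line</i></body></html>"
);

// Read the markup with a native PHP function instead of an output buffer.
$markup = file_get_contents($path);
unlink($path);

// Same configuration options as in the example above.
$params = array('indent' => true, 'output-xhtml' => true, 'wrap' => 200);

// Parse and repair the string.
$tidy = tidy_parse_string($markup, $params, 'utf8');
$tidy->cleanRepair();

// Casting the tidy object to a string yields the repaired markup.
$repaired = (string) $tidy;
echo $repaired;
```

Only the source of the string changes here; the parsing and repairing steps stay the same as in the buffered version.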

Also, notice that the "tidy_parse_string()" function returns a new "tidy" object, which has a number of methods and properties that are useful for a variety of tasks, including the correction of missing and erroneous tags. The previous example repairs the (X)HTML string via the "cleanRepair()" method.
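Besides repairing markup, the returned object can also report what Tidy found while parsing. The following is a hedged sketch, assuming the Tidy extension is enabled; it uses the extension's "tidy_error_count()" and "tidy_warning_count()" functions and the object's "errorBuffer" property, while the sample markup and variable names are illustrative only.

```php
<?php
// Illustrative markup with two unclosed <p> tags and a stray </i>.
$markup = "<html><head><title>Diagnostics</title></head>"
    . "<body><p>This is an erroneous line<p>This is another erroneous line</i></body></html>";

$tidy = tidy_parse_string($markup, array('output-xhtml' => true), 'utf8');

// Number of errors and warnings Tidy recorded while parsing.
$errors   = tidy_error_count($tidy);
$warnings = tidy_warning_count($tidy);

// errorBuffer holds the human-readable diagnostic messages.
$messages = (string) $tidy->errorBuffer;

// Repair the markup after inspecting the diagnostics.
$tidy->cleanRepair();

printf("%d error(s), %d warning(s)\n", $errors, $warnings);
echo $messages, "\n";
```

The exact wording and number of messages depend on the Tidy version installed, so it is safer to log them than to match on their text.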

And finally, you can see that the sample string has been fixed, not only by closing its unclosed <p> tags and discarding the stray </i> tag, but also by adding a DTD declaration at the top. After studying the previous code sample, you'll have to agree with me that using the Tidy extension with PHP 5 is indeed a no-brainer, right?

Okay, at this point I've shown you how to use the "tidy_parse_string()" function to parse and correctly format some basic (X)HTML markup. However, Tidy has another function called "tidy_clean_repair()", which, as you'll see in a moment, can also be helpful for repairing badly formatted (X)HTML strings.
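As a quick preview, "tidy_clean_repair()" is the procedural counterpart of the "cleanRepair()" method used earlier. The sketch below assumes the Tidy extension is enabled and reads the result back with "tidy_get_output()"; the markup fragment and variable names are illustrative only.

```php
<?php
// Illustrative fragment with two unclosed <p> tags and a stray </i>.
$markup = "<p>This is an erroneous line<p>This is another erroneous line</i>";

// Parse the string into a tidy object, as before.
$tidy = tidy_parse_string($markup, array('output-xhtml' => true), 'utf8');

// Procedural form of $tidy->cleanRepair().
$ok = tidy_clean_repair($tidy);

// Procedural way to fetch the repaired markup as a string.
$output = tidy_get_output($tidy);
echo $output;
```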

To learn how this Tidy function is used, please continue to the next section.



 
 
© 2003-2012 by Developer Shed. All rights reserved.