Page 4 - Working with the Tidy Library in PHP 5
Using the tidy_parse_file() and tidy_repair_file() functions
As a PHP developer, you've probably built database-driven applications that deliver their contents to the end user in (X)HTML format. If so, you know that when you work directly with hard-coded (X)HTML files, it's easy to leave tags unclosed or omit DTD declarations, and tracking down those mistakes is annoying and time-consuming. Keep reading; help is on the way.
As you saw in the prior section, the Tidy extension comes equipped with a set of functions for parsing and fixing (X)HTML strings. In a similar fashion, this powerful PHP package also includes a pair of additional functions, called "tidy_parse_file()" and "tidy_repair_file()" respectively, which parse and correct badly-formatted (X)HTML files.
Even though the difference between parsing files and parsing plain strings may seem subtle in theory, it can be quite relevant in practice, particularly when your PHP applications need to deal with (X)HTML markup stored in text files.
Having clarified this point, take a look at the following pair of code samples, which demonstrate a simple implementation of the aforementioned Tidy functions. Here's how the examples look:
// example of 'tidy_parse_file()' function
// definition of (target_file.htm)
<html>
<head>
<title>This file will be parsed by Tidy</title>
</head>
<body>
<p>This is an erroneous line
<p>This is another erroneous line</i>
</body>
</html>
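// example of 'tidy_repair_file()' function
$brokenFile='target_file.htm';
// read and repair the file in one step; returns false on failure
$fixedFile=tidy_repair_file($brokenFile);
if($fixedFile===false){
  // stop here so the original file is not overwritten with nothing
  trigger_error('Error repairing target file',E_USER_ERROR);
}
// overwrite the original file with the repaired markup
if(file_put_contents($brokenFile,$fixedFile)===false){
  trigger_error('Error putting fixed contents on target file',E_USER_ERROR);
}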
The first example uses the "tidy_parse_file()" function to parse the markup of a specific (X)HTML file, which is then corrected via the already familiar "cleanRepair()" method that you learned in a previous section.
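To make the parse-then-repair sequence concrete, here is a minimal sketch of how it might look in PHP. It assumes the tidy extension is loaded and that "target_file.htm" is readable. The helper name repairParsedFile() is hypothetical, used only for illustration, and is not part of Tidy.

```php
<?php
// Sketch only: assumes the tidy extension is loaded.
// repairParsedFile() is a hypothetical helper, not a Tidy function.

function repairParsedFile($path)
{
    // bail out early if the extension or the file is unavailable
    if (!extension_loaded('tidy') || !is_readable($path)) {
        return null;
    }

    // tidy_parse_file() returns a tidy object, or false on failure
    $tidy = tidy_parse_file($path);
    if ($tidy === false) {
        return null;
    }

    // cleanRepair() fixes the parsed markup held by the tidy object
    $tidy->cleanRepair();

    // tidy_get_output() returns the repaired document as a string
    return tidy_get_output($tidy);
}

$fixed = repairParsedFile('target_file.htm');
if ($fixed !== null) {
    echo $fixed;
}
```

Returning null instead of raising an error is just one possible design choice; it keeps the sketch harmless when the file or the extension is missing.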
The second case illustrates a simple implementation of the "tidy_repair_file()" function, which comes in handy for reading and fixing the contents of an (X)HTML file in a single step.
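As a variation on this one-step approach, "tidy_repair_file()" also accepts an optional configuration array and an encoding. The following sketch assumes the tidy extension is loaded. The helper name repairFileTo(), the chosen option values, and the output file name "target_file.fixed.htm" are illustrative assumptions and do not come from the original example.

```php
<?php
// Sketch only: assumes the tidy extension is loaded.
// repairFileTo() and the option values are illustrative choices.

function repairFileTo($source, $destination)
{
    if (!extension_loaded('tidy') || !is_readable($source)) {
        return false;
    }

    // Tidy configuration options, passed as an associative array
    $config = array(
        'output-xhtml' => true, // emit XHTML rather than HTML
        'indent'       => true, // indent block-level elements
        'wrap'         => 0,    // do not wrap long lines
    );

    // read and repair the file in one call; false signals a failure
    $fixed = tidy_repair_file($source, $config, 'utf8');
    if ($fixed === false) {
        return false;
    }

    // write to a separate file so the original stays untouched
    return file_put_contents($destination, $fixed) !== false;
}

$ok = repairFileTo('target_file.htm', 'target_file.fixed.htm');
echo $ok ? "Repaired copy written\n" : "Nothing written\n";
```

Writing the result to a second file, instead of overwriting the source, makes it possible to compare the two versions before replacing anything.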
At this stage, after analyzing the pair of hands-on examples listed above, you should have a much better idea of how to use, at least at a basic level, some of the most useful functions that come with the Tidy extension in PHP 5.
Of course, this tutorial is intended to be a simple introduction to the main features provided by this excellent PHP 5 package. If you're searching for a more detailed reference on Tidy's functions and properties, the official PHP site is the best place to look.
Final thoughts
Unfortunately, we've come to the end of this first article of the series. In this tutorial, I walked you through the basic concepts of using the Tidy extension with PHP 5. Nevertheless, this instructive journey is just beginning, since there are many other useful Tidy functions still to be reviewed, so you can acquire a more solid background in this helpful PHP extension.
In the next installment of the series I'm going to teach you how to use some additional functions included with the Tidy library for extracting specific nodes of a given (X)HTML document.
Now that you know what to expect from the next part, you won't want to miss it!