Parsing Web Document Nodes with the Tidy Library in PHP 5

Writing well-formatted (X)HTML code for the presentation layer of a PHP application can be an annoying and time-consuming process for many web developers. However, the Tidy extension bundled with PHP 5 can turn this tedious task into a pleasant experience. Keep reading to learn how.

  1. Parsing Web Document Nodes with the Tidy Library in PHP 5
  2. Parsing and formatting basic (X)HTML code with Tidy
  3. Using the tidy_get_html() and tidy_get_head() functions
  4. Using the tidy_get_body() and tidy_get_output() functions
By: Alejandro Gervasio
July 03, 2007

Welcome to the second tutorial of the series that began with "Working with the Tidy Library in PHP 5." Made up of three instructive articles, this series steps you through using the most important functions bundled with this powerful library, and complements the corresponding theory with illustrative hands-on examples.

If you've already read the first installment of the series, the Tidy extension probably feels familiar by now, since its remarkable capacity for parsing and formatting (X)HTML markup comes with a very gentle learning curve. True to form, Tidy comes equipped with a decent arsenal of functions (or methods and properties, if you're using the object-oriented syntax) that allow you to correct the format of a web document in a few simple steps.

Speaking of simple tasks, you'll surely recall that in the first article of the series I discussed how to parse and format several basic (X)HTML documents using some straightforward functions bundled with this library, such as "tidy_parse_file()," "tidy_repair_file()" and "tidy_parse_string()."

As you learned in that tutorial, repairing badly formatted web documents is an effortless process with the assistance of the Tidy extension. Since Tidy has much more to offer when it comes to parsing and fixing (X)HTML code, in this second article of the series I'm going to discuss how to extract different sections of a specific (X)HTML document (called file nodes), such as its root, head and body, by using some additional functions included with this library.
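To give you a rough idea of what extracting nodes looks like in practice, here's a minimal sketch. It assumes the Tidy extension is enabled in your PHP build. The sample markup, the "indent" option and the variable names are my own choices for illustration, not code from this series.

```php
<?php
// Sketch only: requires the Tidy extension (ext-tidy) to be enabled.
// The markup in $html is a made-up sample used purely for illustration.
$html = '<html><head><title>Sample page</title></head><body><p>Hello Tidy</body></html>';

// Parse the string into a tidy object, then let Tidy clean and repair it.
$tidy = tidy_parse_string($html, array('indent' => true), 'utf8');
tidy_clean_repair($tidy);

// Each helper returns a tidyNode rooted at the matching element.
$nodes = array(
    'html' => tidy_get_html($tidy),
    'head' => tidy_get_head($tidy),
    'body' => tidy_get_body($tidy),
);

foreach ($nodes as $label => $node) {
    // tidyNode::$name holds the element name; $value holds its markup.
    echo $label, ' node name: ', $node->name, PHP_EOL;
}
```

Each returned node also exposes a "value" property containing the markup of that section, which is handy when you only want to display or process one part of the document.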

At the end of this tutorial you'll be equipped with the required background to dissect the principal nodes of a concrete (X)HTML file with the help of some easy-to-follow Tidy functions.

So, are you ready to explore some more useful features integrated with the Tidy extension? Okay, let's begin this journey now!
