
Parsing Web Document Nodes with the Tidy Library in PHP 5

Writing well-formatted (X)HTML code to include in the presentation layers of certain PHP applications can be an annoying and time-consuming process for many web developers. However, the Tidy extension that comes integrated with PHP 5 can turn this ugly task into a pleasant experience. Keep reading to learn how.

TABLE OF CONTENTS:
  1. Parsing Web Document Nodes with the Tidy Library in PHP 5
  2. Parsing and formatting basic (X)HTML code with Tidy
  3. Using the tidy_get_html() and tidy_get_head() functions
  4. Using the tidy_get_body() and tidy_get_output() functions
By: Alejandro Gervasio
July 03, 2007


Introduction

Welcome to the second tutorial of the series that began with "Working with the Tidy Library in PHP 5." Made up of three instructive articles, this series steps you through using the most important functions bundled with this powerful library, and complements the corresponding theory with illustrative hands-on examples.

If you've already read the first installment of the series, then you probably find the Tidy extension quite familiar by now, since its remarkable capacity for parsing and formatting (X)HTML markup comes with a very gentle learning curve. True to form, Tidy comes equipped with a decent arsenal of functions (or methods and properties, if you're using an object-based syntax) that allow you to correct the format of any web document in a few simple steps.

And speaking of performing simple tasks, you'll surely recall that in the first article of the series I discussed how to parse and format several basic (X)HTML documents by using some straightforward functions bundled with this library, such as "tidy_parse_file()," "tidy_repair_file()" and "tidy_parse_string()."
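As a quick refresher, here is a minimal sketch of that parse-and-repair workflow. It assumes that the Tidy extension is enabled in your PHP installation; the sample markup and the configuration options are only illustrative choices of mine, not code taken from the first article.

```php
<?php
// Requires the tidy extension to be enabled.

// Hypothetical sample markup with unclosed <p> tags.
$markup = '<html><head><title>Sample</title></head>'
        . '<body><p>Hello<p>World</body></html>';

// Illustrative Tidy options: indent the output and emit XHTML.
$config = array(
    'indent'       => true,
    'output-xhtml' => true,
    'wrap'         => 200,
);

// Parse the string, repair it, then fetch the cleaned-up markup.
$tidy = tidy_parse_string($markup, $config, 'utf8');
tidy_clean_repair($tidy);
$repaired = tidy_get_output($tidy);
echo $repaired;

// The same result in a single step, without keeping the tidy object.
$repairedInOneStep = tidy_repair_string($markup, $config, 'utf8');
```

The two-step form keeps the parsed document around, which is handy when you want to inspect it further, while "tidy_repair_string()" is enough when all you need is the corrected markup.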

As you learned in that tutorial, repairing badly-formatted web documents is actually an effortless process with the assistance of the Tidy extension. Since Tidy has much more to offer when it comes to parsing and fixing (X)HTML code, in this second article of the series I'm going to discuss how to extract different sections of a specific (X)HTML document (called file nodes) by using some additional functions included with this library.
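To give you a rough idea of what extracting nodes looks like, the following sketch uses "tidy_get_html()," "tidy_get_head()" and "tidy_get_body()" on a small document. Again, the sample markup is a hypothetical one of my own, and the snippet assumes the Tidy extension is enabled.

```php
<?php
// Requires the tidy extension to be enabled.

// Hypothetical sample document.
$markup = '<html><head><title>Sample</title></head>'
        . '<body><h1>Title</h1><p>Some text</p></body></html>';

$tidy = tidy_parse_string($markup, array(), 'utf8');

// Each function returns a tidyNode object for that section of the document.
$html = tidy_get_html($tidy);
$head = tidy_get_head($tidy);
$body = tidy_get_body($tidy);

// The "value" property holds the markup of the node, including its children.
echo $head->value, "\n";
echo $body->value, "\n";

// Walk the direct children of the <body> node and print their tag names.
if ($body->hasChildren()) {
    foreach ($body->child as $child) {
        echo $child->name, "\n";
    }
}
```

Each returned node exposes properties such as "name," "value" and "child," so you can inspect one section of the document without handling the whole file.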

At the end of this tutorial you'll be equipped with the required background to dissect the principal nodes of a concrete (X)HTML file with the help of some easy-to-follow Tidy functions.

So, are you ready to explore some more useful features integrated with the Tidy extension? Okay, let's begin this journey now!



 
 