In this seventh part of an eight-part article series on using PHP commands with your file and operating systems, you'll learn how and why to sanitize user input. This article is excerpted from chapter 10 of the book Beginning PHP and PostgreSQL 8: From Novice to Professional, written by W. Jason Gilmore and Robert H. Treat (Apress; ISBN: 1590595475).
Truly lazy programmers know how to make the most of their entire server environment when developing applications, which includes exploiting the functionality of the operating system, file system, installed program base, and programming languages whenever necessary. In this section, you'll learn how PHP can interact with the operating system to call both OS-level programs and third-party installed applications. Done properly, it adds a whole new level of functionality to your PHP programming repertoire. Done poorly, it can be catastrophic not only to your application, but also to your server's data integrity. That said, before delving into this powerful feature, take a moment to consider the topic of sanitizing user input before passing it to the shell level.
Sanitizing the Input
Neglecting to sanitize user input that may subsequently be passed to system-level functions could allow attackers to do massive internal damage to your information store and operating system, deface or delete Web files, and otherwise gain unrestricted access to your server. And that's only the beginning.
Note: See Chapter 21 for a discussion of secure PHP programming.
As an example of why sanitizing the input is so important, consider a real-world scenario. Suppose that you offer an online service that generates PDFs from an input URL. A great tool for accomplishing just this is HTMLDOC, a program that converts HTML documents to indexed HTML, Adobe PostScript, and PDF files. HTMLDOC (http://www.htmldoc.org/) is released under the GNU General Public License. HTMLDOC can be invoked from the command line by passing it the --webpage option, the -f option followed by an output file name such as webpage.pdf, and the URL of the page to convert.
Such an invocation creates a PDF named webpage.pdf, which contains a snapshot of the Web site's index page. Of course, most users will not have command-line access to your server; therefore, you'll need to create a much more controlled interface to the service, the most obvious being a Web page. Using PHP's passthru() function (introduced later in this chapter), you can call HTMLDOC with the URL the user submits and return the desired PDF.
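A minimal sketch of such a call follows. The form field name userurl, the helper buildHtmldocCommand(), the example.com address, and the /path/to/webroot/ directory are all hypothetical and chosen only for illustration. The passthru() call is left commented out so the sketch can be run safely, and the second half shows what the shell would receive if the submitted value carried extra commands.

```php
<?php
// Hypothetical helper: builds the command line WITHOUT sanitizing the input.
function buildHtmldocCommand($url)
{
    return "htmldoc --webpage -f webpage.pdf $url";
}

// Expected use: the visitor submits a plain URL ('userurl' is an assumed field name).
$document = isset($_POST['userurl']) ? $_POST['userurl'] : 'http://www.example.com/';
$command  = buildHtmldocCommand($document);
// passthru($command);   // would hand the command line to the shell

// Hostile use: extra commands appended after a semicolon.
$hostile = 'http://www.example.com/ ; cd /path/to/webroot/ ; rm -rf *';

// Splitting on ';' approximates how most Unix shells separate the commands.
$parts = array_map('trim', explode(';', buildHtmldocCommand($hostile)));
print_r($parts);
```

The printed array holds three entries: the intended htmldoc call, followed by a cd command and an rm command that the application never meant to run.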
If an attacker appends a semicolon and further shell commands to the submitted URL, most Unix shells will run them as separate commands after HTMLDOC finishes. Such extra commands are certainly unexpected, and could result in the deletion of your entire Web document tree. One way to safeguard against such attempts is to sanitize user input before it is passed to any of PHP's program execution functions. Two standard functions are conveniently available for doing so: escapeshellarg() and escapeshellcmd(). Each is introduced in this section.
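As a preview, here is a minimal sketch of how escapeshellarg() could be applied to the HTMLDOC call. The helper name buildSafeHtmldocCommand() and the sample input are hypothetical, and the quoting shown in the comments is what PHP produces on Unix-like systems; quoting on Windows differs.

```php
<?php
// Hypothetical helper: quotes the user-supplied value as a single shell argument.
function buildSafeHtmldocCommand($url)
{
    return 'htmldoc --webpage -f webpage.pdf ' . escapeshellarg($url);
}

$hostile = 'http://www.example.com/ ; rm -rf *';
echo buildSafeHtmldocCommand($hostile), PHP_EOL;
// On Unix-like systems the input is wrapped in single quotes, so the shell
// passes it to HTMLDOC as one (invalid) URL instead of running rm:
// htmldoc --webpage -f webpage.pdf 'http://www.example.com/ ; rm -rf *'
```

Quoting only stops the shell from interpreting the value. Checking that the value is a well-formed URL before using it, for example with filter_var() and FILTER_VALIDATE_URL, remains a sensible additional step.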