Sanitizing Input with PHP

In this seventh part of an eight-part article series on using PHP commands with your file and operating systems, you’ll learn how and why to sanitize user input. This article is excerpted from chapter 10 of the book Beginning PHP and PostgreSQL 8: From Novice to Professional, written by W. Jason Gilmore and Robert H. Treat (Apress; ISBN: 1590595475).

System-Level Program Execution

Truly lazy programmers know how to make the most of their entire server environment when developing applications, which includes exploiting the functionality of the operating system, file system, installed program base, and programming languages whenever necessary. In this section, you’ll learn how PHP can interact with the operating system to call both OS-level programs and third-party installed applications. Done properly, it adds a whole new level of functionality to your PHP programming repertoire. Done poorly, it can be catastrophic not only to your application, but also to your server’s data integrity. That said, before delving into this powerful feature, take a moment to consider the topic of sanitizing user input before passing it to the shell level.

Sanitizing the Input

Neglecting to sanitize user input that may subsequently be passed to system-level functions could allow attackers to do massive internal damage to your information store and operating system, deface or delete Web files, and otherwise gain unrestricted access to your server. And that’s only the beginning.


Note: See Chapter 21 for a discussion of secure PHP programming.


As an example of why sanitizing the input is so important, consider a real-world scenario. Suppose that you offer an online service that generates PDFs from an input URL. A great tool for accomplishing just this is HTMLDOC, a program that converts HTML documents to indexed HTML, Adobe PostScript, and PDF files. HTMLDOC (http://www.htmldoc.org/) is released under the GNU General Public License. HTMLDOC can be invoked from the command line, like so:

%>htmldoc --webpage -f webpage.pdf http://www.wjgilmore.com/

This would result in the creation of a PDF named webpage.pdf, which would contain a snapshot of the Web site’s index page. Of course, most users will not have command-line access to your server; therefore, you’ll need to create a much more controlled interface to the service, the most obvious being a Web page. Using PHP’s passthru() function (introduced later in this chapter), you can call HTMLDOC and return the desired PDF, like so:

// Unsafe: the user-supplied value reaches the shell unmodified
$document = $_POST['userurl'];
passthru("htmldoc --webpage -f webpage.pdf $document");

What if an enterprising attacker took the liberty of passing through additional input, unrelated to the desired HTML page, entering something like this:

http://www.wjgilmore.com/ ; cd /usr/local/apache/htdocs/; rm -rf *

Most Unix shells would interpret the passthru() request as three separate commands. The first is:

htmldoc --webpage -f webpage.pdf http://www.wjgilmore.com/

The second command is:

cd /usr/local/apache/htdocs/

And the final command is:

rm -rf *

Those last two commands were certainly unexpected, and could result in the deletion of your entire Web document tree. One way to safeguard against such attempts is to sanitize user input before it is passed to any of PHP’s program execution functions. Two standard functions are conveniently available for doing so: escapeshellarg() and escapeshellcmd(). Each is introduced in this section.


escapeshellarg()

string escapeshellarg (string arguments)

The escapeshellarg() function wraps arguments in single quotes and escapes any single quotes found within it. The effect is that when the result is passed to a shell command, it will be treated as a single argument. This is significant because it lessens the possibility that an attacker could disguise additional commands as shell command arguments. Therefore, in the aforementioned nightmarish scenario, the entire user input would be enclosed in single quotes, like so:

'http://www.wjgilmore.com/ ; cd /usr/local/apache/htdocs/; rm -rf *'

The result would be that HTMLDOC would simply return an error, because it could not resolve a URL possessing this syntax, rather than delete an entire directory tree.
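As a minimal sketch of this behavior, the following script applies escapeshellarg() to the malicious input from the earlier scenario and prints the resulting command rather than executing it. The hard-coded $document value stands in for $_POST['userurl'], and the $safeDocument and $command variable names are illustrative assumptions. The quoting described in the comments applies to Unix-like systems; on Windows, escapeshellarg() quotes differently.

```php
<?php
// Hypothetical attacker-supplied value standing in for $_POST['userurl'].
$document = 'http://www.wjgilmore.com/ ; cd /usr/local/apache/htdocs/; rm -rf *';

// Quote the value so that the shell receives it as exactly one argument.
// On Unix-like systems the result is wrapped in single quotes.
$safeDocument = escapeshellarg($document);

// Build the command string. It is only printed here, never executed.
$command = "htmldoc --webpage -f webpage.pdf $safeDocument";

echo $command, PHP_EOL;
```

In a real script, $command would be handed to passthru() in place of the unsanitized string, so the semicolons and asterisk arrive at HTMLDOC as part of one odd-looking URL instead of being interpreted by the shell.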

escapeshellcmd()

string escapeshellcmd (string command)

The escapeshellcmd() function operates under the same premise as escapeshellarg(), but rather than quoting a single argument, it sanitizes an entire command string by prefixing shell metacharacters with a backslash. These characters include the following: # & ; ` | * ? ~ < > ^ ( ) [ ] { } $ and \.
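To illustrate, the sketch below runs the complete command string from the earlier scenario through escapeshellcmd() and prints the result without executing anything. The $command and $escaped variable names are illustrative assumptions, and the behavior described in the comments is that of Unix-like systems.

```php
<?php
// Hypothetical attacker-supplied value standing in for $_POST['userurl'].
$document = 'http://www.wjgilmore.com/ ; cd /usr/local/apache/htdocs/; rm -rf *';

// The full command, including the untrusted input.
$command = "htmldoc --webpage -f webpage.pdf $document";

// Prefix each shell metacharacter (here the semicolons and the asterisk)
// with a backslash so the shell treats it as a literal character.
$escaped = escapeshellcmd($command);

// Print the escaped command instead of executing it.
echo $escaped, PHP_EOL;
```

Because the semicolons are escaped, the shell would no longer split the string into three commands. Note, however, that spaces are left alone, so the extra words would still reach HTMLDOC as separate arguments; when a single user-supplied value is involved, escapeshellarg() is usually the tighter choice.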
