HomePython Page 5 - Writing CGI Programs in Python
Defining a useful Display function - Python
And now for something completely different... Python has a very extensive, well documented and portable module library that provides a large variety of useful functions. The Internet-related collection is particularly impressive, with modules to deal with everything from parsing and retrieving URL's to retrieving mail from POP servers and everything in between.
Since we've decided to build our Python application incrementally ("Oh have we?" you ask) rather than from some kind of master scheme, let's go ahead and define a function that might actually be useful to us. By the time this article series is complete, you will see how and why it is best to design an overall object framework for all but the simplest applications, rather than design from the ground-up like we're doing here for the purposes of instruction.
We'll need a routine to actually display the content that we're going to assemble. This routine should take care of administrivia like the Content-Type line and basic HTML syntax like the <BODY></BODY> tags. Also, ideally, we want as little actual HTML in our Python program as possible. In fact, we want all this formatting stuff in an external file that can be easily modified and updated without delving into actual program code.
Your site probably has a standard HTML "style sheet" (even if you're not using actual CSS) that you're expected to follow. It would be a real pain to embed this HTML directly into your code, only to have to change on a daily basis. There is also the danger of accidentally modifying important program code.
So, we'll go ahead and define a simple template file that your Python scripts' Display() function will use. Your actual template will probably be much more complicated but this will get you started.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<META NAME="keywords" CONTENT="blah blah -- your ad here">
<title>Python is Fun!</title>
<!-- *** INSERT CONTENT HERE *** -->
The thing that should (hopefully) strike you about this template is the HTML comment <!-- *** INSERT CONTENT HERE *** -->. Your Python program will search for this comment and replace it with its goodies.
To perform this feat of minor magic, we will introduce a couple of new methods including file operations and regular expressions. Old Perl Hackers will rejoice to learn that Python includes complete and powerful regular expression (RE) support. The actual RE syntax is almost identical to Perl's, but it is not built directly into the language (it is implemented as a module.)
Those of you who are not Old Perl Hackers are probably wondering at this point what exactly are regular expressions. A regular expression is like a little mini-language that specifies a precise set of text strings that match, or agree, with it. That set can be unlimited or finite in size depending on the regexp. When Python scans a string for a given regular expression, it searches the entire string for characters that exactly match the RE. Then, depending on the operation, it will either simply note the location or perform a substitution. RE's are extraordinarily powerful in a wide range of applications, but we will only be using a few basic RE features here.
The basic tactic we're going to use to implement template support in our CGI script is to read the entire template file in as one long string, then do a regular expression substitution, swapping our content for that "INSERT CONTENT HERE" marker.
Here is the code for our Display(Content) function:
import re # make the regular expression module available
# specify the filename of the template file
TemplateFile = "template.html"
# define a new function called Display
# it takes one parameter - a string to Display
TemplateHandle = open(TemplateFile, "r") # open in read only mode
# read the entire file as a string
TemplateInput = TemplateHandle.read()
TemplateHandle.close() # close the file
# this defines an exception string in case our
# template file is messed up
BadTemplateException = "There was a problem with the HTML template."
SubResult = re.subn("<!-- *** INSERT CONTENT HERE *** -->",
if SubResult == 0:
print "Content-Type: text/html\n\n"
Let's review some of the important features of this code snippet. First, reading the contents of a file into a string is fairly unremarkable.
Next we define a string called BadTemplateException that will be used in case the template file is corrupt. If the marker HTML comment cannot be found, this string will be 'raised' as a fatal exception and stop the execution of the script. This exception is fatal because it's not caught by an 'except' block. Also, if the template file simply can't be found by the open() method, Python itself will raise an (unhandled, fatal) exception. All the user will likely see is an "Internal Server Error" message. The traceback (what caused the exception) will be recorded in the web server's logs. We'll go into more detail later about exceptions and how to handle them intelligently.
Next, we have the actual meat of this function. This line uses the subn() method of the re (regular expression) module to substitute our Content string for the marker defined in the first parameter to the method. Notice that we had to backslash the * characters, because these have special meaning in RE's. (See the documentation for the re module for a complete list of special characters.) Our RE is relatively simple; it is simply an exact list of characters with no special attributes.
The second parameter is our Content, the thing we want to replace the marker with. This could be another function as long as that function returns a string (otherwise, you'll get an exception.) The final parameter is the string to operate on, in this case, the entire template file contained in TemplateInput.
re.subn() returns a tuple with the resulting string and a number indicating how many substitutions were made. A tuple is another fundamental Python data type. It is almost exactly like a list except that it can't be changed; it is what Python calls "immutable." (From this we can deduce that lists can be modified "on the fly.") It is an array that is accessed by number subscripts. SubResult contains the modified result string (remember, programmers start counting from 0!) and SubResult contains the number of substitutions actually made.
This line also demonstrates the method of breaking up long lines. If you put a single character at the end of a line, the next line is considered a continuation of the same line. It's useful for making code readable, or fitting Python code into narrow HTML browser windows.
The next line of code says that if there weren't any substitutions made, this is a problem, so complain loudly by raising an (unhandled, fatal) exception. In other words, if our special "INSERT CONTENT HERE" marker was not found in the template file, this is a big problem!
Finally, since everything is okay at this point, go ahead and print out the special Content-Type line and then our complete string, with our Content inside our Template.
You can use this function in your own CGI Python programs. Call it when you've assembled a complete string containing all the special output of your program and you're ready to display it.
Later, we will modify this function to handle special types of Content data including objects that contain sub-objects that contain sub-objects and so on.