Looping, Security, and Templating Tools

In this second part of a five-part series on templating tools, you’ll learn about loops, arrays, hashes, and more. It is excerpted from chapter three of the book Advanced Perl Programming, Second Edition, written by Simon Cozens (O’Reilly; ISBN: 0596004567). Copyright © 2007 O’Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O’Reilly Media.

Loops, Arrays, and Hashes

So much for simple templates. Because Text::Template evaluates the code in braces as honest-to-goodness Perl code, we can do a whole lot more with templates. Let’s suppose we’re invoicing for some design work:

  $client = "Acme Motorhomes and Eugenics Ltd.";
  %jobs =
     ("Designing the new logo" => 450.00,
      "Letterheads"            => 300.00,
      "Web site redesign"      => 900.00,
      "Miscellaneous Expenses" => 33.75
     );

We can create a template to do the work for us–the invoicing work, that is, not the design work:

  {my $total=0; ''}
  To {$client}:

  Thank you for consulting the services of Fungly Foobar Design
  Associates. Here is our invoice in accordance with the work we have
  carried out for you:

  {
    while (my ($work, $price) = each %jobs) {
       $OUT .= $work . (" " x (50 - length $work)) . sprintf("%6.2f", $price) . "\n";
       $total += $price;
    }
  }

  Total             {sprintf "%6.2f",$total}

  Payment terms 30 days.

  Many thanks,
  Fungly Foobar

What’s going on here? First, we set up a private variable, $total, in the template and set it to zero. However, since we don’t want a 0 appearing at the top of our template, we make sure our code snippet returns '' (the empty string) so it adds nothing to the output. This is a handy trick.

Next we want to loop over the jobs hash. Adding each price to the total is simple enough, but we also want to add a line to the template for each job. What we’d like to say is something like this:

  {
    while (my ($work, $price) = each %jobs) {
  }

  {$work}                           {$price}

  {
      $total += $price;
    }
  }

However, Text::Template doesn’t work like that: each snippet of code must be an independent, syntactically correct piece of Perl. So how do we write multiple lines to the template? This is where the magical $OUT variable comes in. If you use $OUT in your template, that’s taken as the output from the code snippet. We can append to this variable each time we go through the loop, and it’ll all be filled into the template at the end.
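To see $OUT at work end to end, here is a minimal sketch of the invoice as a complete program. It makes a few assumptions that are not in the example above: the template lives in a string rather than a file, the data is passed in through the HASH option, the column widths are arbitrary, and the jobs are sorted by name so that the output order is predictable (each returns them in no particular order). The $total here is a plain package variable, so that the later snippets can see it.

```perl
use strict;
use warnings;
use Text::Template;

# Template text held in a string for this sketch.
my $source = <<'END';
{ $total = 0; '' }To {$client}:
{
  for my $work (sort keys %jobs) {
      # Everything appended to $OUT becomes this snippet's output.
      $OUT .= sprintf("%-30s%8.2f\n", $work, $jobs{$work});
      $total += $jobs{$work};
  }
}Total{ sprintf "%33.2f", $total }
END

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Couldn't construct template: $Text::Template::ERROR";

# A hash reference in HASH is installed as %jobs inside the template.
my $invoice = $template->fill_in(
    HASH => {
        client => "Acme Motorhomes and Eugenics Ltd.",
        jobs   => { "Letterheads" => 300.00, "Web site redesign" => 900.00 },
    },
);
die "Couldn't fill in template: $Text::Template::ERROR"
    unless defined $invoice;

print $invoice;
```

The loop never prints anything itself; it only appends to $OUT, and Text::Template drops the accumulated string into the template where the snippet stood.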

Security and Error Checking

One of the advantages of templating is that you can delegate the non-programming bits of your application–design of HTML pages, wording of form letters, and so on–to people who aren’t necessarily programmers. One of the disadvantages with powerful templating systems like Text::Template is that it only takes one joker to discover { system("rm -rf /") } and one or both of you is out of a job. Clearly there needs to be a way to secure your templates against this sort of abuse.

Text::Template offers two ways to protect yourself from this kind of coworker, um, I mean abuse. The first is through Perl’s ordinary tainting mechanism. In taint mode, Perl will refuse to run templates from external files. This protects you from people meddling with the template files, but only because you can’t use template files at all any more; you must specify templates as strings instead.

If you can actually trust the files in the filesystem, then you’ll need to tell Text::Template to untaint the file data; this is done with the UNTAINT option:

  my $template = new Text::Template (TYPE => "FILE",
                                     UNTAINT => 1,
                                     SOURCE => $filename);

Now you will be able to use the template in $filename, if $filename itself has passed taint checks.

The second mechanism is much more fine-grained; the SAFE option allows you to specify a Safe compartment in which to run the code snippets:

  my $compartment = new Safe; # Default set of operations is pretty safe
  $text = $template->fill_in(SAFE => $compartment);

If you’re really concerned about security, you’ll want to do more tweaking than just using the default set of restricted operations.
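As a rough illustration of what the default compartment buys you, the sketch below fills in a template containing one harmless snippet and one that tries to run a shell command. The template text is made up for the example, and it relies on Safe's default operator mask rejecting system; the exact wording of the error message may vary between versions.

```perl
use strict;
use warnings;
use Safe;
use Text::Template;

# Hypothetical template: the first snippet is harmless, the second is not.
my $source = <<'END';
Sum: { 6 * 7 }
Shell: { system("echo hi") }
END

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Couldn't construct template: $Text::Template::ERROR";

# Every snippet is compiled and run inside this compartment.
my $compartment = Safe->new;
my $text = $template->fill_in(SAFE => $compartment);

print $text;
```

The arithmetic snippet is filled in as usual, while the system call is refused when the snippet is compiled, so the command never runs and an error message appears in its place.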

What if things go wrong in other ways? You don’t want your application to die if the code snippets contain invalid Perl, or throw a divide-by-zero error. While Text::Template traps eval errors by default, you may find yourself wanting more control of error handling. This is where the BROKEN option comes in.

The BROKEN option allows you to supply a subroutine reference to execute when a code snippet produces a syntax error or fails in any other way. Without BROKEN, you get a default error message inserted into your output:

  Dear Program fragment delivered error “syntax error at template line 1”,

By specifying a BROKEN subroutine, you get more control over what is inserted into the output. In many cases, the only sensible thing to do if your template is broken would be to abort processing of the template altogether. You can do this by returning undef from your BROKEN routine, and Text::Template will return as much output as it was able to build up.

Of course, you now need to be able to tell whether the template completed successfully or whether it was aborted by a BROKEN routine. The way to do this is to use the callback argument BROKEN_ARG. If you pass a BROKEN_ARG to fill_in, it will be passed into your BROKEN callback. This allows us to do something like this:

  my $succeeded = 1;

  $template->fill_in(BROKEN => \&broken_sub, BROKEN_ARG => \$succeeded);

  if (!$succeeded) {
     die "Template failed to fill in...";
  }

  sub broken_sub {
     my %params = @_;
     ${$params{arg}} = 0;
     undef;
  }

As you can see, the callback is called with a hash; the argument specified by BROKEN_ARG is the arg element of the hash. In this case, that’s a reference to the $succeeded flag; we dereference the reference and set the flag to zero, indicating an error, before returning undef to abort processing.

In case you feel you can make use of the broken template, Text::Template supplies the code snippet as the text element of the hash; I haven’t been able to think of anything sensible to do with this yet. To assist with error reporting, the other entries in the hash are lineno, the line number in the template where the error occurred, and error, the value of $@ indicating the error.
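Putting the pieces together, here is a small self-contained sketch of a BROKEN callback that both clears the success flag and records where things went wrong. The template text, the @errors array, and the data passed in through HASH are all invented for the example; the division by zero is only there to force a failure.

```perl
use strict;
use warnings;
use Text::Template;

# Hypothetical template whose second line fails when $months is zero.
my $source = <<'END';
Dear {$name},
You owe us {$amount / $months} per month.
Many thanks
END

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Couldn't construct template: $Text::Template::ERROR";

my $succeeded = 1;
my @errors;

sub broken_sub {
    my %params = @_;
    ${ $params{arg} } = 0;    # clear the caller's success flag
    push @errors, "line $params{lineno}: $params{error}";
    return undef;             # abort processing of the template
}

my $text = $template->fill_in(
    HASH       => { name => 'Acme', amount => 100, months => 0 },
    BROKEN     => \&broken_sub,
    BROKEN_ARG => \$succeeded,
);

warn "Template failed to fill in: @errors" unless $succeeded;
```

Because the flag is passed by reference, the callback can change it without relying on a global variable, and the same broken_sub can serve several templates.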

Text::Template Tricks

Using { and } to delimit code is fine for most uses of Text::Template–when you’re generating form letters or emails, for instance. But what if you’re generating text that makes heavy use of { and }–HTML pages including JavaScript, for example, or TeX code for typesetting?

One solution is to escape the braces that you don’t want to be processed as Perl snippets with backslashes:

  if (browser == "Opera") \{
    …
  \}

However, as one user pointed out, if you’re generating TeX, which attaches meaning to backslashes and braces, you’re entering a world of pain:

  \\textit\{ {$title} \} \\dotfill \\textbf\{ \\${$cost} \}

A much nicer solution would be to specify alternate delimiters, and get rid of the backslash escaping:

  \textit{ [[[ $title ]]] } \dotfill \textbf{ [[[ $cost ]]] }

Much clearer!

To do this with Text::Template, use the DELIMITERS option on either the constructor or the fill_in method:

  print $template->fill_in(DELIMITERS => [ '[[[', ']]]' ]);

This actually runs faster than the default because it doesn’t do any special backslash processing, but needless to say, you have to ensure that your delimiters do not appear in the literal text of your template.
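Here is a minimal sketch of the alternate delimiters in action, with the TeX line held in a string and the delimiters given to the constructor. The title and cost values are made up for the example.

```perl
use strict;
use warnings;
use Text::Template;

# Single quotes keep the TeX backslashes and the Perl variables literal.
my $source = '\textit{ [[[ $title ]]] } \dotfill \textbf{ [[[ $cost ]]] }';

my $template = Text::Template->new(
    TYPE       => 'STRING',
    SOURCE     => $source,
    DELIMITERS => [ '[[[', ']]]' ],
) or die "Couldn't construct template: $Text::Template::ERROR";

my $tex = $template->fill_in(
    HASH => { title => 'Logo design', cost => '450.00' },
);

print "$tex\n";
```

The ordinary braces and backslashes pass through untouched; only the text between [[[ and ]]] is treated as Perl.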

Mark suggests a different trick if this isn’t appropriate: use Perl’s built-in quoting operators to escape the braces. If we have a program fragment { q{ Hello } }, this returns the string "Hello" and inserts it into the template output. So another way to get literal text without escaping the braces is simply to add more braces!

  { q{

   if (browser == "Opera") { … }

  } }

Another problem is that your fingers fall off from typing:

  my $template = new Text::Template(…);
  $template->fill_in();

all the time. The object-oriented style is perfect when you have a template that you need to fill in hundreds of times–a form letter, for instance–but not so great if you’re just filling it in once. For these cases, Text::Template can export a subroutine, fill_in_file. This does the preparation and filling in all in one go:

  use Text::Template qw(fill_in_file);

  print fill_in_file("email.tmpl", PACKAGE => "Q", …);

Note that you do have to import this function specifically.
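A quick sketch of the one-shot style follows. Since fill_in_file needs a real file, the example writes a throwaway template with File::Temp first; the file contents and the name passed in through HASH are invented for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::Template qw(fill_in_file);

# Create a temporary template file to stand in for something like email.tmpl.
my ($fh, $filename) = tempfile(SUFFIX => '.tmpl', UNLINK => 1);
print {$fh} 'Hello, {$name}!', "\n";
close $fh or die "Couldn't close $filename: $!";

# Construct, fill in, and discard the template object in a single call.
my $text = fill_in_file($filename, HASH => { name => 'World' });
die "Couldn't fill in $filename: $Text::Template::ERROR"
    unless defined $text;

print $text;
```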

HTML::Template

HTML formatting is slightly different from plaintext formatting–there are essentially two main schools of thought. The first, used by HTML::Template, is similar to the method we saw in Text::Template; the template is stored somewhere, and a Perl program grabs it and fills it in. The other school of thought is represented by HTML::Mason, which we’ll look at next; this is inside-out–instead of running a Perl program that prints out a load of HTML, you create an HTML file that contains embedded snippets of Perl and run that.

To compare these approaches, we’re going to build the same application in HTML::Template, HTML::Mason, and Template Toolkit: an aggregator of RSS feeds that grabs headlines from various web sites and pushes them onto a single page. (It is similar to Amphetadesk, http://www.disobey.com/amphetadesk/, and O’Reilly’s Meerkat, http://www.oreillynet.com/meerkat/.) RSS (variously expanded as RDF Site Summary, Rich Site Summary, or Really Simple Syndication) is an XML-based format for providing details of individual items on a site; it’s generally used for providing a feed of stories from news sites.

Variables and Conditions

First, though, we’ll take a brief look at how HTML::Template does its stuff, how to get values into it, and how to get HTML out.

As with Text::Template, templates are specified in separate files. HTML::Template’s templates are ordinary HTML files, but with a few special tags. The most important of these is <TMPL_VAR>, which is replaced by the contents of a Perl variable. For instance, here’s a very simple page:

  <html>
     <head><title>Product details for <TMPL_VAR NAME=PRODUCT></title></head>
     <body>
        <h1> <TMPL_VAR NAME=PRODUCT> </h1>
        <div class="desc">
            <TMPL_VAR NAME=DESCRIPTION>
        </div>
        <p class="price">Price: $<TMPL_VAR NAME=PRICE></p>
        <hr />
        <p>Price correct as at <TMPL_VAR NAME=DATE></p>
     </body>
  </html>

When filled in with the appropriate details, this should output something like:

  <html>
     <head><title>Product details for World’s Biggest Enchilada</title></head>
     <body>
        <h1> World’s Biggest Enchilada </h1>
        <div class="desc">
            Recently discovered in the Mexican rain forests….
        </div>
        <p class="price">Price: $1504.39</p>
        <hr />
        <p>Price correct as at 15:18 PST, 7 Mar 2005</p>
     </body>
  </html>

In order to fill in those values, we write a little CGI program similar to the following one:

  use strict;
  use HTML::Template;

  my $template = HTML::Template->new(filename => "catalogue.tmpl");

  $template->param( PRODUCT     => "World’s Biggest Enchilada" );
  $template->param( DESCRIPTION => $description );
  $template->param( PRICE       => 1504.39 );
  $template->param( DATE        => format_date(localtime) );

  print "Content-Type: text/html\n\n", $template->output;

Again, as with Text::Template, our driver program is very simple–load up the template, fill in the values, produce it. However, there are a few other things we can do with our templating language, and hence there are a few other tags that allow us a little more flexibility.
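If you want to try this without setting up a template file or a web server, the sketch below does the same job on a cut-down template held in a string, using the scalarref option to the constructor. The shortened template is an assumption made for the example; it is not the catalogue.tmpl file from above.

```perl
use strict;
use warnings;
use HTML::Template;

# A cut-down version of the product page, kept in a string.
my $source = <<'END';
<h1><TMPL_VAR NAME=PRODUCT></h1>
<p class="price">Price: $<TMPL_VAR NAME=PRICE></p>
END

my $template = HTML::Template->new(scalarref => \$source);

# param accepts several name/value pairs in one call.
$template->param(
    PRODUCT => "World's Biggest Enchilada",
    PRICE   => 1504.39,
);

my $html = $template->output;
print $html;
```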

For instance, suppose we happen to have a picture of the world’s biggest enchilada–that would be something worth putting on our web page. However, we don’t have pictures for everything in the database; we want to output a pictures section only if we actually do have an image file kicking about. So, we could add something like this to our template:

  <TMPL_IF NAME=PICTURE_URL>
  <div class="photo">
     <img src="<TMPL_VAR NAME=PICTURE_URL>" />
  </div>
  </TMPL_IF>

This means that if PICTURE_URL happens to have a true value–that is, if we’ve given it something like a real URL–then we include the photo <DIV>. As these <TMPL_…> tags are not real HTML tags, only things processed by HTML::Template, it’s not a problem to stick one in the middle of another HTML tag, as we have here with <IMG SRC="…">.

Of course, if we don’t have a picture, we might want to stick another one in its place, which we can do with the <TMPL_ELSE> pseudotag:

  <div class="photo">
  <TMPL_IF NAME=PICTURE_URL>
     <img src="<TMPL_VAR NAME=PICTURE_URL>" />
  <TMPL_ELSE>
     <img src="http://www.mysite.com/images/noimage.gif" />
  </TMPL_IF>
  </div>

Notice that although our <TMPL_IF> must be matched by a </TMPL_IF>, <TMPL_ELSE> is not matched.
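To watch both branches fire, here is a small sketch that renders the photo fragment twice, once with a picture URL and once without. The render_photo helper and the enchilada.jpg URL are hypothetical, introduced only for this example.

```perl
use strict;
use warnings;
use HTML::Template;

my $source = <<'END';
<div class="photo">
<TMPL_IF NAME=PICTURE_URL>
   <img src="<TMPL_VAR NAME=PICTURE_URL>" />
<TMPL_ELSE>
   <img src="http://www.mysite.com/images/noimage.gif" />
</TMPL_IF>
</div>
END

# Hypothetical helper: fill in the fragment with or without a picture URL.
sub render_photo {
    my ($url) = @_;
    my $template = HTML::Template->new(scalarref => \$source);
    $template->param(PICTURE_URL => $url) if defined $url;
    return $template->output;
}

my $with    = render_photo("http://www.mysite.com/images/enchilada.jpg");
my $without = render_photo();

print $with, $without;
```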

But perhaps we’re being unduly complex; all we need in this example is a default value for our PICTURE_URL, and we can do this directly with a DEFAULT attribute to <TMPL_VAR>:

  <div class="photo">
     <img src="<TMPL_VAR NAME=PICTURE_URL DEFAULT="http://www.mysite.com/images/noimage.gif">" />
  </div>
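The same check can be sketched for the DEFAULT attribute: output is called once before the parameter is set and once after. As before, the enchilada.jpg URL is made up for the example.

```perl
use strict;
use warnings;
use HTML::Template;

my $source = <<'END';
<img src="<TMPL_VAR NAME=PICTURE_URL DEFAULT="http://www.mysite.com/images/noimage.gif">" />
END

my $template = HTML::Template->new(scalarref => \$source);

# No PICTURE_URL has been set yet, so the default is used.
my $fallback = $template->output;

# Once the parameter is set, it takes the place of the default.
$template->param(PICTURE_URL => "http://www.mysite.com/images/enchilada.jpg");
my $custom = $template->output;

print $fallback, $custom;
```

This keeps the template to a single tag, at the cost of only being able to swap the value; if the surrounding markup needs to change as well, TMPL_IF is still the tool for the job.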


Validation

Some people worry, quite rightly, about the effect that this sort of indiscriminate SGML abuse has on checking templates for validity. (Although, sadly, many more people don’t worry about HTML validity.) Further, those who use DTD-aware validating editors might wonder how to get these pseudotags into their documents in a nice way.

HTML::Template has a way around this; instead of writing the tags as though they were ordinary HTML tags, you can also write them as though they were comments, like so:

  <!-- TMPL_IF NAME=PICTURE_URL -->
  <div class="photo">
     <img src="<!-- TMPL_VAR NAME=PICTURE_URL -->" />
  </div>
  <!-- /TMPL_IF -->


Please check back next week for the continuation of this article.
