Formats and Text::Autoformat - Perl

You may have created your own templating system in Perl to meet certain project requirements, but did you know there is a better way? This article, the first in a five-part series, explores your options. It is excerpted from chapter three of Advanced Perl Programming, Second Edition, written by Simon Cozens (O'Reilly; ISBN: 0596004567). Copyright 2007 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

By: O'Reilly Media
August 07, 2008

Formats have been in Perl since version 1.0. They're not used very much these days, but for a lot of what people want from text formatting, they're precisely the right thing.

Perl formats allow you to draw up a picture of the data you want to output, and then paint the data into the format. For instance, in a recent application, I needed to display a set of IDs, dates, email addresses, and email subjects with one line per mail. If we assume that the line is fixed at 80 columns, we may need to truncate some of those fields and pad others to wider than their natural width. In pure Perl, there are basically three ways to get this sort of formatted output. There's sprintf (or printf) and substr:

  for (@mails) {
      printf "%5i %10s %40s %21s\n",
          $_->id,
          substr($_->received,0,10),
          substr($_->from_address,-40,40),
          substr($_->subject,0,21);
  }
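As a variation on the loop above, the truncation can also be pushed into the conversion itself: a precision such as `.10` on `%s` caps the number of characters printed, so `%-10.10s` both pads and truncates. The following is only a sketch. The helper name `mail_line` and the sample values are made up for illustration, and it left-justifies every text field and keeps the start of the address rather than its last 40 characters, unlike the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: build one fixed-width record with sprintf alone.
# "%-10.10s" = left-justify, pad to 10 columns, print at most 10 characters.
sub mail_line {
    my ($id, $received, $from, $subject) = @_;
    return sprintf "%5d %-10.10s %-40.40s %-21.21s",
        $id, $received, $from, $subject;
}

# Sample values are invented for the demonstration.
print mail_line(1, "2002-12-10 09:30:00", 'fred@funglyfoobar.com', "Widgets"), "\n";
```

For IDs of up to five digits, each record is 79 characters wide before the newline.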

Then there's pack, which everyone forgets about (and which doesn't give as much control over truncation):

  for (@mails) {
      # Whitespace in a pack template is ignored, so the newline
      # has to be printed separately.
      print pack("A5 A10 A40 A21",
        $_->id, $_->received, $_->from_address, $_->subject), "\n";
  }
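To see what the `A` template does in isolation, here is a small sketch. The helper name `pack_line` and the sample values are assumptions for illustration only. Each `A` field is padded with spaces or cut off on the right to exactly its width, and the spaces in the template merely separate the codes: they produce no output, so the fields sit directly next to each other.

```perl
use strict;
use warnings;

# Hypothetical helper: pack four values into fixed-width, space-padded fields.
sub pack_line {
    my @fields = @_;
    return pack("A5 A10 A40 A21", @fields);
}

my $record = pack_line(1, "2002-12-10 09:30", 'fred@funglyfoobar.com', "Widgets");
print "$record\n";
print length($record), "\n";   # 5 + 10 + 40 + 21 = 76, with no separators
```

If a gap between columns is wanted, one option is to widen each field by one so that the padding supplies it.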

And then there's the format:

  format STDOUT =
  @<<<< @<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<
  $_->id, $_->received, $_->from_address, $_->subject
  .

  for (@mails) {
      write;
  }

Personally, I think this is much neater and more intuitive than the other two solutions, and it has the bonus of taking the formatting out of the main loop, making the code less cluttered.

Formats are associated with a particular filehandle; as you can see from the example, we've determined that this format should apply to anything we write on standard output. The picture language of formats is pretty simple: fields begin with @ or ^ and are followed by <, |, or > characters specifying left, center, and right justification respectively. After each line of fields comes a line of expressions that fill those fields, one expression for each field, separated by commas. If we like, we could change the format to multiple lines of fields and expressions:

  format STDOUT =
  Id      : @<<<<
  $_->id
  Date    : @<<<<<<<
  $_->received
  From    : @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  $_->from_address
  Subject : @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< <<<<<<<<<<<<<<<<<<<<<<
  $_->subject 
  .
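The justification characters described above can be tried out without defining a format at all, because the built-in `formline` function takes a picture line and a list of values and appends the result to the accumulator variable `$^A`. This is a minimal sketch; the helper name `render` and the bracketed picture are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: fill one picture line and return the result.
sub render {
    my ($picture, @values) = @_;
    $^A = '';                       # formline appends to the accumulator $^A
    formline($picture, @values);
    my $output = $^A;
    $^A = '';                       # leave the accumulator clean
    return $output;
}

# Left, centered, and right justified five-column fields.
print render('[@<<<<] [@||||] [@>>>>]', 'abc', 'abc', 'abc'), "\n";
# prints: [abc  ] [ abc ] [  abc]

# A value that is too long for an @ field is truncated.
print render('@<<', 'abcdef'), "\n";   # prints: abc
```

The pictures are single-quoted so that Perl does not try to interpolate the @ characters.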

We've seen examples of the @-type field. If you're dealing with multi-line formats, you might find that you want to break up a value and show it across several lines of the format. For instance, to display the start of an email alongside metadata about it:

  Id      : 1                         Hi Simon, Thank you for the
  Date    : 10/12/02                  supply of widgets that you sent
  From    : fred@funglyfoobar.com     me last week. I can assure you
  Subject : Widgets                   that they have all been put ...

This is where the other type of field, the ^ field, comes in: you can achieve the preceding output by using a format like this:

  format STDOUT =
  Id      : @<<<<                     ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
            $_->id,                   $message
  Date    : @<<<<<<<                  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
            $_->received,             $message
  From    : @<<<<<<<<<<<<<<<<<<<<     ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
            $_->from_address,         $message
  Subject : @<<<<<<<<<<<<<<<<<<<<...  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<...
            $_->subject,              $message
  .

Unlike the values supplied to an @ field, which can be any Perl expression, the value supplied to a ^ field must be an ordinary scalar variable, because the format modifies it. Each time the format processor sees a ^ field, it outputs as much of the supplied value as fits and then chops that much off the beginning of the value, ready for the next field. The ... at the end of a field indicates that if the supplied value is too long, the format should truncate it and show three dots instead. If you use ^ fields with values held in lexical variables, such as $message in the previous example, you need to declare the lexical variable before the format, or else the format won't be able to see it.
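The way a ^ field eats its variable can be watched directly with `formline`, which applies a picture line to a list of values and leaves the result in `$^A`. The sketch below is not how you would normally use formats. It assumes plain, single-spaced text, and the helper name `wrap_lines` is invented. It applies a one-field picture over and over until the variable is empty, collecting each chunk as a line.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into lines of at most $width columns
# by repeatedly feeding it to a single ^ field.
sub wrap_lines {
    my ($text, $width) = @_;
    my $picture = '^' . ('<' x ($width - 1)) . "\n";
    my @lines;
    while (length $text) {
        $^A = '';
        formline($picture, $text);   # chops the printed part off the front of $text
        my $line = $^A;
        chomp $line;
        push @lines, $line;
    }
    $^A = '';
    return @lines;
}

my $message = "Hi Simon, Thank you for the supply of widgets "
            . "that you sent me last week.";
print "$_\n" for wrap_lines($message, 31);
```

Because `wrap_lines` works on its own copy of the text, the caller's `$message` is left untouched; inside a real format, the variable named on the argument line is the one that gets consumed.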

Another boon of using formats is that you can set a header to be sent out at the top of each page. Perl keeps track of how many lines have been printed by a format, so it knows when to start the next page. The header for a particular filehandle is a format named with _TOP appended to the filehandle's name. The simplest use of this is to give column headers to your one-line records:

  format STDOUT_TOP =
  ID    Received   From                                      Subject
  ===== ========== ========================================= ====================
  .

  format STDOUT =
  @<<<< @<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<
  $_->id, $_->received, $_->from_address, $_->subject
  .
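A header format does not have to belong to STDOUT. As a rough sketch, the example below attaches a body format and a header format to an in-memory filehandle using the `format_name` and `format_top_name` methods from IO::Handle, so the finished report ends up in a string. The format names `MAIL` and `MAIL_TOP`, the `report` helper, the package variables, and the sample rows are all invented for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

our ($id, $subject);   # package variables are always visible to formats

format MAIL_TOP =
ID    Subject
===== ====================
.

format MAIL =
@<<<< @<<<<<<<<<<<<<<<<<<<
$id,  $subject
.

# Hypothetical helper: write each [id, subject] row and return the report text.
sub report {
    my @rows = @_;
    my $buffer = '';
    open my $out, '>', \$buffer or die "Cannot open in-memory handle: $!";
    $out->format_name('MAIL');          # body format for this handle
    $out->format_top_name('MAIL_TOP');  # header format for this handle
    for my $row (@rows) {
        ($id, $subject) = @$row;
        write($out);
    }
    close $out;
    return $buffer;
}

print report([1, 'Widgets'], [2, 'Gadgets']);
```

The header is emitted once, before the first record, and would be repeated only when a page fills up.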

Formats are quite handy, especially as you can associate different formats with different filehandles and send data out to multiple locations in different ways. On the other hand, they have some serious shortcomings that you should bear in mind if you're thinking of using them in a bigger application.

First, they're a camping ground for obscure special variables: $% is the current format page number, $= is the number of printable lines per page, $- is the number of lines currently left on the page, $~ is the name of the current output format, $^ is the name of the current header format, and so on. I could not remember a single one of these variables and had to look them up in perlvar.
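To make those variables a little more concrete, here is a small sketch that sets a deliberately tiny page length and then reads back the page count. One assumption worth stating loudly: these variables act on the currently selected output handle, which is why the code calls `select` first. The format names, the `paged_report` helper, and the sample items are made up for the example.

```perl
use strict;
use warnings;

our $item;

format ROW_TOP =
Page @<
$%
.

format ROW =
@<<<<<<<<<
$item
.

# Hypothetical helper: returns the final page number and the report text.
sub paged_report {
    my ($lines_per_page, @items) = @_;
    my $buffer = '';
    open my $out, '>', \$buffer or die "Cannot open in-memory handle: $!";
    my $previous = select($out);   # the variables below apply to the selected handle
    $~ = 'ROW';                    # name of the body format
    $^ = 'ROW_TOP';                # name of the header format
    $= = $lines_per_page;          # printable lines per page, header included
    for my $value (@items) {
        $item = $value;
        write;
    }
    my $pages = $%;                # current page number
    select($previous);
    close $out;
    return ($pages, $buffer);
}

my ($pages, $text) = paged_report(3, qw(a b c d e));
print "pages used: $pages\n";
```

With three lines per page and a one-line header, each page holds two records, so five records should spill onto a third page, with a form feed written between pages.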

Formats also deal pretty badly with lexical variables, changing filehandles, variable-length lines, changing formats on the fly, and so on. But they're handy for neat little hacks.

For complete details on Perl's built-in formats, read perlform.
