Introduction to mod_perl (part 4): Perl Basics
By: Stas Bekman
    2003-01-03


    Table of Contents:
  • Introduction to mod_perl (part 4): Perl Basics
  • Using Global Variables and Sharing Them Between Modules/Packages
  • Making Variables Global With strict Pragma On
  • Using Exporter.pm to Share Global Variables
  • Using the Perl Aliasing Feature to Share Global Variables
  • Using Non-Hardcoded Configuration Module Names
  • The Scope of the Special Perl Variables
  • Compiled Regular Expressions
  • References

    Introduction to mod_perl (part 4): Perl Basics - Compiled Regular Expressions
    ( Page 8 of 9 )

    Finally, I want to cover a pitfall many people have fallen into: the use of regular expressions under mod_perl.

    When using a regular expression that contains an interpolated Perl variable, if it is known that the variable (or variables) will not change during the execution of the program, a standard optimization technique is to add the /o modifier to the regex pattern. This directs the compiler to build the internal table once, for the entire lifetime of the script, rather than every time the pattern is executed. Consider:

      my $pat = '^foo$'; # likely to be input from an HTML form field
      foreach( @list ) {
        print if /$pat/o;
      }

    This is usually a big win in loops over lists, or when using the grep() or map() operators.

    In long-lived mod_perl scripts, however, the variable may change with each invocation and this can pose a problem. The first invocation of a fresh httpd child will compile the regex and perform the search correctly. However, all subsequent uses by that child will continue to match the original pattern, regardless of the current contents of the Perl variables the pattern is supposed to depend on. Your script will appear to be broken.
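
    The effect can be reproduced outside mod_perl by running the same match twice in one process. The following is only a minimal sketch: match_once() is a hypothetical helper, not part of the article's code, and a single long-lived interpreter is assumed to stand in for an httpd child.

```perl
use strict;
use warnings;

# Hypothetical helper: stands in for code that a long-lived
# httpd child runs once per request.
sub match_once {
    my ($pat, $str) = @_;
    # /o freezes the pattern the first time this match is executed.
    return $str =~ /$pat/o ? 1 : 0;
}

my $first  = match_once('^foo$', 'foo');   # pattern is compiled here
my $second = match_once('^bar$', 'bar');   # still tested against ^foo$

print "first=$first second=$second\n";     # prints "first=1 second=0"
```

    The second call passes a different pattern, but the match operator keeps using the one it compiled during the first call, so 'bar' is reported as a non-match.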

    There are two solutions to this problem:

    The first is to use eval q//, to force the code to be evaluated each time. Just make sure that the eval block covers the entire loop of processing, and not just the pattern match itself.

    The above code fragment would be rewritten as:

      my $pat = '^foo$';
      eval q{
        foreach( @list ) {
          print if /$pat/o;
        }
      };
      die $@ if $@; # report compile or run-time errors from the eval

    Writing just:

      foreach( @list ) {
        eval q{ print if /$pat/o; };
      }

    means that the regex is recompiled for every element in the list, even though the regex doesn't change.
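
    To see the eval q// approach behave correctly across repeated invocations, here is a small self-contained sketch. The filter_list() routine and its sample data are assumptions made for illustration; the routine collects matches instead of printing them so that the result is easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical per-request routine. The string eval is compiled anew on
# every call, so /o freezes the pattern only for the duration of one loop.
sub filter_list {
    my ($pat, @list) = @_;
    my @hits;
    eval q{
        foreach (@list) {
            push @hits, $_ if /$pat/o;
        }
        1;
    } or die "eval failed: $@";
    return @hits;
}

my @first  = filter_list('^foo$', qw(foo food bar));
my @second = filter_list('^bar$', qw(foo food bar));

print "@first | @second\n";   # prints "foo | bar"
```

    Each call pays for one compilation of the evaluated code, and the pattern is then reused for every element of the list within that call.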

    Use this approach if you need more than one pattern match operator in a given section of code. If the section contains only one operator (be it an m// or s///), you can rely on a property of the empty pattern: it reuses the last successfully matched pattern. This leads to the second solution, which also eliminates the use of eval.

    The above code fragment becomes:

      my $pat = '^foo$';
      "something" =~ /$pat/; # dummy match (MUST NOT FAIL!)
      foreach( @list ) {
        print if //;
      }
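
    A runnable sketch of the same idea follows. The filter_list() routine and the $sample argument are hypothetical: the caller is assumed to supply a string that is known to match the pattern, playing the role of the fixed text "something" above.

```perl
use strict;
use warnings;

# Hypothetical per-request routine: $sample must be a string that the
# caller knows will match $pat.
sub filter_list {
    my ($pat, $sample, @list) = @_;
    my @hits;
    # Dummy match: on success, /$pat/ becomes the last successfully
    # matched pattern, which the empty pattern below reuses.
    $sample =~ /$pat/ or die "dummy match failed for pattern '$pat'";
    foreach (@list) {
        push @hits, $_ if //;   # empty pattern: reuse the pattern above
    }
    return @hits;
}

my @first  = filter_list('^foo$', 'foo', qw(foo food bar));
my @second = filter_list('^bar$', 'bar', qw(foo food bar));

print "@first | @second\n";   # prints "foo | bar"
```

    Because no /o is involved, the pattern is recompiled whenever the variable changes between calls, but inside the loop the empty pattern does not interpolate anything.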

    The only gotcha is that the dummy match that boots the regular expression engine must absolutely, positively succeed, otherwise the pattern will not be cached, and the // will match everything. If you can't count on fixed text to ensure the match succeeds, you have two possibilities.

    If you can guarantee that the pattern variable contains no meta-characters (things like *, +, ^, $...), you can use the dummy match:

      $pat =~ /\Q$pat\E/; # guaranteed if no meta-characters present

    If there is a possibility that the pattern can contain meta-characters, you should search for the pattern or the non-searchable \377 character as follows:

      "\377" =~ /$pat|^\377$/; # guaranteed if meta-characters present

    Another approach:

    Whether it pays off depends on the complexity of the regex to which you apply this technique. One common case where a compiled regex is usually more efficient is to "match any one of a group of patterns" over and over again.

    A helper routine makes this easier to remember. Here is one, slightly modified from Jeffrey Friedl's example in his book "Mastering Regular Expressions".

      #####################################################
      # Build_MatchMany_Function
      # -- Input:  list of patterns
      # -- Output: A code ref which matches its $_[0]
      #            against ANY of the patterns given in the
      #            "Input", efficiently.
      #
      sub Build_MatchMany_Function {
        my @R = @_;
        my $expr = join '||', map { "\$_[0] =~ m/\$R[$_]/o" } ( 0..$#R );
        my $matchsub = eval "sub { $expr }";
        die "Failed in building regex @R: $@" if $@;
        $matchsub;
      }

    Example usage:

      my @some_browsers = qw(Mozilla Lynx MSIE AmigaVoyager lwp libwww);
      my $Known_Browser = Build_MatchMany_Function(@some_browsers);

      while (<ACCESS_LOG>) {
        # ...
        my $browser = get_browser_field($_);
        if ( ! $Known_Browser->($browser) ) {
          print STDERR "Unknown Browser: $browser\n";
        }
        # ...
      }
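
    For comparison, the same kind of helper can be sketched with the qr// operator, which compiles each pattern into a reusable regex object. This is a different technique from the eval-generated code above, not the article's method; it assumes a Perl that provides qr// (5.005 or later), and build_match_many() is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical qr//-based helper: returns a code ref that reports
# whether its argument matches any of the given patterns.
sub build_match_many {
    my @compiled = map { qr/$_/ } @_;   # compile each pattern once
    return sub {
        my ($str) = @_;
        for my $re (@compiled) {
            return 1 if $str =~ $re;    # reuse the precompiled regex
        }
        return 0;
    };
}

my $known_browser = build_match_many(qw(Mozilla Lynx MSIE AmigaVoyager lwp libwww));

my $hit  = $known_browser->('Lynx/2.8.5');
my $miss = $known_browser->('SomethingElse/1.0');

print "hit=$hit miss=$miss\n";   # prints "hit=1 miss=0"
```

    Since no /o is used, building a new matcher with a different pattern list in a later request picks up the new patterns.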

    In the next article I'll present a few other Perl basics directly related to mod_perl programming.



     
     