Introduction to mod_perl (part 5): More Perl Basics
By: Stas Bekman
    2003-03-12

    When You Cannot Get Rid of The Inner Subroutine

    First you might wonder, why in the world will someone need to define an inner subroutine? Well, for example, to reduce some of Perl's script startup overhead you might decide to write a daemon that compiles the scripts and modules only once, and caches the pre-compiled code in memory. When some script is to be executed, you just tell the daemon the name of the script to run and it will do the rest, and do it much faster since compilation has already taken place.

    Seems like an easy task, and it is. The only problem is: once the script is compiled, how do you execute it? Or let's put it the other way: after it was executed for the first time and it stays compiled in the daemon's memory, how do you call it again? If you could get all developers to code their scripts so each has a subroutine called run() that will actually execute the code in the script, then we've solved half the problem.

    But how does the daemon know to refer to some specific script if they all run in the main:: namespace? One solution might be to ask the developers to declare a package in each and every script, and for the package name to be derived from the script name. However, since there is a chance that there will be more than one script with the same name but residing in different directories, then in order to prevent namespace collisions the directory has to be a part of the package name too. And don't forget that the script may be moved from one directory to another, so you will have to make sure that the package name is corrected every time the script gets moved.

    But why enforce these strange rules on developers, when we can arrange for our daemon to do this work? For every script that the daemon is about to execute for the first time, the script should be wrapped inside a package whose name is constructed from the mangled path to the script, and inside a subroutine called run(). For example, if the daemon is about to execute the script /tmp/hello.pl:

    hello.pl
    --------
    #!/usr/bin/perl
    print "Hello\n";

    Prior to running it, the daemon will change the code to be:

    wrapped_hello.pl
    ----------------
    package cache::tmp::hello_2epl;
    
    sub run{
    #!/usr/bin/perl 
    print "Hello\n";
    }

    The package name is constructed from the prefix cache::, each directory separation slash is replaced with ::, and non-alphanumeric characters are encoded so that, for example, . (a dot) becomes _2e (an underscore followed by the ASCII code for a dot in hex representation).

    % perl -e 'printf "%x",ord(".")'

    prints: 2e. This is the same scheme you see in URL encoding, except that URL encoding uses the % character as the prefix (%2E). Since % has a special meaning in Perl (it is the prefix of a hash variable), it couldn't be used here, so an underscore takes its place.
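
    The mangling rule described above can be sketched as a small function. This is only an illustration of the rule as stated in the text: the name path_to_package() is hypothetical, and the exact set of characters that a real implementation leaves unencoded may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# path_to_package() is a hypothetical helper name, not part of any module.
sub path_to_package {
    my ($path) = @_;
    (my $name = $path) =~ s{^/+}{};          # drop the leading slash(es)
    # Encode everything except letters, digits and "/" as "_" plus hex code.
    $name =~ s{([^A-Za-z0-9/])}{sprintf "_%02x", ord $1}ge;
    $name =~ s{/+}{::}g;                     # directory separators become ::
    return "cache::$name";
}

print path_to_package('/tmp/hello.pl'), "\n";   # prints cache::tmp::hello_2epl
```

    Because the underscore itself is also encoded in this sketch, two different paths cannot be mangled into the same package name.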

    Now when the daemon is requested to execute the script /tmp/hello.pl, all it has to do is to build the package name as before, based on the location of the script, and call its run() subroutine:

    use cache::tmp::hello_2epl;
    cache::tmp::hello_2epl::run();

    We have just written a partial prototype of the daemon we wanted. The only outstanding problem is how to pass the path to the script to the daemon. This detail is left as an exercise for the reader.
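
    To make the compile-once idea concrete, here is a minimal sketch of the caching step, assuming the daemon already has the package name and the text of the script in hand. The names compile_once(), execute() and %compiled are hypothetical, and the script source is passed in as a string only to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %compiled;   # package name => 1 once its code has been compiled

# compile_once() is a hypothetical name. $source is the text of the script;
# a real daemon would read it from the file the package name was built from.
sub compile_once {
    my ($package, $source) = @_;
    return if $compiled{$package};
    # Wrap the script in its own package and a run() subroutine.
    my $wrapped = "package $package;\nsub run {\n$source\n}\n1;\n";
    eval $wrapped or die "Cannot compile $package: $@";
    $compiled{$package} = 1;
    return;
}

sub execute {
    my ($package, $source) = @_;
    compile_once($package, $source);        # a no-op after the first call
    my $run = $package->can('run')
        or die "No run() in $package";
    return $run->();
}

# The second call reuses the subroutine compiled by the first.
execute('cache::tmp::hello_2epl', qq{print "Hello\\n";}) for 1 .. 2;
```

    The string eval is what turns the script's top-level code, including any named subroutines it defines, into the body of run().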

    If you are familiar with the Apache::Registry module, you know that it works in almost the same way. It uses a different package prefix and the generic function is called handler() and not run(). The scripts to run are passed through the HTTP protocol's headers.

    Now you understand that there are cases where your normal subroutines can become inner ones. If your script was a simple:

    simple.pl
    ---------
    #!/usr/bin/perl 
    sub hello { print "Hello" }
    hello();

    Wrapped into a run() subroutine it becomes:

    simple.pl
    ---------
    package cache::simple_2epl;
    
    sub run{
    #!/usr/bin/perl 
    sub hello { print "Hello" }
    hello();
    }

    Therefore, hello() is an inner subroutine, and if you have used my() scoped variables defined and altered outside and used inside hello(), it won't work as you expect starting from the second call, as was explained in the previous section.

    Remedies for Inner Subroutines

    First of all there is nothing to worry about, as long as you don't forget to turn warnings on. If you do happen to have the "my() Scoped Variable in Nested Subroutines" problem, Perl will always alert you.

    Given that you have a script that has this problem, what are the ways to solve it? There are many of them and we will discuss some of them here.

    We will use the following code to show the different solutions.

    multirun.pl
    -----------
    #!/usr/bin/perl -w
    
    use strict;
    for (1..3){
        print "run: [time $_]\n";
        run();
    }
    sub run{
        my $counter = 0;
        increment_counter();
        increment_counter();
        sub increment_counter{
            $counter++;
            print "Counter is equal to $counter !\n";
        }
    } # end of sub run

    This code executes the run() subroutine three times, which in turn initializes the $counter variable to 0 every time it is executed, and then calls the inner subroutine increment_counter() twice. Sub increment_counter() prints $counter's value after incrementing it. One might expect to see the following output:

    run: [time 1]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 2]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 3]
    Counter is equal to 1 !
    Counter is equal to 2 !

    But as we have already learned from the previous sections, this is not what we are going to see. Indeed, when we run the script we see:

    % ./multirun.pl
    Variable "$counter" will not stay shared at ./multirun.pl line 13.
    run: [time 1]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 2]
    Counter is equal to 3 !
    Counter is equal to 4 !
    run: [time 3]
    Counter is equal to 5 !
    Counter is equal to 6 !

    Obviously, the $counter variable is not reinitialized on each execution of run(). It retains its value from the previous execution, and sub increment_counter() increments that.

    One of the workarounds is to use globally declared variables, with the vars pragma.

    multirun1.pl
    ------------
    #!/usr/bin/perl -w
    
    use strict;
    use vars qw($counter);
    for (1..3){
        print "run: [time $_]\n";
        run();
    }
    sub run {
        $counter = 0;
        increment_counter();
        increment_counter();
        sub increment_counter{
            $counter++;
            print "Counter is equal to $counter !\n";
        }
    } # end of sub run

    If you run this and the other solutions offered below, the expected output will be generated:

    % ./multirun1.pl
    
    run: [time 1]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 2]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 3]
    Counter is equal to 1 !
    Counter is equal to 2 !

    By the way, the warning we saw before has gone, and so has the problem, since there is no my() (lexically defined) variable used in the nested subroutine.
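
    On Perl 5.6 and later the same workaround is usually written with an our declaration, which the vars documentation itself describes as superseding the pragma. The following sketch is a variation on multirun1.pl, not one of the original listings; the only changes are our in place of use vars and use warnings in place of -w.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $counter;   # package variable $main::counter, visible to strict code

for (1..3) {
    print "run: [time $_]\n";
    run();
}

sub run {
    $counter = 0;
    increment_counter();
    increment_counter();

    sub increment_counter {
        $counter++;
        print "Counter is equal to $counter !\n";
    }
} # end of sub run
```

    As with the vars version, increment_counter() refers to a package variable rather than to a lexical of run(), so there is nothing for the inner subroutine to capture.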

    Another approach is to use fully qualified variables. This is better, since less memory will be used, but it adds a typing overhead:

    multirun2.pl
    ------------
    #!/usr/bin/perl -w
    
    use strict;
    for (1..3){
        print "run: [time $_]\n";
        run();
    }
    sub run {
        $main::counter = 0;
        increment_counter();
        increment_counter();
        sub increment_counter{
            $main::counter++;
            print "Counter is equal to $main::counter !\n";
        }
    } # end of sub run

    You can also pass the variable to the subroutine by value and make the subroutine return it after it was updated. This adds time and memory overheads, so it may not be a good idea if the variable can be very large, or if speed of execution is an issue.

    Don't rely on the fact that the variable is small during the development of the application; it can grow quite big in situations you don't expect. For example, a very simple HTML form text entry field can return a few megabytes of data if one of your users is bored and wants to test how good your code is. It's not uncommon to see users copy-and-paste 10MB core dump files into a form's text fields and then submit them for your script to process.

    multirun3.pl
    ------------
    #!/usr/bin/perl -w
    
    use strict;
    for (1..3){
        print "run: [time $_]\n";
        run();
    }
    sub run {
        my $counter = 0;
        $counter = increment_counter($counter);
        $counter = increment_counter($counter);
        sub increment_counter{
            my $counter = shift;
            $counter++;
            print "Counter is equal to $counter !\n";
            return $counter;
        }
    } # end of sub run

    Finally, you can use references to do the job. The version of increment_counter() below accepts a reference to the $counter variable and increments its value after first dereferencing it. When you use a reference, the variable you use inside the function is physically the same bit of memory as the one outside the function. This technique is often used to enable a called function to modify variables in a calling function.

    multirun4.pl
    ------------
    #!/usr/bin/perl -w
    
    use strict;
    for (1..3){
        print "run: [time $_]\n";
        run();
    }
    sub run {
        my $counter = 0;
        increment_counter(\$counter);
        increment_counter(\$counter);
        sub increment_counter{
            my $r_counter = shift;
            $$r_counter++;
            print "Counter is equal to $$r_counter !\n";
        }
    } # end of sub run

    Here is yet another and more obscure reference usage. We modify the value of $counter inside the subroutine by using the fact that variables in @_ are aliases for the actual scalar parameters. Thus if you called a function with two arguments, those would be stored in $_[0] and $_[1]. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not updatable, as would be the case when calling the function with a literal, e.g. increment_counter(5)).

    multirun5.pl
    ------------
    #!/usr/bin/perl -w
    
    use strict;
    for (1..3){
        print "run: [time $_]\n";
        run();
    }
    sub run {
        my $counter = 0;
        increment_counter($counter);
        increment_counter($counter);
        sub increment_counter{
            $_[0]++;
            print "Counter is equal to $_[0] !\n";
        }
    } # end of sub run

    The approach given above is generally not recommended because most Perl programmers will not expect $counter to be changed by the function; the example where we used \$counter, i.e. pass-by-reference, would be preferred.
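
    The two behaviours of @_ aliasing mentioned above, updating the caller's variable and failing on a literal, can be seen side by side in a short sketch. It is a stand-alone illustration rather than one of the multirun listings, and it uses eval only to trap the error so that the script keeps running.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub increment_counter { $_[0]++ }   # modifies the caller's argument via the alias

my $counter = 0;
increment_counter($counter);
print "Counter is equal to $counter !\n";

# A literal cannot be modified, so the call dies; eval traps the error.
my $ok = eval { increment_counter(5); 1 };
print "Literal argument failed: $@" unless $ok;
```

    The second call fails at run time rather than at compile time, which is one more reason the explicit \$counter form is easier to maintain.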

    Here is a solution that avoids the problem entirely by splitting the code into two files; the first is really just a wrapper and loader, the second file contains the heart of the code.

    multirun6.pl
    ------------
    #!/usr/bin/perl -w
    
    use strict;
    # The leading "./" is needed on Perl 5.26 and later,
    # where "." is no longer in @INC.
    require './multirun6-lib.pl';
    for (1..3){
        print "run: [time $_]\n";
        run();
    }

    Separate file:

    multirun6-lib.pl
    ----------------
    use strict;
    
    my $counter;
    sub run {
        $counter = 0;
        increment_counter();
        increment_counter();
    }
    
    sub increment_counter{
        $counter++;
        print "Counter is equal to $counter !\n";
    }
    1;

    Now you have at least six workarounds to choose from.
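
    One further option, not among the six listed above and offered here only as a sketch, is to make the inner subroutine anonymous and keep it in a lexical variable. An anonymous subroutine is a closure that is created afresh each time run() executes, so it sees the current $counter. The variable name $increment_counter is just an illustrative choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

for (1..3) {
    print "run: [time $_]\n";
    run();
}

sub run {
    my $counter = 0;
    # An anonymous sub is created anew on every call to run(), so it
    # captures the current $counter rather than the first one.
    my $increment_counter = sub {
        $counter++;
        print "Counter is equal to $counter !\n";
    };
    $increment_counter->();
    $increment_counter->();
    return $counter;
} # end of sub run
```

    This only helps when you control the script's source; it does not change how an existing named inner subroutine behaves once the script has been wrapped.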

    For more information please refer to the perlref and perlsub manpages.
