Debugging Perl

Every developer knows that debugging is one of the most important parts of coding. This two-part article focuses on Perl debuggers. It is excerpted from chapter four of Mastering Perl, written by Brian D Foy (O’Reilly; ISBN: 0596527241). Copyright © 2007 O’Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O’Reilly Media.

The standard Perl distribution comes with a debugger, although it’s really just another Perl program, perl5db.pl. Since it is just a program, I can use it as the basis for writing my own debuggers to suit my needs, or I can use the interface perl5db.pl provides to configure its actions. That’s just the beginning, though. I can write my own debugger or use one of the many debuggers created by other Perl masters.

Before You Waste Too Much Time

Before I get started, I’m almost required to remind you that Perl offers two huge debugging aids: strict and warnings. I have the most trouble with smaller programs for which I don’t think I need strict and then I make the stupid mistakes it would have caught. I spend much more time than I should have tracking down something Perl would have shown me instantly. Common mistakes seem to be the hardest for me to debug. Learn from the master: don’t discount strict or warnings for even small programs.

Now that I’ve said that, you’re going to look for it in the examples in this chapter. Just pretend those lines are there, and the book costs a bit less for the extra half a page that I saved by omitting those lines. Or if you don’t like that, just imagine that I’m running every program with both strict and warnings turned on from the command line:

  $ perl -Mstrict -Mwarnings program

Along with that, I have another problem that bites me much more than I should be willing to admit. Am I editing the file on the same machine I’m running it on? I have login accounts on several machines, and my favorite terminal program has tabs so I can have many sessions in one window. It’s easy to checkout source from a repository and work just about anywhere. All of these nifty features conspire to get me into a situation where I’m editing a file in one window and trying to run it in another, thinking I’m on the same machine. If I’m making changes but nothing is changing in the output or behavior, it takes me longer than you’d think to figure out that the file I’m running is not the same one I’m editing. It’s stupid, but it happens. Discount nothing while debugging!

That’s a bit of a funny story, but I included it to illustrate a point: when it comes to debugging, Humility is one of the principal virtues of a maintenance programmer.* My best bet in debugging is to think that I’m the problem. That way, I don’t rule out anything or try to blame the problem on something else, like I often see in various Perl forums under titles such as “Possible bug in Perl.” When I suspect myself first, I’m usually right. Appendix B is my guide to solving any problem, which people have found useful for at least figuring out what might be wrong even if they can’t fix it.

{mospagebreak title=The Best Debugger in the World}

No matter how many different debugger applications or integrated development environments I use, I still find that plain ol’ print is my best debugger. I could load source into a debugger, set some inputs and breakpoints, and watch what happens, but often I can insert a couple of print statements and simply run the program normally.I put braces around the variable so I can see any leading or trailing whitespace:

  print "The value of var before is [$var]n";

  #… operations affecting $var;

  print "The value of var after is [$var]n";

I don’t really have to use print because I can do the same thing with warn, which sends its output to standard error:

  warn "The value of var before is [$var]";

  #… operations affecting $var;

  warn "The value of var after is [$var]";

Since I’ve left off the newline at the end of my warn message, it gives me the filename and line number of the warn:

  The value of var before is [$var] at program.pl line 123.

If I have a complex data structure, I use Data::Dumper to show it. It handles hash and array references just fine, so I use a different character, the angle brackets in this case, to offset the output that comes from Data::Dumper:

  use Data::Dumper qw(Dumper);
 
warn "The value of the hash is <n" . Dumper( %hash ) . "n>n";

Those warn statements showed the line number of the warn statement. That’s not very useful; I already know where the warn is since I put it there! I really want to know where I called that bit of code when it became a problem. Consider a divide subroutine that returns the quotient of two numbers. For some reason, something in the code calls it in such a way that it tries to divide by zero:

  sub divide
         
{
         
my( $numerator, $denominator ) = @_;

          return $numerator / $denominator;
          }

I know exactly where in the code it blows up because Perl tells me:

  Illegal division by zero at program.pl line 123.

I might put some debugging code in my subroutine. With warn, I can inspect the arguments:

  sub divide
         
{
         
my( $numerator, $denominator ) = @_;
         
warn "N: [$numerator] D: [$denominator]";

          return $numerator / $denominator;
          }

I might divide in many, many places in the code, so what I really need to know is which call is the problem. That warn doesn’t do anything more useful than show me the arguments.

Although I’ve called print the best debugger in the world, I actually use a disguised form in the carp function from the Carp module, part of the standard Perl distribution. It’s like warn, but it reports the filename and line number from the bit of code that called the subroutine:

  #!/usr/bin/perl
  use Carp qw(carp);

  printf "%.2fn", divide( 3, 4 );
  printf "%.2fn", divide( 1, 0 );
  printf "%.2fn", divide( 5, 4 );

  sub divide
         
{
         
my( $numerator, $denominator ) = @_;
         
carp "N: [$numerator] D: [$denominator]";

          return $numerator / $denominator;
          }

The output changes to something much more useful. Not only do I get my error message, but carp adds some information about the line of code that called it, and it shows me the argument list for the subroutine. I see that the call from line 4 is fine, but the call on line 5 is the last one before Perl kills the program:

  $ perl show-args.pl
  N: [3] D: [4] at show-args.pl line 11
                 main::divide(3, 4) called at show-args.pl line 4
  0.75
  N: [1] D: [0] at show-args.pl line 11
                  main::divide(1, 0) called at show-args.pl line 5
  Illegal division by zero at show-args.pl line 13.

The carp function is the better-informed version of warn. If I want to do the same thing with die, I use the croak function. It gives the same message as carp, but just like die, croak stops the program.

{mospagebreak title=Doing Whatever I Want}

I can change the warn and die functions myself by messing with %SIG. I like to use these to peer into code I’m trying to figure out, but I don’t use these to add features to code. It’s just part of my debugging toolbox.

The pseudokeys __WARN__ and __DIE__ hold the functions that perform those actions when I use the warn or die functions. I can use a reference to a named subroutine or an anonymous subroutine:

  $SIG{__DIE__} = &my_die_handler;
  $SIG{__DIE__} = sub { print "I’m about to die!" )

Without going through the entire code base, I can change all of the die calls into the more informative croak calls.§ In this example, I preface the subroutine call with an & and no parentheses to trigger Perl’s feature to pass on the current argument list to the next subroutine call so croak gets all of the arguments I pass:

  use Carp;
  $SIG{__DIE__} = sub { &Carp::croak };

  die "I’m going now!"; # really calls croak now

If I only want to do this for part of the code, I can use local (since %SIG is a special variable always in main::). My redefinition stays in effect until the end of the scope:

  local $SIG{__DIE__} = sub { &Carp::croak };

After either of my customized routines runs, the functions do what they would otherwise do; warn lets the program continue, and die continues its exception processing and eventually stops the program.

Since croak reports each level of the call stack and I called it from an anonymous subroutine, I get an artifact in my output:

  use Carp;

  print "Starting program…n";

  $SIG{__DIE__} = sub {
          local $Carp::CarpLevel = 0;

          &Carp::croak;
          };

  foo(); # program dies here

  sub foo { bar() }

  sub bar { die "Dying from bar!n"; }

In the stack trace, I see a subroutine call from __ANON__ followed by the subroutine calls I expect to bar() and foo():

  Starting program…
  Dying from bar!
  
at die.pl line 12
                 main::__ANON__(‘Dying from bar!x{a}’) called at die.pl line 20
                 main::bar() called at die.pl line 18
                 main::foo() called at die.pl line 16

I change my anonymous subroutine to adjust the position in the stack where croak starts its report. I set the value of $Carp::CarpLevel to the number of levels I want to skip, in this case just 1:

  $SIG{__DIE__} = sub {
          local $Carp::CarpLevel = 1;

          &Carp::croak;
          };

Now I don’t see the unwanted output:

  Starting program…
  Dying from bar!
  
at die.pl line 12
                  main::bar() called at die.pl line 18
                  main::foo() called at die.pl line 16

For a real-life example of this in action, check out the CGI::Carp module. Lincoln Stein uses the %SIG tricks to redefine warn and die in a web-friendly way. Instead of an annoying “Server Error 500” message, I can get useful error output by simply loading the module. While loading, CGI::Carp sets $SIG{__WARN__} and $SIG{__DIE__}:

  use CGI::Carp qw(fatalsToBrowser);

The fatalsToBrowser function takes over the resulting page to show me the error, but the module has other interesting functions such as set_message, which can catch compile-time errors and warningsToBrowser, which makes the warnings in HTML comments embedded in the output.

Of course, I don’t recommend that you use this in production code. I don’t want users to see the program’s errors. They can be handy when I have to debug a program on a remote server, but once I figure out the problem, I don’t need it anymore. By leaving it in there I let the public figure out how I’m doing things, and that’s bad for security.

{mospagebreak title=Program Tracing}

The Carp module also provides the cluck and confess subroutines to dump stack traces. cluck is akin to warn (or carp) in that it prints its message but lets the program continue. confess does the same thing, but like die, stops the program once it prints its mess.#

Both cluck and confess print stack traces, which show the list of subroutine calls and their arguments. Each subroutine call puts a frame with all of its information onto the stack. When the subroutine finishes, Perl removes the frame for that subroutine, and then Perl looks on the stack for the next frame to process. Alternately, if a subroutine calls another subroutine, that puts another frame on the stack.

Here’s a short program that has a chain of subroutine calls. I call the do_it function, which calls multiply_and_divide, which in turn calls the divide. Now, in this situation, I’m not getting the right answer for dividing 4 by 5. In this short example, you can probably spot the error right away, but imagine this is a huge mess of arguments, subroutine calls, and other madness:

  #!/usr/bin/perl
  use warnings;
 
use Carp qw(cluck);

  print join " ", do_it( 4, 5 ), "n";

  sub do_it
          {
          my( $n, $m ) = @_;

          my $sum = $n + $m;

          my( $product, $quotient ) =
                  multiply_and_divide( [ $n, $m ], 6, { cat => ‘buster’ } );

          return ( $sum, $product, $quotient );
          }

  sub multiply_and_divide
          {
          my( $n, $m ) = @{$_[0]};

          my $product = $n * $m;
          my $quotient = divide( $n, $n );

          return ( $product, $quotient );
          }

  sub divide
          {
          my( $n, $m ) = @_;
          my $quotient = $n / $m;
          }

I suspect that something is not right in the divide subroutine, but I also know that it’s at the end of a chain of subroutine calls. I want to examine the path that got me to divide, so I want a stack trace. I modify divide to use cluck, the warn version of Carp’s stack tracing, and I put a line of hyphens before and after the cluck() to set apart its output to make it easier to read:

  sub divide
          {
          print "-" x 73, "n";
          cluck();
          print "-" x 73, "n";

          my( $n, $m ) = @_;

          my $quotient = $n / $m;

          }

The output shows me the list of subroutine calls, with the most recent subroutine call first (so, the list shows the stack order). The stack trace shows me the package name, subroutine name, and the arguments. Looking at the arguments to divide, I see a repeated 4. One of those arguments should be 5. It’s not divide’s fault after all:

  —————————————
 
at confess.pl line 68 
  —————————————
                 main::divide(4, 4) called at confess.pl line 60
              main::multiply_and_divide(‘ARRAY(0x180064c)’) called at confess.pl line 49
              main::do_it(4, 5) called at confess.pl line 41
  9 20 1

It’s not a problem with divide, but with the information I sent to it. That’s from multiply_and_divide, and looking at its call to divide I see that I passed the same argument twice. If I’d been wearing my glasses, I might have been able to notice that $n might look like $m, but really isn’t:

  my $quotient = divide( $n, $n ); # WRONG

  my $quotient = divide( $n, $m ); # SHOULD BE LIKE THIS

This was a simple example, and still Carp had some problems with it. In the argument list for multiply_and_divide, I just get ‘ARRAY(0x180064c)’. That’s not very helpful. Luckily for me, I know how to customize modules (Chapters 9 and 10), and by looking at Carp, I find that the argument formatter is in Carp::Heavy. The relevant part of the subroutine has a branch for dealing with references:

  package Carp;
  # This is in Carp/Heavy.pm

  sub format_arg {
    my $arg = shift;

         …

    elsif (ref($arg)) {
            $arg = defined($overload::VERSION) ? overload::StrVal($arg) : "$arg";
    }
         …

    return $arg;
  }

If format_arg sees a reference, it checks for the overload module, which lets me define my own actions for Perl operations, including stringification. If Carp sees that I’ve somehow loaded overload, it tries to use the overload::StrVal subroutine to turn the reference into something I can read. If I haven’t loaded overload, it simply interpolates the reference in double quotes, yielding something like the
ARRAY(0x180064c) I saw before.

The format_arg function is a bit simple-minded, though. I might have used the overload module in one package, but that doesn’t mean I used it in another. Simply checking that I’ve used it once somewhere in the program doesn’t mean it applies to every reference. Additionally, I might not have even used it to stringify references. Lastly, I can’t really retroactively use overload for all the objects and references in a long stack trace, especially when I didn’t create most of those modules. I need a better way.

I can override Carp’s format_arg to do what I need. I copy the existing code into a BEGIN block so I can bend it to my will. First, I load its original source file, Carp::Heavy, so I get the original definition loaded first. I replace the subroutine definition by assigning to its typeglob. If the subroutine argument is a reference, I pull in Data::Dumper, set some Dumper parameters to fiddle with the output format, then get its stringified version of the argument:

  BEGIN {
  use Carp::Heavy;

  no warnings ‘redefine';
 
*Carp::format_arg = sub {
          package Carp;
          my $arg = shift;

          if( not defined $arg )
                  { $arg = ‘undef’ }
         
elsif( ref $arg )
                  {
                  use Data::Dumper;
                  local $Data::Dumper::Indent = 0; # salt to taste
                  local $Data::Dumper::Terse = 0;
                  $arg = Dumper( $arg );
                  $arg =~ s/^$VARd+s*=s*//;
                  $arg =~ s/;s*$//;
                  } 
         
else
                  {
                  $arg =~ s/’/’/g;
                  $arg = str_len_trim($arg, $MaxArgLen);
                  $arg = "’$arg’" unless  
$arg =~ /^-?[d.]+z/;
                  }

          $arg =~ s/([[:cntrl:]]|[[:^ascii:]])/sprintf("\x{%x}",ord($1))/eg;
          return $arg;
          };
 
}

I do a little bit of extra work on the Dumper output. It normally gives me something I can use in eval, so it’s a Perl expression with an assignment to a scalar and a trailing semicolon. I use a couple of substitutions to get rid of these extras. I want to get rid of the Data::Dumper artifacts on the ends:

  $VAR = … ; # leave just the …

Now, when I run the same program I had earlier, I get better output. I can see in elements of the anonymous array that I passed to multiply_and_divide:

  ————————————–
  at confess.pl line 65
                 main::divide(4, 4) called at confess.pl line 57
                 main::multiply_and_divide([4,5]) called at confess.pl line 46
                 main::do_it(4, 5) called at confess.pl line 38 
 
9 20 1

The best part of all of this, of course, is that I only had to add cluck in one subroutine to get all of this information. I’ve used this for very complex situations with lots of arguments and complex data structures, giving me a Perl-style stack dump. It may be tricky to go through, but it’s almost painless to get (and to disable, too).

{mospagebreak title=Safely Changing Modules}

In the previous section I changed &Carp::format_arg to do something different. The general idea is very useful for debugging since I’m not only going to find bugs in the code that I write, but most often in the modules I use or in code that someone else wrote.

When I need to debug these things in other files, I want to add some debugging statements or change the code somehow to see what happens. However, I don’t want to change the original source files; whenever I do that I tend to make things worse no matter how careful I am to restore them to their original state. Whatever I do, I want to erase any damage I do and I don’t want it to affect anyone else.

I do something simple: copy the questionable module file to a new location. I set up a special directory for the debugging section just to ensure that my mangled versions of the modules won’t infect anything else. Once I do that, I set the PERL5LIB environment variable so Perl finds my mangled version first. When I’m done debugging, I can clear PERL5LIB to use the original versions again.

For instance, I recently needed to check the inner workings of Net::SMTP because I didn’t think it was handling the socket code correctly. I choose a directory to hold my copies, in this case ~/my_debug_lib, and set PERL5LIB to that path. I then create the directories I need to store the modified versions, then copy the module into it:

  $ export PERL5LIB=~/my_debug_lib
  $ mkdir -p ~/my_debug_lib/Net/
  $ cp `perldoc -l Net::SMTP` ~/my_debug_lib/Net/.

Now, I can edit ~/my_debug_lib/Net/SMTP.pm, run my code to see what happens, and work toward a solution. None of this has affected anyone else. I can do all the things I’ve already showed in this chapter, including inserting confess statements at the right places to get a quick dump of the call stack. Every time I wanted to investigate a new module, I copied it into my temporary debugging library directory.

Please check back next week for the conclusion to this article.

[gp-comments width="770" linklove="off" ]

chat sex hikayeleri Ensest hikaye