Introduction to mod_perl (part 6): Even More Perl Basics - use(), require(), do(), %INC and @INC Explained
The @INC array
@INC is a special Perl variable which is the equivalent of the
shell's PATH variable. Whereas PATH contains a list of
directories to search for executables, @INC contains a list of
directories from which Perl modules and libraries can be loaded.
When you use(), require() or do() a filename or a module, Perl gets a
list of directories from the @INC variable and searches them for
the file it was requested to load. If the file that you want to load
is not located in one of the listed directories, you have to tell Perl
where to find the file. You can either provide a path relative to one
of the directories in @INC, or you can provide the full path to the
file.
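The search described above can be approximated in a few lines. This is only a sketch: find_in_inc() is a hypothetical helper and not something Perl provides, and the real lookup also handles hooks and other special cases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# find_in_inc() is a hypothetical helper, not part of Perl itself.
# It returns the first path under @INC where $file exists, or undef.
sub find_in_inc {
    my ($file) = @_;
    for my $dir (@INC) {
        next if ref $dir;            # skip code-reference hooks
        my $path = "$dir/$file";
        return $path if -f $path;
    }
    return;
}

my $found = find_in_inc("strict.pm");
print "strict.pm => ", (defined $found ? $found : "not found"), "\n";
```

The path printed depends on where Perl is installed on your system.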
The %INC hash
%INC is another special Perl variable that is used to cache the
names of the files and the modules that were successfully loaded and
compiled by use(), require() or do() statements. Before attempting to
load a file or a module with use() or require(), Perl checks whether
it's already in the %INC hash. If it's there, the loading and
therefore the compilation are not performed at all. Otherwise the file
is loaded into memory and an attempt is made to compile it. do() does
unconditional loading--no lookup in the %INC hash is made.
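A small experiment can make the difference visible. The sketch below assumes a throwaway library file (counter.pl, a made-up name) that increments a counter each time it is executed:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our $loads = 0;
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter.pl";     # hypothetical library file
open my $fh, '>', $file or die "Cannot write $file: $!";
print $fh "\$main::loads++;\n1;\n";
close $fh or die "Cannot close $file: $!";

require $file;    # loaded and compiled, then recorded in %INC
require $file;    # already in %INC, so nothing happens
my $after_require = $loads;

do $file;         # no %INC lookup, so the file runs again
my $after_do = $loads;

print "after two require() calls: $after_require\n";   # prints 1
print "after one more do():       $after_do\n";        # prints 2
```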
If the file is successfully loaded and compiled, a new key-value pair
is added to %INC. The key is the name of the file or module as it
was passed to one of the three functions we have just mentioned. If
the file was found in any of the @INC directories except ".",
the value is the full path to it in the file system.
The following examples will make it easier to understand the logic.
First, let's see what the contents of @INC are on my system:
% perl -e 'print join "\n", @INC'
/usr/lib/perl5/5.00503/i386-linux
/usr/lib/perl5/5.00503
/usr/lib/perl5/site_perl/5.005/i386-linux
/usr/lib/perl5/site_perl/5.005
.
Notice that . (the current directory) is the last directory in the list.
Now let's load the module strict.pm and see the contents of %INC:
% perl -e 'use strict; print map {"$_ => $INC{$_}\n"} keys %INC'
strict.pm => /usr/lib/perl5/5.00503/strict.pm
Since strict.pm was found in the /usr/lib/perl5/5.00503/ directory
and /usr/lib/perl5/5.00503/ is a part of @INC, %INC includes
the full path as the value for the key strict.pm.
Now let's create the simplest module in /tmp/test.pm:
test.pm
-------
1;
It does nothing, but returns a true value when loaded. Now let's load
it in different ways:
% cd /tmp
% perl -e 'use test; print map {"$_ => $INC{$_}\n"} keys %INC'
test.pm => test.pm
Since the file was found relative to . (the current directory), the
relative path is inserted as the value. If we alter @INC, by adding
/tmp to the end:
% cd /tmp
% perl -e 'BEGIN{push @INC, "/tmp"} use test; \
print map {"$_ => $INC{$_}\n"} keys %INC'
test.pm => test.pm
Here we still get the relative path, since the module was found first
relative to ".". The directory /tmp was placed after . in the
list. If we execute the same code from a different directory, the
"." directory won't match,
% cd /
% perl -e 'BEGIN{push @INC, "/tmp"} use test; \
print map {"$_ => $INC{$_}\n"} keys %INC'
test.pm => /tmp/test.pm
so we get the full path. We can also prepend the path with unshift(),
so it will be used for matching before "." and therefore we will
get the full path as well:
% cd /tmp
% perl -e 'BEGIN{unshift @INC, "/tmp"} use test; \
print map {"$_ => $INC{$_}\n"} keys %INC'
test.pm => /tmp/test.pm
The code:
BEGIN{unshift @INC, "/tmp"}
can be replaced with the more elegant:
use lib "/tmp";
This is almost equivalent to our BEGIN block and is the
recommended approach.
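To see that use lib takes effect at compile time, just as the BEGIN block does, you can inspect @INC from a BEGIN block placed right after it. In this sketch /tmp/example-lib is a made-up directory that does not need to exist:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use lib "/tmp/example-lib";    # hypothetical directory; it need not exist

our $position;
BEGIN {
    # Runs at compile time, right after the "use lib" line above.
    ($position) = grep { $INC[$_] eq "/tmp/example-lib" } 0 .. $#INC;
}
print "/tmp/example-lib is at index $position of \@INC\n";
```

The index is normally 0, because use lib prepends the directory in the same way unshift() does.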
These approaches to modifying @INC can be labor intensive, since if
you want to move the script around in the file-system you have to
modify the path. This can be painful, for example, when you move your
scripts from development to a production server.
There is a module called FindBin which solves this problem in the
plain Perl world, but unfortunately it won't work under mod_perl,
since it's a module and, like any module, it's loaded only once. So the
first script using it will have all the settings correct, but the rest
of the scripts will not if they are located in a different directory
from the first.
For the sake of completeness, I'll present this module anyway.
If you use this module, you don't need to write a hard-coded path. The
following snippet does all the work for you (the file is
/tmp/load.pl):
load.pl
-------
#!/usr/bin/perl
use FindBin ();
use lib "$FindBin::Bin";
use test;
print "test.pm => $INC{'test.pm'}\n";
In the above example $FindBin::Bin is equal to /tmp. If we move
the script somewhere else, e.g. to /tmp/x, then
$FindBin::Bin equals /tmp/x.
% /tmp/load.pl
test.pm => /tmp/test.pm
This is just like use lib except that no hard-coded path is
required.
You can use this workaround to make it work under mod_perl:
do 'FindBin.pm';
unshift @INC, "$FindBin::Bin";
require test;
# maybe call test->import( ... ) here if you need to import symbols
This has a slight overhead because it will load from disk and
recompile the FindBin module on each request. So it may not be
worth it.
Modules, Libraries and Program Files
Before we proceed, let's define what we mean by module, library
and program file.
- Libraries
These are files which contain Perl subroutines and other code.
When these are used to break up a large program into manageable chunks
they don't generally include a package declaration; when they are used
as subroutine libraries they often do have a package declaration.
Their last statement returns true; a simple 1; statement ensures
that.
They can be named in any way desired, but generally their extension is
.pl.
Examples:
config.pl
----------
# No package so defaults to main::
$dir = "/home/httpd/cgi-bin";
$cgi = "/cgi-bin";
1;
mysubs.pl
----------
# No package so defaults to main::
sub print_header{
print "Content-type: text/plain\r\n\r\n";
}
1;
web.pl
------------
package web ;
# Call like this: web::print_with_class('loud',"Don't shout!");
sub print_with_class{
my( $class, $text ) = @_ ;
print qq{<span class="$class">$text</span>};
}
1;
- Modules
A module is a file which contains Perl subroutines and other code.
It generally declares a package name at the beginning.
Modules are generally used either as function libraries (a role that .pl
files still play, though less commonly) or as object libraries,
where a module defines a class and its methods.
Its last statement returns true.
The naming convention requires it to have a .pm extension.
Example:
My/Module.pm
------------
package My::Module;
$My::Module::VERSION = 0.01;
sub new{ return bless {}, shift;}
END { print "Quitting\n"}
1;
- Program Files
Many Perl programs exist as a single file. Under Linux and other
Unix-like operating systems the file often has no suffix, since the
operating system can determine that it is a Perl script from the first
line (the shebang line). If it's Apache that executes the code, there
are a variety of ways to tell how and when the file should be executed.
Under Windows a suffix is normally used, for example .pl or
.plx.
The program file will normally require() any libraries and use()
any modules it requires for execution.
It will contain Perl code but won't usually have any package names.
Its last statement may return anything or nothing.
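As a rough illustration of how the three kinds of file fit together, here is a minimal program file that uses a class like the My::Module example above. The package is inlined only so that the sketch runs as a single file; normally it would live in My/Module.pm and be loaded with use My::Module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inlined copy of the module so the sketch runs as a single file;
# normally this would live in My/Module.pm and be loaded with use().
package My::Module;
our $VERSION = '0.01';
sub new { return bless {}, shift }

# The program itself has no package name of its own, so it is main.
package main;

my $obj = My::Module->new;
print ref($obj), " version $My::Module::VERSION\n";
```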
require()
require() reads a file containing Perl code and compiles it. Before
attempting to load the file it looks up the argument in %INC to see
whether it has already been loaded. If it has, require() just returns
without doing a thing. Otherwise an attempt will be made to load and
compile the file.
require() has to find the file it has to load. If the argument is a
full path to the file, it just tries to read it. For example:
require "/home/httpd/perl/mylibs.pl";
If the path is relative, require() will attempt to search for the file
in all the directories listed in @INC. For example:
require "mylibs.pl";
If there is more than one occurrence of the file with the same name in
the directories listed in @INC, the first occurrence will be used.
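The following sketch demonstrates the "first occurrence wins" rule. It assumes two temporary directories, created only for the demonstration, each of which holds its own copy of mylibs.pl:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Two hypothetical library directories, each holding its own mylibs.pl.
my $first  = tempdir(CLEANUP => 1);
my $second = tempdir(CLEANUP => 1);
for my $dir ($first, $second) {
    open my $fh, '>', "$dir/mylibs.pl" or die "Cannot write: $!";
    print $fh "1;\n";
    close $fh or die "Cannot close: $!";
}

# The search follows the order of @INC, so $first is tried before $second.
unshift @INC, $first, $second;
require "mylibs.pl";

print "loaded from: $INC{'mylibs.pl'}\n";
```

The path printed is the copy under the first of the two directories.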
The file must return TRUE as the last statement to indicate
successful execution of any initialization code. Since you never know
what changes the file will go through in the future, you cannot be
sure that the last statement will always return TRUE. That's why
the suggestion is to put "1;" at the end of the file.
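To see what happens when the true value is missing, consider this sketch. It writes a throwaway library (nottrue.pl, a made-up name) whose last statement is false, and traps the resulting error with eval():

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/nottrue.pl";      # hypothetical library file
open my $fh, '>', $file or die "Cannot write $file: $!";
print $fh "sub hello { return 'hi' }\n0;\n";   # last statement is false
close $fh or die "Cannot close $file: $!";

# require() dies when the file does not return a true value.
my $ok    = eval { require $file; 1 };
my $error = $ok ? "" : $@;
print "require failed: $error" unless $ok;
```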
Although you should use the real filename for most files, if the file
is a module, you may use the following convention instead:
require My::Module;
This is equal to:
require "My/Module.pm";
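The translation from module name to filename is simple enough to write out by hand. In the sketch below, module_to_file() is a hypothetical helper and not a Perl builtin; File::Basename is used only as a convenient core module to load.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# module_to_file() is a hypothetical helper that mirrors the
# translation described above; it is not a Perl builtin.
sub module_to_file {
    my ($module) = @_;
    (my $file = $module) =~ s{::}{/}g;
    return "$file.pm";
}

print module_to_file("My::Module"), "\n";   # prints My/Module.pm

# The resulting string is also the key that %INC uses.
require File::Basename;
print "File::Basename is loaded\n"
    if exists $INC{ module_to_file("File::Basename") };
```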
If require() fails to load the file, either because it couldn't find
the file in question, or the code failed to compile, or it didn't
return TRUE, then the program will die(). To prevent this the
require() statement can be enclosed in an eval() exception-handling
block, as in this example:
require.pl
----------
#!/usr/bin/perl -w
eval { require "/file/that/does/not/exists"};
if ($@) {
print "Failed to load, because : $@"
}
print "\nHello\n";
When we execute the program:
% ./require.pl
Failed to load, because : Can't locate /file/that/does/not/exists in
@INC (@INC contains: /usr/lib/perl5/5.00503/i386-linux
/usr/lib/perl5/5.00503 /usr/lib/perl5/site_perl/5.005/i386-linux
/usr/lib/perl5/site_perl/5.005 .) at require.pl line 3.
Hello
We see that the program didn't die(), because Hello was
printed. This trick is useful when you want to check whether a user
has some module installed but it's not critical if she hasn't:
perhaps the program can run without this module with reduced
functionality.
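The optional-module check can be wrapped in a small function. This is a sketch only: have_module() is a hypothetical helper, and the second module name is deliberately made up so that the lookup fails.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# have_module() is a hypothetical helper; the module names below are
# only examples.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ("File::Spec", "No::Such::Module::Hopefully") {
    printf "%-28s %s\n", $module,
        have_module($module) ? "available" : "missing, using fallback";
}
```

The program carries on in both cases, and can choose a reduced code path when the module is missing.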
If we remove the eval() part and try again:
require1.pl
-----------
#!/usr/bin/perl -w
require "/file/that/does/not/exists";
print "\nHello\n";
% ./require1.pl
Can't locate /file/that/does/not/exists in @INC (@INC contains:
/usr/lib/perl5/5.00503/i386-linux /usr/lib/perl5/5.00503
/usr/lib/perl5/site_perl/5.005/i386-linux
/usr/lib/perl5/site_perl/5.005 .) at require1.pl line 3.
The program just die()s in the last example, which is what you want in
most cases.
For more information refer to the perlfunc manpage.
use()
use(), just like require(), loads and compiles files containing Perl
code, but it works with modules only. The only way to pass a module
to load is by its module name and not its filename. If the module is
located in MyCode.pm, the correct way to use() it is:
use MyCode;
and not:
use "MyCode.pm";
use() translates the passed argument into a file name replacing ::
with the operating system's path separator (normally /) and
appending .pm at the end. So My::Module becomes My/Module.pm.
use() is exactly equivalent to:
BEGIN { require Module; Module->import(LIST); }
Internally it calls require() to do the loading and compilation
chores. When require() finishes its job, import() is called unless
() is the second argument. The following pairs are equivalent:
use MyModule;
BEGIN {require MyModule; MyModule->import; }
use MyModule qw(foo bar);
BEGIN {require MyModule; MyModule->import("foo","bar"); }
use MyModule ();
BEGIN {require MyModule; }
The first pair imports the default symbols. This happens if the module
sets @EXPORT to a list of symbols to be exported by default. The
module's manpage normally describes what symbols are exported by
default.
The second pair imports only the symbols passed as arguments.
The third pair describes the case where the caller does not want any
symbols to be imported.
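The equivalence can be tried out with modules from the standard distribution. The sketch below uses List::Util and File::Basename purely as convenient examples:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same effect as: use List::Util qw(sum);
BEGIN { require List::Util; List::Util->import(qw(sum)); }

# Same effect as: use File::Basename ();
BEGIN { require File::Basename; }

my $total = sum(1, 2, 3);                                  # imported name
my $base  = File::Basename::basename("/tmp/test.pm");      # full name
print "$total $base\n";                                    # prints 6 test.pm

# Nothing was imported from File::Basename into the main package.
print "basename() was not imported\n" unless main->can("basename");
```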
import() is not a builtin function, it's just an ordinary static
method call into the "MyModule" package to tell the module to
import the list of features back into the current package. See the
Exporter manpage for more information.
When you write your own modules, always remember that it's better to
use @EXPORT_OK instead of @EXPORT, since the former doesn't
export symbols unless it was asked to. Exports pollute the namespace
of the module user. Also avoid short or common symbol names to reduce
the risk of name clashes.
When functions and variables aren't exported you can still access them
using their full names, like $My::Module::bar or
My::Module::foo(). By convention you can use a leading underscore
on names to informally indicate that they are internal and not for
public use.
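A minimal sketch of these conventions follows. My::Util, trim() and _helper() are made-up names, and the module is inlined in a BEGIN block only so that the example runs as a single file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module, inlined in a BEGIN block so that the sketch is a
# single runnable file; normally it would live in My/Util.pm.
BEGIN {
    package My::Util;
    use Exporter 'import';
    our @EXPORT_OK = qw(trim);           # exported only on request
    sub trim    { my ($s) = @_; $s =~ s/^\s+|\s+$//g; return $s }
    sub _helper { return "internal" }    # leading underscore: not public
    $INC{'My/Util.pm'} = __FILE__;       # mark as loaded so use() skips the search
}

use My::Util qw(trim);

print "[", trim("  hello  "), "]\n";            # prints [hello]
print "[", My::Util::trim("  again "), "]\n";   # prints [again]
```

Because trim is listed in @EXPORT_OK rather than @EXPORT, a plain use My::Util; would import nothing.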
There's a corresponding "no" command that un-imports symbols
imported by use, i.e., it calls Module->unimport(LIST)
instead of import().
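The most familiar use of no is to relax a pragma for a small scope. The sketch below uses the strict pragma as an example; greet is a made-up subroutine name:

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    # Calls strict->unimport('refs'), which applies to this block only.
    no strict 'refs';
    my $name = "greet";                          # hypothetical sub name
    *{"main::$name"} = sub { return "hello" };   # symbolic reference allowed here
}

# Outside the block, strict 'refs' is in force again.
my $error = eval { my $n = "main::greet"; &$n(); 1 } ? "" : $@;

print greet(), "\n";                             # prints hello
print "symbolic call rejected: $error" if $error;
```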
do()
While do() behaves almost identically to require(), it reloads the
file unconditionally. It doesn't check %INC to see whether the file
was already loaded.
If do() cannot read the file, it returns undef and sets $! to
report the error. If do() can read the file but cannot compile it, it
returns undef and puts an error message in $@. If the file is
successfully compiled, do() returns the value of the last expression
evaluated.
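The three outcomes can be told apart as in the following sketch. The file names and the write_file() helper are made up for the demonstration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# write_file() is a hypothetical helper for setting up the test files.
sub write_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print $fh $content;
    close $fh or die "Cannot close $path: $!";
}
write_file("$dir/good.pl", "2 + 3;\n");
write_file("$dir/bad.pl",  "this is not ( valid perl;\n");

my $value = do "$dir/good.pl";          # value of the last expression
print "good.pl returned $value\n";      # prints good.pl returned 5

my $missing    = do "$dir/missing.pl";  # cannot be read: undef, reason in $!
my $read_error = $!;
print "missing.pl: $read_error\n" unless defined $missing;

my $broken        = do "$dir/bad.pl";   # cannot be compiled: undef, reason in $@
my $compile_error = $@;
print "bad.pl: $compile_error" unless defined $broken;
```

Unlike require(), do() does not die() in either failure case, so the caller has to check the return value.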