
Introduction to mod_perl (part 6): Even More Perl Basics

This is the third article in a series covering the essential Perl basics that you should know before you start programming for mod_perl.

  1. Introduction to mod_perl (part 6): Even More Perl Basics
  2. use(), require(), do(), %INC and @INC Explained
  3. References
By: Stas Bekman
April 08, 2003




You will hear a lot about namespaces, symbol tables and lexical scoping in Perl discussions, but little of it will make any sense without a few key facts:

Symbols, Symbol Tables and Packages; Typeglobs

There are two important types of symbol: package global and lexical. We will talk about lexical symbols later; for now we will talk only about package global symbols, which we will refer to simply as global symbols.

The names of pieces of your code (subroutine names) and the names of your global variables are symbols. Global symbols reside in one symbol table or another. The code itself and the data do not; the symbols are the names of pointers which point (indirectly) to the memory areas which contain the code and data. (Note for C/C++ programmers: we use the term `pointer' in the general sense of one piece of data referring to another piece of data, not in the specific sense used in C or C++.)

There is one symbol table for each package (which is why global symbols are really package global symbols).

You are always working in one package or another.

Just as a C program starts in a function called main(), the first statement of your first Perl script is in package main::, which is the default package. Unless you say otherwise by using the package statement, your symbols are all in package main::. You should be aware straight away that files and packages are not related. You can have any number of packages in a single file, and a single package can be in one file or spread over many files. However, it is very common to have a single package in a single file. To declare a package you write:

package mypackagename;

From the following line you are in package mypackagename and any symbols you declare reside in that package. When you create a symbol (variable, subroutine etc.) Perl uses the name of the package in which you are currently working as a prefix to create the fully qualified name of the symbol.
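As a minimal sketch of how the package prefix works, the following script declares a variable called $line in two packages. The variable name and the where() subroutine are made up for this illustration; only the package name comes from the text above.

```perl
use strict;
use warnings;

package mypackagename;

our $line = "in mypackagename";     # fully qualified: $mypackagename::line
sub where { return __PACKAGE__ }    # fully qualified: &mypackagename::where

package main;

our $line = "in main";              # a different variable: $main::line

print "$main::line\n";              # prints: in main
print "$mypackagename::line\n";     # prints: in mypackagename
print mypackagename::where(), "\n"; # prints: mypackagename
```

The two scalars share the short name $line but never collide, because each one lives in the symbol table of the package that was current when it was declared.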

When you create a symbol, Perl creates a symbol table entry for that symbol in the current package's symbol table (by default main::). Each symbol table entry is called a typeglob. Each typeglob can hold information on a scalar, an array, a hash, a subroutine (code), a filehandle, a directory handle and a format, all of which share the same name. So you see now that there are two indirections for a global variable: the symbol (the thing's name) points to its typeglob, and the typeglob's slot for the thing's type (scalar, array, etc.) points to the data. If we had a scalar and an array with the same name, their name would point to the same typeglob, but for each type of data the typeglob points somewhere different, so the scalar's data and the array's data are completely separate and independent; they just happen to have the same name.

Most of the time, only one part of a typeglob is used (yes, it's a bit wasteful). You will by now know that you distinguish between them by using what the authors of the Camel book call a funny character. So if we have a scalar called `line' we would refer to it in code as $line, and if we had an array of the same name, that would be written @line. Both would point to the same typeglob (which would be called *line), but because of the funny character (also known as decoration) perl won't confuse the two. Of course we might confuse ourselves, so some programmers don't ever use the same name for more than one type of variable.
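The sketch below is one way to look at the separate slots of a single typeglob, using the *name{THING} syntax. The variable names follow the `line' example above; the values are invented for the demonstration.

```perl
use strict;
use warnings;

our $line = "a scalar";
our @line = ("an", "array");

# Both names share one symbol table entry, the typeglob *main::line.
print "typeglob: ", *line, "\n";             # prints: typeglob: *main::line
print "entry exists\n" if exists $main::{line};

# Each slot of the typeglob points to its own, independent data.
my $scalar_ref = *line{SCALAR};
my $array_ref  = *line{ARRAY};
print "$$scalar_ref\n";                      # prints: a scalar
print "@$array_ref\n";                       # prints: an array

# Emptying the array leaves the scalar untouched.
@line = ();
print "$line\n";                             # prints: a scalar
```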

Every global symbol is in some package's symbol table. To refer to a global symbol we could write the fully qualified name, e.g. $main::line. If we are in the same package as the symbol we can omit the package name, e.g. $line (unless you use the strict pragma, and then you will have to predeclare the variable using the vars pragma). We can also omit the package name if we have imported the symbol into our current package's namespace. If we want to refer to a symbol that is in another package and which we haven't imported, we must use the fully qualified name, e.g. $otherpkg::box.

Most of the time you do not need to use the fully qualified symbol name because most of the time you will refer to package variables from within the package. This is very like C++ class variables. You can work entirely within package main:: and never even know you are using a package, nor that the symbols have package names. In a way, this is a pity because you may fail to learn about packages and they are extremely useful.

The exception is when you import the variable from another package. This creates an alias for the variable in the current package, so that you can access it without using the fully qualified name.
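To illustrate what such an alias amounts to, here is a hand-rolled sketch that assigns a reference to a typeglob. This is roughly the effect an import has, written out by hand rather than done through a module's import mechanism; the package name otherpkg and the variable $box are taken from the example name used earlier, and the values are invented.

```perl
use strict;
use warnings;

package otherpkg;
our $box = "cardboard";

package main;

# Point the scalar slot of *main::box at the scalar owned by otherpkg.
*main::box = \$otherpkg::box;
our $box;                      # lets strict accept the short name

print "$box\n";                # prints: cardboard

# The alias and the original are the same variable, not copies.
$box = "wooden";
print "$otherpkg::box\n";      # prints: wooden
```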

Whilst global variables are useful for sharing data and are necessary in some contexts, it is usually wisest to minimise their use and use lexical variables, discussed next, instead.

Note that when you create a variable, the low-level business of allocating memory to store the information is handled automatically by Perl. The interpreter keeps track of the chunks of memory to which the pointers are pointing and takes care of undefining variables. When all references to a variable have ceased to exist, the perl garbage collector is free to take back the memory used, ready for recycling. However, perl almost never returns memory it has already used to the operating system during the lifetime of the process.
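One way to watch this bookkeeping is to give an object a DESTROY method, which perl calls when the last reference to the object disappears (perl tracks this with reference counts). The Tracker class below is hypothetical and exists only for this sketch.

```perl
use strict;
use warnings;

package Tracker;               # hypothetical class, for illustration only

our $destroyed = 0;
sub new     { my $class = shift; return bless {}, $class }
sub DESTROY { $destroyed++ }   # called when the last reference goes away

package main;

my $first  = Tracker->new;
my $second = $first;           # two references to the same object

undef $first;
print "after first undef:  $Tracker::destroyed\n";   # prints 0
undef $second;
print "after second undef: $Tracker::destroyed\n";   # prints 1
```

The object is reclaimed only after both references are gone. The freed memory is then available for reuse inside the process.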

Lexical Variables and Symbols

The symbols for lexical variables (i.e. those declared using the keyword my) are the only symbols which do not live in a symbol table. Because of this, they are not available from outside the block in which they are declared. There is no typeglob associated with a lexical variable and a lexical variable can refer only to a scalar, an array or a hash.

If you need access to the data from outside the package then you can return it from a subroutine, or you can create a global variable (i.e. one which has a package prefix) which points or refers to it and return that. The pointer or reference must be global so that you can refer to it by a fully qualified name. But, just as in C, try to avoid having global variables. Using OO methods generally solves this problem by providing methods to get and set the desired value within the object, which can be lexically scoped inside the package and passed by reference.

The phrase ``lexical variable'' is a bit of a misnomer; we are really talking about ``lexical symbols''. The data can be referenced by a global symbol too, and in such cases when the lexical symbol goes out of scope the data will still be accessible through the global symbol. This is perfectly legitimate and cannot be compared to the terrible mistake of taking a pointer to an automatic C variable and returning it from a function--dereferencing that pointer is undefined behaviour and will often end in a segmentation fault. (Note for C/C++ programmers: having a function return a pointer to an auto variable is a disaster in C or C++; the perl equivalent, returning a reference to a lexical variable created in a function, is normal and useful.)
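A short sketch of both cases described above: a subroutine that returns a reference to its own lexical, and a global symbol that keeps a lexical's data alive after the block ends. The names make_list, @items, $shared and $private are invented for the illustration.

```perl
use strict;
use warnings;

sub make_list {
    my @items = (1, 2, 3);    # the name @items vanishes when the sub returns
    return \@items;           # the data survives while a reference exists
}

my $list = make_list();
push @$list, 4;
print "@$list\n";             # prints: 1 2 3 4

my $fresh = make_list();      # every call creates new, independent data
print "@$fresh\n";            # prints: 1 2 3

our $shared;                  # global symbol
{
    my $private = "still here";
    $shared = \$private;      # the global now refers to the lexical's data
}
print "$$shared\n";           # prints: still here
```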

  • my() vs. use vars:

    With use vars(), you are making an entry in the symbol table, and you are telling the compiler that you are going to be referencing that entry without an explicit package name.

    With my(), NO ENTRY IS PUT IN THE SYMBOL TABLE. The compiler figures out at compile time which my() variables (i.e. lexical variables) are the same as each other, and once you hit execute time you cannot go looking those variables up in the symbol table.

  • my() vs. local():

    local() creates a temporally limited package-based scalar, array, hash, or glob -- when the scope of definition is exited at runtime, the previous value (if any) is restored. References to such a variable are *also* global... only the value changes. (Aside: that is what causes variable suicide. :)

    my() creates a lexically-limited non-package-based scalar, array, or hash -- when the scope of definition is exited at compile-time, the variable ceases to be accessible. Any references to such a variable at runtime turn into unique anonymous variables on each scope exit.
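The differences in the list above can be sketched in one small script. All variable and subroutine names here are invented for the demonstration, and our is used in place of use vars to declare the package variable.

```perl
use strict;
use warnings;

our $global  = "outer";     # package variable: has an entry in %main::
my  $lexical = "outer";     # lexical variable: no symbol table entry

sub show_global  { return $global }
sub show_lexical { return $lexical }

sub with_local {
    local $global = "inner";    # temporary value, restored when the sub exits
    return show_global();       # the callee sees the temporary value
}

sub with_my {
    my $lexical = "inner";      # a brand new variable, visible only here
    return show_lexical();      # the callee still sees the file-level one
}

print with_local(), "\n";       # prints: inner
print with_my(), "\n";          # prints: outer
print "$global\n";              # prints: outer (old value restored)

my $global_in_table  = exists $main::{global}  ? "yes" : "no";
my $lexical_in_table = exists $main::{lexical} ? "yes" : "no";
print "global in symbol table:  $global_in_table\n";    # prints: yes
print "lexical in symbol table: $lexical_in_table\n";   # prints: no
```

local() changes what every piece of code sees until the enclosing scope exits, whereas my() affects only the code textually inside the block.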
