Scalars and Variables

In this fourth part of a five-part series on scalars in Perl, you will learn how to compare the values of strings. We’ll also wrap up our discussion of operators and move on to variables. This article is excerpted from chapter two of the book Beginning Perl, written by James Lee (Apress; ISBN: 159059391X).

String Comparison

As well as comparing the value of numbers, we can compare the value of strings. This does not mean we convert a string to a number, although if you say something like "12" > "30" , Perl will convert to numbers for you. This means we can compare the strings alphabetically: “Bravo” comes after “Alpha” but before “Charlie”, for instance.

In fact, it’s more than alphabetical order; the computer is using either ASCII or Unicode internally to represent the string, and so has converted it to a series of numbers in the relevant sequence. This means, for example, “Fowl” comes before “fish”, because a capital “F” has a smaller ASCII value (70) than a lowercase “f” (102). [1]

[1] This is not strictly true, though. Locales can define nonnumeric sorting orders for ASCII or Unicode characters that Perl will respect.

We can find a character’s value by using the ord() function, which tells us where in the (ASCII) order it comes. Let’s see which comes first, a # or a * ?

#!/usr/bin/perl -w
# ascii.pl

print "A # has ASCII value ", ord("#"), "\n";
print "A * has ASCII value ", ord("*"), "\n";

This should say

$ perl ascii.pl
A # has ASCII value 35
A * has ASCII value 42
$

If we’re only concerned with a character at a time, we can compare the return values of ord() using the < and > operators. However, when comparing entire strings, it may get a bit tedious. If the first character of each string is the same, you would move on to the next character in each string, and then the next, and so on.

Instead, there are string comparison operators that do this for us. Whereas the comparison operators for numbers are mathematical symbols, the operators for strings are abbreviations. To test whether one string is less than another, use lt. “Greater than” becomes gt, “equal to” becomes eq, and “not equal to” becomes ne. There are also ge and le for “greater than or equal to” and “less than or equal to.” The three-way comparison becomes cmp.

Here are a few examples of these:

#!/usr/bin/perl -w
# strcomp1.pl

print "Which came first, the chicken or the egg? ";
print "chicken" cmp "egg", "\n";
print "Are dogs greater than cats? ";
print "dog" gt "cat", "\n";
print "Is ^ less than + ? ";
print "^" lt "+", "\n";

And the results:

$ perl strcomp1.pl
Which came first, the chicken or the egg? -1
Are dogs greater than cats? 1
Is ^ less than + ?
$

The last line prints nothing as a result of "^" lt "+" since this operation returns the empty string indicating false.
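To make the character-by-character idea concrete, here is a minimal sketch of the comparison that cmp saves us from writing by hand. The helper name compare_by_ord and the file name are invented for this illustration, and the code uses my, sub, and for, which this series has not introduced yet. It also assumes no locale is in effect, so that cmp compares plain character values.

```perl
#!/usr/bin/perl -w
# ordcmp.pl (hypothetical file name)
use strict;

# Hypothetical helper: compare two strings one character at a time
# using ord(), returning -1, 0, or 1 in the same way cmp does.
sub compare_by_ord {
    my ($left, $right) = @_;
    my $shorter = length($left) < length($right) ? length($left) : length($right);
    for my $i (0 .. $shorter - 1) {
        my $l = ord(substr($left,  $i, 1));
        my $r = ord(substr($right, $i, 1));
        return -1 if $l < $r;
        return  1 if $l > $r;
    }
    # Every shared character matched, so the shorter string comes first.
    return length($left) <=> length($right);
}

print compare_by_ord("Fowl", "fish"), "\n";       # -1, since "F" is 70 and "f" is 102
print compare_by_ord("dog", "cat"), "\n";         # 1
print compare_by_ord("Alpha", "Alphabet"), "\n";  # -1, the shorter string comes first
```

For these inputs the helper agrees with cmp, which is the point: the built-in operator does the same walk through the characters for us.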

Be careful when comparing strings with numeric comparison operators (or numeric values with string comparison operators):

#!/usr/bin/perl -w
# strcomp2.pl

print "Test one: ", "four" eq "six", "\n";
print "Test two: ", "four" == "six", "\n";

This code produces

$ perl strcomp2.pl
Argument "six" isn't numeric in numeric eq (==) at strcomp2.pl line 5.
Argument "four" isn't numeric in numeric eq (==) at strcomp2.pl line 5.
Test one:
Test two: 1
$

Is the second line really claiming that "four" is equal to "six"? Yes, when treated as numbers. If you compare them as numbers, they get converted to numbers. "four" converts to 0, "six" converts to 0, and the 0s are equal, so our test returns true and we get a couple of warnings telling us that they were not numbers to begin with. The moral of this story is: compare strings with string comparison operators, and compare numbers with numeric comparison operators. Otherwise, your results may not be what you anticipate.
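The same trap exists even when both strings do look like numbers. The following short sketch (the file name is made up for this illustration) compares the same pairs both ways with the three-way operators <=> and cmp:

```perl
#!/usr/bin/perl -w
# strcomp3.pl (hypothetical file name)

# As numbers, 10 is greater than 9.
print "10 and 9 as numbers:  ", "10" <=> "9", "\n";    # 1
# As strings, "10" comes first because "1" sorts before "9".
print "10 and 9 as strings:  ", "10" cmp "9", "\n";    # -1

# As numbers, 1.0 and 1 are the same value.
print "1.0 and 1 as numbers: ", "1.0" <=> "1", "\n";   # 0
# As strings, "1.0" is longer than "1", so it comes after it.
print "1.0 and 1 as strings: ", "1.0" cmp "1", "\n";   # 1
```

No warnings appear here because every string is a valid number, which makes this kind of mix-up harder to spot than the "four" and "six" case.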

Operators to Be Seen Later

There are a few operators left that we are not going to go into in detail right now. Don’t worry, we will eventually come across the more important ones.

• The conditional operator looks like this: a?b:c . It returns b if a is true, and c if it is false.

• The range operators, .. and ..., make a range of values. For instance, (0..5) is shorthand notation for (0,1,2,3,4,5).

• We’ve seen the comma for separating arguments to functions like print() . In fact, the comma is an operator that builds a list, and print() works on a list of arguments. The operator => works like a comma with certain additional properties.

• The =~ and !~ operators are used to “apply” a regular expression to a string. More on these operators in Chapter 7.

• As well as providing an escape sequence and backwhacking special characters, \ is used to take a reference to a variable, to examine the variable itself rather than its contents. We will discuss this operator in Chapter 11.

• The >> and << operators “shift” a binary number right and left a given number of bits.

• -> is an operator used when working with references, covered in Chapter 11.
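As a brief preview, three of the operators in this list can be tried out straight away. This is only a sketch: the file name and variable names are invented, my and join() have not been covered in this series yet, and join() is used here purely to print the range with commas between the values.

```perl
#!/usr/bin/perl -w
# laterops.pl (hypothetical file name)
use strict;

my $n = 7;

# Conditional operator: gives "even" if the test is true, "odd" otherwise.
my $parity = $n % 2 == 0 ? "even" : "odd";
print $n, " is ", $parity, "\n";            # 7 is odd

# Range operator: (0..5) expands to the list (0,1,2,3,4,5).
my $range = join(",", (0..5));
print "Range: ", $range, "\n";              # Range: 0,1,2,3,4,5

# Shift operators: move the bits of a number left or right.
my $left  = 1 << 3;                         # binary 1 becomes 1000, which is 8
my $right = 40 >> 2;                        # binary 101000 becomes 1010, which is 10
print "Shifted: ", $left, " and ", $right, "\n";
```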

Operator Precedence

Table 2-1 provides the precedence for all the operators we’ve seen so far, listed in descending order of precedence.

Table 2-1. Operator Precedence

Operator                  Description
List operators            Functions that take list arguments
->                        Infix dereference operator
**                        Exponentiation
! ~ \                     Logical not, bitwise not, reference of
=~ !~                     Regex match, negated regex match
* / % x                   Multiplication, division, modulus, replication
+ - .                     Addition, subtraction, concatenation
<< >>                     Left shift, right shift
< > <= >= lt gt le ge     Comparison operators
== != <=> eq ne cmp       More comparison operators
&                         Bitwise and
| ^                       Bitwise or, bitwise xor
&&                        Logical and
||                        Logical or
.. ...                    Range
?:                        Conditional
, =>                      List separator
not                       Logical not
and                       Logical and
or xor                    Logical or, xor

Remember that if you need to get things done in a different order, you will need to use parentheses. Also remember that you can use parentheses even when they’re not strictly necessary, and you should certainly do so to help keep things readable. While Perl knows full well what order to do 7+3*2/6-3+5/2&3 in, you may find it easier on yourself to spell it out, because next week you may not remember everything you have just written.
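For illustration, here is that expression next to a fully parenthesized version that follows the precedence table. The file and variable names are invented for this sketch, and my has not been introduced in this series yet.

```perl
#!/usr/bin/perl -w
# precedence.pl (hypothetical file name)
use strict;

# Left to Perl's precedence rules: * and / first, then + and -, then &.
my $terse = 7 + 3 * 2 / 6 - 3 + 5 / 2 & 3;

# The same calculation with every grouping spelled out.
my $spelled = (((7 + ((3 * 2) / 6)) - 3) + (5 / 2)) & 3;

print "Terse:   ", $terse, "\n";      # 3
print "Spelled: ", $spelled, "\n";    # 3
```

The arithmetic comes to 7.5, and the bitwise and works on the integer part, 7, so both versions give 3. The second one simply makes the order visible to the next reader.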

Variables

Now it is time to talk about variables. As explained earlier, a variable is storage for your scalars. Once you’ve calculated 42*7, it’s gone. If you want to know what it was, you must do the calculation again. Instead of being able to use the result as a halfway point in more complicated calculations, you’ve got to spell it all out in full. That’s no fun. What we need to be able to do, and what variables allow us to do, is store a scalar away and refer to it again later.

A scalar variable name starts with a dollar sign, for example: $name. Scalar variables can hold either numbers or strings, and are only limited by the size of your computer’s memory. To put data into our scalar, we assign the data to it with the assignment operator =. (Incidentally, this is why numeric comparison is ==, because = was taken to mean the assignment operator.)

What we’re going to do here is tell Perl that our scalar contains the string "fred" . Now we can get at that data by simply using the variable’s name:

#!/usr/bin/perl -w
# vars1.pl

$name = "fred";
print "My name is ", $name, "\n";

Lo and behold, our computer announces to us that

$ perl vars1.pl
My name is fred
$
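The same idea applies to numbers. As a small sketch of the 42*7 case mentioned earlier (the file name and variable names are invented here, and my is a declaration this series has not covered yet), the result can be calculated once and then reused as a halfway point:

```perl
#!/usr/bin/perl -w
# vars0.pl (hypothetical file name)
use strict;

# Do the calculation once and keep the result.
my $product = 42 * 7;
print "42 times 7 is ", $product, "\n";        # 42 times 7 is 294

# Reuse the stored value instead of repeating the calculation.
my $halved = $product / 2;
print "Half of that is ", $halved, "\n";       # Half of that is 147
```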

Now we have somewhere to store our data, and some way to get it back again. The next logical step is to be able to change it.

Modifying a Variable

Modifying the contents of a variable is easy: just assign something different to it. We can say

#!/usr/bin/perl -w
# vars2.pl

$name = "fred";
print "My name is ",               $name, "\n";
print "It's still ",               $name, "\n";
$name = "bill";
print "Well, actually, now it's ", $name, "\n";
$name = "fred";
print "No, really, now it's ",     $name, "\n";

And watch our computer have an identity crisis:

$ perl vars2.pl
My name is fred
It's still fred
Well, actually, now it's bill
No, really, now it's fred
$

We can also do a calculation in several stages:

#!/usr/bin/perl -w
# vars3.pl

$a = 6 * 9;
print "Six nines are ", $a, "\n";
$b = $a + 3;
print "Plus three is ", $b, "\n";
$c = $b / 3;
print "All over three is ", $c, "\n";
$d = $c + 1;
print "Add one is ", $d, "\n";
print "\nThose stages again: ", $a, " ", $b, " ", $c, " ", $d, "\n";

This code prints

$ perl vars3.pl
Six nines are 54
Plus three is 57
All over three is 19
Add one is 20

Those stages again: 54 57 19 20
$

While this works perfectly fine, it’s often easier to stick with one variable and modify its value, if you don’t need to know the stages you went through at the end:

#!/usr/bin/perl -w
# vars4.pl

$a = 6 * 9;
print "Six nines are ", $a, "\n";
$a = $a + 3;
print "Plus three is ", $a, "\n";
$a = $a / 3;
print "All over three is ", $a, "\n";
$a = $a + 1;
print "Add one is ", $a, "\n";

The assignment operator, =, has very low precedence. This means that Perl will do the calculations on the right-hand side of it, including fetching the current value, before assigning the new value. To illustrate this, take a look at the sixth line of our example. Perl takes the current value of $a, adds three to it, and then stores it back in $a.

Operating and Assigning at Once

Operations like fetching a value, modifying it, and storing it back are very common, so there’s a special syntax for them. Generally

$a = $a <some operator> $b;

can be written as

$a <some operator>= $b;

For instance, we could rewrite the preceding example as follows:

#!/usr/bin/perl -w
# vars5.pl

$a = 6 * 9;
print "Six nines are ", $a, "\n";
$a += 3;
print "Plus three is ", $a, "\n";
$a /= 3;
print "All over three is ", $a, "\n";
$a += 1;
print "Add one is ", $a, "\n";

This works for **= , *= , += , -= , /= , .= , %= , &= , |= , ^= , <<= , >>= , &&= , and ||= . These all have the same precedence as the assignment operator = .
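A few of the less familiar forms from that list might look like this in practice. This is a sketch only; the file name and variable names are invented for the illustration, and my is not introduced until later.

```perl
#!/usr/bin/perl -w
# vars6.pl (hypothetical file name)
use strict;

my $name = "fred";
$name .= "dy";                 # same as $name = $name . "dy"
print $name, "\n";             # freddy

my $n = 2;
$n **= 5;                      # same as $n = $n ** 5, giving 32
$n %= 5;                       # 32 % 5 leaves 2
$n <<= 3;                      # 2 shifted left three bits is 16
print $n, "\n";                # 16

my $nickname = "";
$nickname ||= "anonymous";     # assigns only because "" is false
print $nickname, "\n";         # anonymous
```

Each line reads the current value, applies the operator, and stores the result back, exactly as the long form would.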

Autoincrement and Autodecrement

There are also two more operators, ++ and --. They add one to and subtract one from the variable, respectively, but their precedence is a little strange. When they precede a variable, they act before everything else. If they come afterwards, they act after everything else. Let’s examine these in the following example:

#!/usr/bin/perl -w
# auto1.pl

$a = 4;
$b = 10;
print "Our variables are ", $a, " and ", $b, "\n";
$b = $a++;
print "After incrementing, we have ", $a, " and ", $b, "\n";
$b = ++$a * 2;
print "Now, we have ", $a, " and ", $b, "\n";
$a = --$b + 4;
print "Finally, we have ", $a, " and ", $b, "\n";

You should see the following output:

$ perl auto1.pl
Our variables are 4 and 10
After incrementing, we have 5 and 4
Now, we have 6 and 12
Finally, we have 15 and 11
$

Let’s work this through a piece at a time. First we set up our variables, giving the values 4 and 10 to $a and $b respectively:

$a = 4;
$b = 10;
print "Our variables are ", $a, " and ", $b, "\n";

In the following line, the assignment happens before the increment; this is known as a post-increment. So $b is set to $a’s current value, 4, and then $a is autoincremented, becoming 5.

$b = $a++;
print "After incrementing, we have ", $a, " and ", $b, "\n";

In the next line, however, the incrementing takes place first; this is known as a pre-increment. $a is now 6, and $b is set to twice that, 12.

$b = ++$a * 2;
print "Now, we have ", $a, " and ", $b, "\n";

Finally, $b is decremented first (a pre-decrement), and becomes 11. $a is set to $b plus 4, which is 15.

$a = --$b + 4;
print "Finally, we have ", $a, " and ", $b, "\n";

The autoincrement operator actually does something interesting if the variable contains a string of only alphabetic characters, followed optionally by numeric characters. Instead of converting to a number, Perl “advances” the variable along the ranges a–z, A–Z, and 0–9. This is more easily understood from a few examples:

#!/usr/bin/perl -w
# auto2.pl

$a = "A9"; print ++$a, "\n";
$a = "bz"; print ++$a, "\n";
$a = "Zz"; print ++$a, "\n";
$a = "z9"; print ++$a, "\n";
$a = "9z"; print ++$a, "\n";

should produce

$ perl auto2.pl
B0
ca
AAa
aa0
10
$

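This shows that a 9 turns into a 0 and increments the next character to the left. A z turns into an a and increments the next character to the left, and if there are no more characters to the left, either an a or an A is created depending on the case of the current leftmost character. The last example, "9z", does not fit the pattern of letters followed by digits, so Perl treats it as the number 9 and simply adds one.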
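A few more strings show the same carrying behavior. This is a sketch with invented file and variable names; it uses my, which has not been introduced yet, and the postfix form of ++, which advances strings in the same way as the prefix form.

```perl
#!/usr/bin/perl -w
# auto4.pl (hypothetical file name)
use strict;

my $label = "Az";
$label++;                      # z wraps to a and carries into the A
print $label, "\n";            # Ba

my $column = "Z";
$column++;                     # nothing to the left, so a new A is created
print $column, "\n";           # AA

my $id = "a9";
$id++;                         # 9 wraps to 0 and carries into the a
print $id, "\n";               # b0

my $last = "zz";
$last++;                       # both letters wrap and a new a is created
print $last, "\n";             # aaa
```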

Please check back next week for the conclusion to this article series.
