# Scalars

If you’re writing a program that involves computation, you’re going to come face to face with some basic programming concepts. This article, the first of a five-part series, will explain scalars, one of the basic ideas of programming in Perl. It is excerpted from chapter two of the book Beginning Perl, written by James Lee (Apress; ISBN: 159059391X).

The essence of programming is computation—we want the computer to do some work with the input (the data we give it). Very rarely do we write programs that tell us something we already know. Even more rarely do we write programs that do nothing interesting with our data at all. So, if we’re going to write programs that do more than say hello to us, we’re going to need to know how to perform computations, or operations on our data.

In this chapter, we will discuss several important basic ideas of programming in Perl:

1. Scalars: Single values, either numbers or strings.

2. Variables: A place to store a value.

3. Operators: Symbols such as + and - that operate upon data.

4. Reading data from the user: We will read from standard input, which is usually the keyboard.

## Types of Data

A lot of programming jargon is about familiar words in an unfamiliar context. We’ve already seen a string, which was a series of characters. We could also describe that string as a scalar literal constant. What does that mean?

By calling a value a scalar, we’re describing the type of data it contains. If you remember your math (and even if you don’t), a scalar is a plain, simple, one-dimensional value. In math, the word is used to distinguish it from a vector, which is expressed as several numbers. Velocity, for example, has a pair of coordinates (speed and direction), and so must be a vector. In Perl, a scalar is the fundamental, basic unit of data of which there are two kinds—numbers and strings.

A literal is a value that never changes. The value 5 is a scalar literal: it is literally 5, and it can never be 4. Perl has three types of scalar literals: integers (such as 5), floating point numbers (like 3.14159), and strings (for example "hello, world"). To put it another way, a literal is a constant.

A variable, by contrast, is a piece of memory that can hold a scalar value. Variables are so named because the value stored within them can vary. For instance, $number can be assigned 5, and then later can be changed to the value 6. We will talk more about variables later in this chapter.

## Numbers

There are two types of numbers that interest us as Perl programmers: integers and floating point numbers. We’ll come to the latter in a minute, but let’s work a bit with integers right now. Integers are whole numbers with nothing after the decimal point, like 42, -1, or 10. The following program prints a couple of integer literals in Perl.

#!/usr/bin/perl -w
# number1.pl

print 25, -4;

$ perl number1.pl
25-4$

Well, that’s what you see, but it’s not exactly what we want. Fortunately, this is pretty easy to fix. First, we didn’t tell Perl to separate the numbers with a space, and second, we didn’t tell it to put a newline on the end. Let’s change the program so it does that:

#!/usr/bin/perl -w
# number2.pl

print 25, " ", -4, "\n";

This will do what we were thinking of:

$ perl number2.pl
25 -4
$

For the purpose of human readability, we often write large integers such as 10000000 by splitting up the number with commas: 10,000,000. This is sometimes known as chunking. While we might write 10 million with a comma if we wrote a check for that amount, don’t use the comma to chunk in a Perl program. Instead, use the underscore: 10_000_000. Change your program to look like the following:

#!/usr/bin/perl -w
# number3.pl

print 25_000_000, " ", -4, "\n";

Notice that those underscores don’t appear in the output:

$ perl number3.pl
25000000 -4
$

As well as integers, there’s another type of number: floating point numbers. These cover numbers with a fractional part, like 0.5, -0.01333, and 1.1.

Note that floating point numbers are accurate only to a certain number of digits. For instance, the number 15.39 may in fact be stored in memory as 15.3899999999999. This is accurate enough for most scientists, so it will have to be for us programmers as well.
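
The limited precision described above can be seen directly. The following is a minimal sketch, not a listing from the book: the file name float_demo.pl and the variables $sum and $close are made up for illustration, and it assumes a Perl built with ordinary double-precision floating point (the usual default). It uses variables and the printf function, which this chapter has not covered yet.

```perl
#!/usr/bin/perl -w
# float_demo.pl (hypothetical file name, not from the book)
use strict;

# Ask for more digits than print shows by default, to reveal
# the value that is really stored for 15.39.
printf "%.20f\n", 15.39;

# Neither 0.1 nor 0.2 can be stored exactly in binary, so on a
# double-precision build their sum is not exactly 0.3.
my $sum = 0.1 + 0.2;
print "0.1 + 0.2 is exactly 0.3? ", ($sum == 0.3 ? "yes" : "no"), "\n";

# A common workaround: treat two numbers as equal when they
# differ by less than a small tolerance.
my $close = abs($sum - 0.3) < 1e-9;
print "close enough to 0.3? ", ($close ? "yes" : "no"), "\n";
```

The tolerance of 1e-9 is an arbitrary choice for this sketch; a suitable value depends on the size of the numbers being compared.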

Here is an example of printing the approximate value of pi:

#!/usr/bin/perl -w
# number4.pl

print "pi is approximately: ", 3.14159, "\n";

Executing this program produces the following result:

$ perl number4.pl
pi is approximately: 3.14159
$

## Binary, Hexadecimal, and Octal Numbers

As we saw in the previous chapter, we can express numbers as binary, hexadecimal, or octal numbers in our programs. Let’s look at a program to demonstrate how we use the various number systems. Type in the following code, and save it as goodnums.pl:

#!/usr/bin/perl -w
# goodnums.pl

print 255,        "\n";
print 0377,       "\n";
print 0b11111111, "\n";
print 0xFF,       "\n";

All of these are representations of the number 255, and accordingly, we get the following output:

$ perl goodnums.pl
255
255
255
255
$

When Perl reads your program, it recognizes numbers written in any of the allowed number systems by their prefix: a leading 0 for octal, 0b for binary, and 0x for hex.
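
To go the other way and display a number in a particular base, Perl's built-in sprintf function accepts the format codes %o, %b, and %X. The sketch below is not from the book: the file name and the variable names are assumptions for illustration, and it uses variables, which are covered later in the chapter.

```perl
#!/usr/bin/perl -w
# bases.pl (hypothetical file name, not from the book)
use strict;

# Whatever base a literal is written in, Perl stores the same number.
print "all four literals are equal\n"
    if 0377 == 255 && 0b11111111 == 255 && 0xFF == 255;

# sprintf formats a number as a string of digits in the given base.
# The result contains the digits only, without the 0, 0b, or 0x prefix.
my $octal  = sprintf("%o", 255);
my $binary = sprintf("%b", 255);
my $hex    = sprintf("%X", 255);

print "octal: $octal, binary: $binary, hex: $hex\n";
```

Running this should print the digits 377, 11111111, and FF, matching the literals used in goodnums.pl with their prefixes removed.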

What happens, you might ask, if you specify a number in the wrong system? Well, let’s try it out. Edit goodnums.pl to give you a new program, badnums.pl, that looks like this:

#!/usr/bin/perl -w
# badnums.pl

print 255,        "\n";
print 0378,       "\n";
print 0b11111112, "\n";
print 0xFG,       "\n";

Since octal digits only run from 0 to 7, binary digits from 0 to 1, and hex digits from 0 to F, none of the last three lines make any sense. Let’s see what Perl makes of it:

$ perl badnums.pl
Bareword found where operator expected at badnums.pl line 7, near "0xFG"
(Missing operator before G?)
Illegal octal digit '8' at badnums.pl line 5, at end of line
Illegal binary digit '2' at badnums.pl line 6, at end of line
syntax error at badnums.pl line 7, near "0xFG"
Execution of badnums.pl aborted due to compilation errors.
$

Now, let’s match those errors up with the relevant lines:

Illegal octal digit '8' at badnums.pl line 5, at end of line

And line 5 is

print 0378,       "\n";

As you can see, Perl thought it was dealing with an octal number, but then along came an 8, which stopped making sense, so Perl quite rightly complained. The same thing happened on the next line:

Illegal binary digit '2' at badnums.pl line 6, at end of line

And line 6 is

print 0b11111112, "\n";

The problem with the next line is even bigger:

Bareword found where operator expected at badnums.pl line 7, near "0xFG"
(Missing operator before G?)
syntax error at badnums.pl line 7, near "0xFG"

The line starting “Bareword” is a warning (since we are using the -w option), and it is followed by a syntax error. A bareword is a series of characters outside of a string that Perl doesn’t recognize. The word could mean a number of things, and Perl is usually quite good about knowing what you mean. In this case, the bareword was G: Perl had understood 0xF, but couldn’t see how the G fitted in. We might have wanted an operator to do something with it, but there was no operator there. In the end, Perl gave us a syntax error, which is the equivalent of it giving up and saying, “How do you expect me to understand this?”

## Strings

The other type of scalar available to us is the string, and we’ve already seen a few examples. In the last chapter, we met the string "Hello, world!\n". A string is a series of characters surrounded by some sort of quotation marks. Strings can contain ASCII (or Unicode) data and escape sequences such as the \n of our example, and Perl imposes no maximum length on a string. Practically speaking, there is a limit imposed by the amount of memory in your computer, but it’s quite hard to hit.

### Single- vs Double-Quoted Strings

The quotation marks you choose for your string are significant. So far we’ve only seen double-quoted strings, like this: "Hello, world!\n". There is another type of string, one which has been single-quoted. Predictably, these are surrounded by single quotes: ''. The important difference is that no processing is done within single-quoted strings, except on \\ and \'. We’ll also see later that variable names inside double-quoted strings are replaced by their contents, whereas single-quoted strings treat them as ordinary text. We call both these types of processing interpolation, and say that single-quoted strings are not interpolated.
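
As a preview of variable interpolation, here is a minimal sketch. It is not taken from the book: the file name and the variables $name, $double, and $single are invented for illustration, and variables themselves are explained later in the chapter.

```perl
#!/usr/bin/perl -w
# interp.pl (hypothetical file name, not from the book)
use strict;

my $name = "world";

# Double quotes: $name is replaced by its contents and \n
# becomes a newline.
my $double = "Hello, $name!\n";

# Single quotes: the text is kept exactly as typed, so the
# string contains a dollar sign, a backslash, and an n.
my $single = 'Hello, $name!\n';

print $double;
print $single, "\n";
```

The first print shows the greeting with the variable filled in, while the second shows the characters $name and \n unchanged.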

Consider the following program, bearing in mind that \t is the escape sequence that represents a tab:

#!/usr/bin/perl -w
# quotes.pl

print '\tThis is a single quoted string.\n';
print "\tThis is a double quoted string.\n";

The double-quoted string will have its escape sequences processed, and the single-quoted string will not. The output is

$ perl quotes.pl
\tThis is a single quoted string.\n    This is a double quoted string.
$

What do we do if we want to have a backslash in a string? This is a common concern for Windows users, as a Windows path looks something like this: C:\WINNT\Profiles . . . In a double-quoted string, a backslash will start an escape sequence, which is not what we want it to do.

There is, of course, more than one way to do it. We can either use a single-quoted string, as shown previously, or we can escape the backslash. One principle that we’ll see often in Perl, and especially when we get to regular expressions, is that we can use a backslash to turn off any special effect a character may have. This operation is called escaping, or more commonly, backwhacking.

In this case, we want to turn off the special effect a backslash has, and so we escape it:

#!/usr/bin/perl -w
# quotes2.pl

print "C:\\WINNT\\Profiles\\\n";
print 'C:\WINNT\Profiles\ ', "\n";

This prints the following:

$ perl quotes2.pl
C:\WINNT\Profiles\
C:\WINNT\Profiles\
$

Aha! Some of you may have got this message instead:

Can't find string terminator "'" anywhere before EOF at quotes2.pl line 5.

The reason for this is that you have probably left out the space character in line 5 before the second single quote. Remember that \' tells Perl to escape the single quote, and so it merrily heads off to look for the next quote, which of course is not there. Try this program to see how Perl treats these special cases:

#!/usr/bin/perl -w
# aside1.pl

print 'ex\\ er\\' , ' ci\' se\'' , "\n";

The output you get this time is

$ perl aside1.pl
ex\ er\ ci' se'
$

Can you see how Perl did this? Well, we simply escaped the backslashes and single quotes. It will help you to sort out what is happening if you look at each element individually. Remember, there are three arguments in this example. Don’t let all the quotes confuse you.

Actually, there’s an altogether sneakier way of doing it. Internally, Windows allows you to separate paths in the Unix style with a forward slash, instead of a backslash. If you’re referring to directories in Perl on Windows, you may find it easier to say C:/WINNT/Profiles/ instead. This allows you to get the variable interpolation of double-quoted strings without the “Leaning Toothpick Syndrome” of multiple backslashes.

So much for backslashes; what about quotation marks? The trick is making sure Perl knows where the end of the string is. Naturally, there’s no problem with putting single quotes inside a double-quoted string, or vice versa:

#!/usr/bin/perl -w
# quotes3.pl

print "It's as easy as that.\n";
print '"Stop," he cried.', "\n";

This will produce the quotation marks in the right places:

$ perl quotes3.pl
It's as easy as that.
"Stop," he cried.
$

The trick comes when we want to have double quotes inside a double-quoted string or single quotes inside a single-quoted string. As you might have guessed, though, the solution is to escape the quotes on the inside. Suppose we want to print out the following quote, including both sets of quotation marks:

'"Hi," said Jack. "Have you read Slashdot today?"'

Here’s a way of doing it with a double-quoted string:

#!/usr/bin/perl -w
# quotes4.pl

print "'\"Hi,\" said Jack. \"Have you read Slashdot today?\"'\n";

Now see if you can modify this to make it a single-quoted string. Don’t forget that \n needs to go in separate double quotes to make it interpolate.
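
If you get stuck, here is one possible answer, offered as a sketch and not as the book's own solution. The file name and the variable $quote are made up for illustration. Inside the single-quoted string only the two single quotes need escaping; the double quotes can stay as they are.

```perl
#!/usr/bin/perl -w
# quotes4b.pl (hypothetical file name, not from the book)
use strict;

# The outer single quotes of the sentence are escaped with \'.
# The double quotes need no escaping in a single-quoted string.
my $quote = '\'"Hi," said Jack. "Have you read Slashdot today?"\'';

# The newline goes in its own double-quoted string so that \n
# is processed.
print $quote, "\n";
```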

### q// and qq//

It would be nice if you could select a completely different set of quotes so that there would be no ambiguity and no need to escape any quotes inside the text. The first operators we’re going to meet are the quote-like operators that do this for us. They’re written as q// and qq//, the first acting like a single-quoted string, and the second like a double-quoted string. Now instead of the preceding, we can write

#!/usr/bin/perl -w
# quotes5.pl

print qq/'"Hi," said Jack. "Have you read Slashdot today?"'\n/;
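
To see the difference between the two operators side by side, here is a short sketch. It is not from the book: the file name and the variables $single and $double are assumptions for illustration, and the sentence is shortened to keep the lines readable.

```perl
#!/usr/bin/perl -w
# qdemo.pl (hypothetical file name, not from the book)
use strict;

# q// behaves like single quotes: \n stays as a backslash and an n.
my $single = q/'"Hi," said Jack.'\n/;

# qq// behaves like double quotes: \n becomes a newline.
my $double = qq/'"Hi," said Jack.'\n/;

print $single, "\n";
print $double;
```

Neither string needs a backslash before its quotation marks, because the slash is now the delimiter.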