Array Manipulation in Perl


Array of Light

If you’re like most Perl developers, you probably use arrays extensively, as a convenient way to store related data values together. However, it’s also quite likely that your knowledge of array manipulation techniques is limited to counting the elements of an array or iterating through key-value pairs. Although this might seem sufficient for daily use, such limited knowledge can actually hamper your efficiency, by forcing you to write lines of code for tasks that a built-in array function could handle more effectively – all because you didn’t know better!

Well, it’s time to bring you into the light.

Over the course of this tutorial, I’ll be examining Perl’s arrays in detail, explaining what they are, how they work, and how you can use them to get things done faster, better and cheaper. In addition to providing a gentle introduction to Perl arrays and hashes in general, this article will also offer you a broad overview of Perl’s array manipulation functions, providing you with a handy reference that should help you write more efficient code.

Back to Basics

I’ll begin right at the top, with the answer to a basic question: what’s an array, anyhow?

In most programming languages, an array is a data structure that lets you store multiple values in a single variable, which makes it a very useful way of storing and representing related information. In Perl, array variables look like any other variable, with the exception that array variable names are always preceded by an @ symbol. Here’s an example:

[code]
#!/usr/bin/perl
# define array

@friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");
[/code]

Thus the array variable @friends contains six elements, the names of the “Friends” crew.

The various elements of the array are accessed via an index number, with the first element starting at zero (this is sometimes referred to as “zero-based indexing”). So, to extract the first element of the array above, you’d use the notation $friends[0], while the notation $friends[5] would give you the sixth element of the array, “Ross”. Here’s a simple code snippet explaining this:

[code]
#!/usr/bin/perl
# define array

@friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");

 

# prints "Phoebe"

print $friends[2];
[/code]

In case you don’t want to define the array all at once, you can define it incrementally, by assigning values to the array one at a time using the index notation above. For example, this line of code

[code]
#!/usr/bin/perl

# define array

@friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");
[/code]

is equivalent to these:

[code]
#!/usr/bin/perl

# define array

$friends[0] = "Rachel";

$friends[1] = "Monica";

$friends[2] = "Phoebe";

$friends[3] = "Chandler";

$friends[4] = "Joey";

$friends[5] = "Ross";
[/code]

Notice that you don’t have to do anything special to bring an array variable into existence, like declare it with a keyword or instantiate an object to hold its values – Perl identifies the notation being used and creates the data structure appropriately.

Hash Bang

Perl also allows you to replace indices with user-defined “keys”, in order to create a slightly different type of array, called a “hash” or “associative array”. Each key is unique, and corresponds to a single value within the array.

[code]
#!/usr/bin/perl

# define hash

%dinner = ("starter" => "fried squid rings", "main" => "roast chicken", "dessert" => "chocolate cake");
[/code]

In this case, %dinner is an associative array variable containing key-value pairs. The % symbol before the variable name indicates that it is an associative array, while the => symbol is used to indicate the association between a key and its value.

Now, in order to access the value “fried squid rings”, I would use the key “starter” and the notation $dinner{"starter"}, while the value “chocolate cake” would be accessible via $dinner{"dessert"}. Here’s an example:

[code]
#!/usr/bin/perl

# define hash

%dinner = ("starter" => "fried squid rings", "main" => "roast chicken", "dessert" => "chocolate cake");

# prints "fried squid rings"

print $dinner{"starter"};
[/code]

Note the use of curly braces here around the key name, and contrast it with the square brackets used for the index notation in regular arrays.

As with arrays, you can also use this notation to define an associative array incrementally – this line of code

[code]
#!/usr/bin/perl
# define hash

%dinner = ("starter" => "fried squid rings", "main" => "roast chicken", "dessert" => "chocolate cake");
[/code]

is equivalent to these:

[code]
#!/usr/bin/perl

# define hash

$dinner{"starter"} = "fried squid rings";

$dinner{"main"} = "roast chicken";

$dinner{"dessert"} = "chocolate cake";
[/code]

Note that Perl does not keep the key-value pairs of an associative array in the order in which you defined them; it stores them in an internal order that is optimized for retrieval. Therefore, it’s not a good idea to rely on your original element positions when accessing values in a hash; instead, hash values should always be accessed by their respective keys.
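If you do need the pairs in a predictable order, one common approach is to sort the keys first and then look up each value. The snippet below is only a sketch of that idea: the @courses variable is my own name for illustration, and the "use strict", "use warnings" and "my" declarations are additions that the other examples in this article leave out.

[code]
#!/usr/bin/perl
use strict;
use warnings;

# same data as the %dinner example above
my %dinner = ("starter" => "fried squid rings", "main" => "roast chicken", "dessert" => "chocolate cake");

# sort the keys to get a repeatable (alphabetical) order
# @courses is a hypothetical name used only in this sketch
my @courses = sort(keys(%dinner));

# prints the pairs in the order dessert, main, starter
foreach my $course (@courses) {
        print "$course: $dinner{$course}\n";
}
[/code]

The values are still fetched by key, so this stays consistent with the advice above.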

Harnessing Elements

To modify a particular element of an array, use the index/key notation to accomplish your task, like this:

[code]
#!/usr/bin/perl

# define array

@friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");

# now change one of its elements

$friends[3] = "Janice";

# array now looks like this

@friends = ("Rachel", "Monica", "Phoebe", "Janice", "Joey", "Ross");
[/code]

This works with associative arrays too:

[code]
#!/usr/bin/perl

# define hash

%dinner = ("starter" => "fried squid rings", "main" => "roast chicken", "dessert" => "chocolate cake");

 

# change element

$dinner{"dessert"} = "tiramisu";

 

# hash now looks like this

%dinner = ("starter" => "fried squid rings", "main" => "roast chicken", "dessert" => "tiramisu");
[/code]

You can print the contents of an array simply by using the array in a print() function call, as below:

[code]
#!/usr/bin/perl

# define array

@friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");

# print contents

print "@friends ";
[/code]
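The double quotes in that print() call matter. As a rough illustration (assuming Perl's default separator settings, and with "use strict", "use warnings" and "my" added by me), here is how a quoted and an unquoted array differ; $quoted is a name invented for this sketch.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @friends = ("Rachel", "Monica", "Phoebe");

# inside double quotes, the elements are joined with single spaces
my $quoted = "@friends";

# prints "Rachel Monica Phoebe"
print "$quoted\n";

# outside quotes, print() receives a plain list and, with default
# settings, prints the elements back to back: "RachelMonicaPhoebe"
print @friends, "\n";
[/code]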

The fact that array values can be accessed and manipulated using a numeric index makes them particularly well-suited for use in a loop. Consider the following example, which asks the user for a set of values (the user can define the size of the set), stores these values in an array, and then prints them back out.

[code]
#!/usr/bin/perl

# get array size
print ("How many items? ");
$size = <STDIN>;
chomp ($size);

# get array values
for ($x=0; $x<$size; $x++) {
        print ("Enter item ", ($x+1), ": ");
        $val = <STDIN>;
        chomp ($val);
        $data[$x] = $val;
}

# iterate over array
# print array values
print ("Here is what you entered: \n");
for ($x=0; $x<$size; $x++) {
        print ("Element $x: $data[$x]\n");
}
[/code]

Here’s what the output looks like:

How many items? 3
Enter item 1: red
Enter item 2: orange
Enter item 3: green
Here is what you entered:
Element 0: red
Element 1: orange
Element 2: green

The first thing I’ve done here is ask the user for the number of items to be entered – this will tell me the size of the array. Once that information has been obtained, it’s pretty easy to set up a “for” loop to obtain that number of values through user input at the command prompt. Every time the user enters one of these values, a counter variable is incremented; this counter corresponds to the index in the @data array that is being constructed as the loop executes. Once all the values have been entered, another “for” loop iterates over the newly-minted @data array and prints the values stored in it.

The second loop runs as many times as there are elements in the array. In this specific example, I know the size of the array because I had the user enter it at the beginning of the script, but you can also obtain the array size programmatically, as you’ll see on the next page.

Looping the Loop

Want to obtain the size of an array? Sure – just assign the array to a scalar variable and print the value of the scalar:

[code]
#!/usr/bin/perl

# define array

@friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");

 

# obtain size of array

$count = @friends;

 

# print size and contents

print ("I have $count friends: @friends ");
[/code]
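Assigning to a scalar is not the only way to get at the size. The sketch below shows two related standard notations, scalar() and $#array; the variable names, "use strict", "use warnings" and "my" are my own additions for illustration.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @friends = ("Rachel", "Monica", "Phoebe", "Chandler", "Joey", "Ross");

# scalar() forces scalar context explicitly and gives the element count
my $count = scalar(@friends);

# $#friends holds the index of the last element (one less than the count)
my $last = $#friends;

# prints "Count: 6, last index: 5, last element: Ross"
print "Count: $count, last index: $last, last element: $friends[$last]\n";
[/code]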

If you’re using a hash, the keys() and values() functions come in handy to get a list of all the keys and values within the hash.

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

# returns an array of keys

@keys = keys(%matrix);

 

# returns an array of values

@values = values(%matrix);
[/code]

Most of the time, size and key information is used in the context of a loop that iterates over the array. Here’s an example of how this might work in a numerically-indexed array,

[code]
#!/usr/bin/perl

# define array

@ducks = ("Huey", "Dewey", "Louie");

 

# get array size

$size = @ducks;

print "And heeeeeeeeeeeeeere are the ducks: ";

 

# iterate over array

for ($i=0; $i<$size; $i++) {
        print "$ducks[$i] ";
}
[/code]

and here’s an example of iterating over an associative array using the keys() function:

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

print "Say hello to the characters in The Matrix: \n";

# get the keys of the hash with keys()
# then use the keys to get the corresponding value
# in a loop
foreach $k (keys(%matrix)) {
        print $k, ": ", $matrix{$k}, "\n";
}
[/code]
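The “foreach” loop works on regular arrays too, in which case you don’t need the size or an index at all. Here is a small sketch of the ducks example rewritten that way; $line and $duck are names I made up, and "use strict", "use warnings" and "my" are additions not used elsewhere in this article.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @ducks = ("Huey", "Dewey", "Louie");

# foreach visits the elements in order, one at a time
my $line = "";
foreach my $duck (@ducks) {
        $line .= "$duck ";
}

# prints "Huey Dewey Louie " (note the trailing space)
print "$line\n";
[/code]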

A Difficult Assignment

You can assign the elements of an array to scalar variables, as in the following example:

[code]
#!/usr/bin/perl

# define array

@human = ("John", "Doe");

 

# assign array contents to variables

($fname, $lname) = @human;

 

# print variables

print ("My name is $fname $lname");
[/code]

This won’t work the same way with an associative array, though, because a hash has no fixed element order – for that, you need the each() function. Every time each() is called on a hash, it returns a two-element list: the next hash key and the corresponding hash value.

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

# get first pair
($character, $name) = each (%matrix);
print "$character = $name\n";

# get second pair
($character, $name) = each (%matrix);
print "$character = $name\n";

# and so on...
[/code]

The each() function comes in particularly handy when you need to iterate through an associative array, as it is well-suited for use in a “while” loop:

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

# iterate through hash with each
# prints all four pairs; the order is not guaranteed, for example:
# villain = smith, hero = neo, babe = trinity, teacher = morpheus
while (($character, $name) = each (%matrix)) {
        print "$character = $name\n";
}
[/code]

You can assign the array itself to another variable, thereby creating a copy of it, as below:

[code]
#!/usr/bin/perl

# define array

@john = ("John", "Doe");

 

# copy array

@clone = @john;

 

# print copy

print ("I am a clone named @clone");
[/code]
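The copy made this way is independent of the original, at least for simple lists of strings or numbers like this one. A quick sketch to demonstrate (the name “Jane”, "use strict", "use warnings" and "my" are my additions):

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @john  = ("John", "Doe");
my @clone = @john;

# change the copy only
$clone[0] = "Jane";

# prints "Original: John Doe"
print "Original: @john\n";

# prints "Copy: Jane Doe"
print "Copy: @clone\n";
[/code]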

Push and Pull

You can add an element to the end of an existing array with the push() function,

[code]
#!/usr/bin/perl

# define array

@meals = ("lunch", "tea");

 

# add a new element to the end of the array

push (@meals, "dinner");

 

# array now looks like this

@meals = ("lunch", "tea", "dinner");
[/code]

and remove an element from the end with the pop() function.

[code]
#!/usr/bin/perl

# define array

@meals = ("lunch", "tea");

 

# remove an element from the end of the array

pop (@meals);

 

# array now looks like this

@meals = ("lunch");
[/code]

If you need to remove an element off the top of the array instead of the bottom, you need to use the shift() function,

[code]
#!/usr/bin/perl

# define array

@meals = ("lunch", "tea");

 

# remove an element from the beginning of the array

shift (@meals);

 

# array now looks like this

@meals = ("tea");
[/code]

while the unshift() function takes care of adding elements to the beginning of the array.

[code]
#!/usr/bin/perl

# define array

@meals = ("lunch", "tea");

 

# add an element to the beginning of the array

unshift (@meals, "breakfast");

 

# array now looks like this

@meals = ("breakfast", "lunch", "tea");
[/code]
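One detail the examples above don’t show is that pop() and shift() return the element they remove, and that push() and unshift() accept more than one value. The following sketch puts the four functions together; the extra meal names and the $first and $last variables are invented for illustration, and "use strict", "use warnings" and "my" are my additions.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @meals = ("breakfast", "lunch", "tea", "dinner");

# pop() and shift() return the element they remove
my $last  = pop(@meals);
my $first = shift(@meals);

# push() and unshift() can add several values at once
push(@meals, "dinner", "supper");
unshift(@meals, "brunch");

# prints "Removed: breakfast, dinner"
print "Removed: $first, $last\n";

# prints "Now: brunch lunch tea dinner supper"
print "Now: @meals\n";
[/code]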

When dealing with associative arrays, however, it’s not a good idea to use these functions, since key-value pairs in a Perl associative array are not always stored in the order in which they were defined. Therefore, to remove an element from an associative array, you should instead use the delete() function, as in the example below:

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

# delete key

delete ($matrix{"villain"});

 

# hash now looks like this

%matrix = ("hero" => "neo", "teacher" => "morpheus", "babe" => "trinity");
[/code]

You can delete all the values in a hash simply by combining the delete() function with a “foreach” loop, as below:

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

foreach $k (keys(%matrix)) {
        delete ($matrix{$k});
}
[/code]
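If all you want is an empty hash, a shorter alternative to the loop is to assign an empty list to it. This is a sketch of that alternative rather than a replacement for the technique above; $remaining is a hypothetical variable name, and "use strict", "use warnings" and "my" are my additions.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my %matrix = ("hero" => "neo", "villain" => "smith");

# assigning an empty list removes every key-value pair in one step
%matrix = ();

# keys() in scalar context returns the number of keys
my $remaining = keys(%matrix);

# prints "Pairs left: 0"
print "Pairs left: $remaining\n";
[/code]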

The grep() function can tell you whether or not a particular value exists in an array. In the form shown here, it accepts two arguments, a pattern to match and the array in which to search, and it tests every element of the array against the pattern. The matching elements, if any, are returned as a list. The following example illustrates how this works:

[code]
#!/usr/bin/perl

# define array

@tools = ("hammer", "chisel", "screwdriver", "boltcutter", "tape", "punch", "pliers");

 

# search for pattern "er" in array elements

@match = grep (/er/i, @tools);

 

# print matching elements

# returns "hammer screwdriver boltcutter pliers"

print "@match ";
[/code]
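Two variations on this are worth a quick sketch: assigning grep() to a scalar gives you the number of matches, and a block can be used in place of a pattern when you want an exact comparison. The $count and $has_tape names are mine, as are the "use strict", "use warnings" and "my" declarations.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @tools = ("hammer", "chisel", "screwdriver", "boltcutter", "tape", "punch", "pliers");

# in scalar context, grep() returns the number of matching elements
my $count = grep(/er/i, @tools);

# a block can test for an exact value instead of a pattern
my $has_tape = grep { $_ eq "tape" } @tools;

# prints "4 tools contain 'er'"
print "$count tools contain 'er'\n";

# prints "Found tape"
print "Found tape\n" if $has_tape;
[/code]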

The exists() function is useful to check if a particular key exists in a hash.

[code]
#!/usr/bin/perl

# define hash

%matrix = ("hero" => "neo", "villain" => "smith", "teacher" => "morpheus", "babe" => "trinity");

 

# check for key existence

if (exists($matrix{"hero"})) {
        print "Neo is alive!";
}
[/code]
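Keep in mind that exists() checks for the key, not for a usable value. The sketch below contrasts it with defined(), using a made-up “oracle” key whose value is deliberately left undefined; the variable names, "use strict", "use warnings" and "my" are my additions.

[code]
#!/usr/bin/perl
use strict;
use warnings;

# the "oracle" key is hypothetical and has no value assigned
my %matrix = ("hero" => "neo", "oracle" => undef);

# the key is present even though its value is undefined
my $has_oracle  = exists($matrix{"oracle"})  ? "yes" : "no";
my $oracle_set  = defined($matrix{"oracle"}) ? "yes" : "no";

# this key was never added to the hash
my $has_villain = exists($matrix{"villain"}) ? "yes" : "no";

# prints "oracle exists: yes, defined: no; villain exists: no"
print "oracle exists: $has_oracle, defined: $oracle_set; villain exists: $has_villain\n";
[/code]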

Slice and Dice

Perl allows you to extract a subsection of an array – a so-called “array slice” – simply by specifying the index values needed in the slice. Consider the following example:

[code]
#!/usr/bin/perl

# define array

@rainbow = ("red", "green", "blue", "yellow", "orange", "violet", "indigo");

 

# extract slice "blue", "yellow"

@slice = @rainbow[2,3];
[/code]

Or how about this one?

[code]
#!/usr/bin/perl

# define array

@rainbow = ("red", "green", "blue", "yellow", "orange", "violet", "indigo");

 

# extract slice "blue", "violet", "red", "indigo"

@slice = @rainbow[2,5,0,6];
[/code]

You can also use negative index values in a slice, to force Perl to count from the right instead of the left; -1 refers to the last element of the array.

[code]
#!/usr/bin/perl

# define array

@rainbow = ("red", "green", "blue", "yellow", "orange", "violet", "indigo");

 

# extract slice "indigo", "red"

@slice = @rainbow[-1,-7];
[/code]

Perl also comes with a range operator (..) which provides an alternative way of extracting array slices. Here’s an example:

[code]
#!/usr/bin/perl

# define array

@rainbow = ("red", "green", "blue", "yellow", "orange", "violet", "indigo");

 

# extract elements 2 to 5

# slice contains "blue", "yellow", "orange", "violet"

@slice = @rainbow[2..5];
[/code]

You can also use the range operator to create arrays consisting of all the values in a range. For example, if you wanted an array consisting of the numbers between 1 and 20 (both inclusive), you could use the following code to generate it automatically:

[code]
#!/usr/bin/perl

# define array

@n = (1..20);
[/code]

The splice() function allows you to delete a specified segment of an array and splice in one or more values to replace it. Here’s what it looks like:

[code]
splice(array, start, length, replacement-values)
[/code]

where “array” is an array variable, “start” is the index to begin splicing at, “length” is the number of elements to remove beginning at “start”, and “replacement-values” are the values to splice in.

Here’s an example:

[code]
#!/usr/bin/perl

# define array

@rainbow = ("red", "green", "blue");

 

# remove elements 1 and 2

# replace with new values

splice (@rainbow, 1, 2, "yellow", "orange");

 

# array now looks like this

@rainbow = ("red", "yellow", "orange");
[/code]
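The same function can also insert without removing, or remove without replacing, depending on which arguments you supply. Here is a sketch of both cases; the @removed array is a name of my own, and "use strict", "use warnings" and "my" are additions.

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @rainbow = ("red", "green", "blue");

# a length of 0 removes nothing, so the new values
# are simply inserted before index 1
splice(@rainbow, 1, 0, "yellow", "orange");

# prints "red yellow orange green blue"
print "@rainbow\n";

# with no replacement values, splice() only removes elements,
# and it returns the ones it removed
my @removed = splice(@rainbow, 3, 2);

# prints "Removed: green blue"
print "Removed: @removed\n";

# prints "Left: red yellow orange"
print "Left: @rainbow\n";
[/code]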

Sorting Things Out

You can alter the order of elements within an array with Perl’s various array-sorting functions. The simplest of these is the reverse() function, which returns the elements of an array in reverse order, leaving the original array untouched:

[code]
#!/usr/bin/perl

# define array

@stooges = ("larry", "curly", "moe");

 

# reverse it

@segoots = reverse(@stooges);

 

# reversed array now looks like this

@segoots = ("moe", "curly", "larry");
[/code]

The sort() function returns the elements of an array in sorted order; by default it compares them as strings, which gives alphabetical order for lowercase words like these:

[code]
#!/usr/bin/perl

# define array

@stooges = ("larry", "curly", "moe");

 

# sort it

@sorted = sort(@stooges);

 

# sorted array now looks like this

@sorted = ("curly", "larry", "moe");
[/code]
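Because the default comparison is string-based, numbers don’t come out the way you might expect. The sketch below shows the difference and one standard way around it, a comparison block using the <=> operator; the sample numbers and variable names are mine, as are "use strict", "use warnings" and "my".

[code]
#!/usr/bin/perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

# the default sort compares elements as strings
my @as_strings = sort(@numbers);

# a comparison block with <=> sorts numerically
my @as_numbers = sort { $a <=> $b } @numbers;

# prints "1 10 100 9"
print "@as_strings\n";

# prints "1 9 10 100"
print "@as_numbers\n";
[/code]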

The split() function splits a string into smaller components on the basis of a user-specified pattern, and then returns these elements as an array.

[code]
#!/usr/bin/perl

$str = "I'm not as think as you stoned I am";

 

# split into individual words on a single-space delimiter

# and store in array @words

@words = split (/ /, $str);
[/code]
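The pattern you choose matters when the input isn’t perfectly clean. As a sketch, here is what happens when the string contains runs of spaces: a single-space pattern produces empty elements, while the pattern /\s+/ treats each run of whitespace as one delimiter. The sample string and the @single array are invented for this illustration, and "use strict", "use warnings" and "my" are my additions.

[code]
#!/usr/bin/perl
use strict;
use warnings;

# note the repeated spaces
my $str = "I'm  not as   think";

# a single-space pattern yields empty strings between consecutive spaces
my @single = split(/ /, $str);

# /\s+/ treats any run of whitespace as one delimiter
my @words = split(/\s+/, $str);

# prints "7 pieces versus 4 words"
print scalar(@single), " pieces versus ", scalar(@words), " words\n";
[/code]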

This function is particularly handy if you need to take a string containing a list of items (for example, a comma-delimited list) and separate each element of the list for further processing.

Here’s an example:

[code]
#!/usr/bin/perl

$str = "Rachel,Monica,Phoebe,Joey,Chandler,Ross";

 

# split into individual words and store in array

@arr = split (/,/, $str);

 

# print each element of array

foreach $item (@arr) {
        print("$item\n");
}
[/code]

Obviously, you can also do the reverse – the join() function creates a single string from all the elements of an array, gluing them together with a user-defined separator. Reversing the example above, we have:

[code]
#!/usr/bin/perl

@arr = ("Rachel", "Monica", "Phoebe", "Joey", "Chandler", "Ross");

 

# create string from array

$str = join (" and ", @arr);

 

# returns "Rachel and Monica and Phoebe and Joey and Chandler
# and Ross are friends"

print "$str are friends";
[/code]

And that’s about all I have for the moment. I hope you enjoyed this article, and that it offered you some insight into the types of things you can do with Perl’s arrays. Should you require more information, try “man perlfunc” at your command prompt, or visit the “perlfunc” manual page on the Web. Until next time, stay healthy!

Note: All examples in this article have been tested on Perl 5.8.0. Examples are illustrative only, and are not meant for a production environment. Melonfire provides no warranties or support for the source code described in this article.
