Hash Mania With Perl

Perl hashes are extremely useful data structures that allow us to associate one piece of data to another. In this article, Jasmine will review hashes and introduce some of their more advanced uses.

Perl hashes are extremely useful data structures that allow you to associate one piece of data (called a key) with another (its value). In this article I will review hashes and introduce some of their more advanced uses.

Assigning Key/Value Pairs
Hashes consist of keys and their associated values. Each key and its value are called a pair. There are several ways to insert these pairs into hashes, outlined below.

If you know (at least some of) the key/value pairs that you would like to use, the following is the most straightforward way to assign pairs to hashes:

[code] %hash = (
apples => 6,
oranges => 5,
pears => 3,
grapes => 2,
); [/code]

The above is the more readable way to assign key/value pairs, and easy-to-read code matters. A less readable way to assign keys and values to hashes is below:

[code] %hash = qw(apples 6 oranges 5 pears 3 grapes 2); [/code]

Perl treats the list produced by qw as alternating keys and values, exactly as if you had used the => arrows in the first example. We recommend the first format for readability, though the two can be used interchangeably.
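As a quick check that the two forms really are interchangeable, the sketch below builds the same hash both ways and compares them. The names %arrow_form, %qw_form and $differences are made up for this illustration.

```perl
use strict;
use warnings;

# The four pairs written with arrows...
my %arrow_form = (
    apples  => 6,
    oranges => 5,
    pears   => 3,
    grapes  => 2,
);

# ...and as a flat list of alternating keys and values.
my %qw_form = qw(apples 6 oranges 5 pears 3 grapes 2);

# grep in scalar context counts the keys whose values differ.
my $differences = grep { $arrow_form{$_} != $qw_form{$_} } keys %arrow_form;

print "pairs: ", scalar(keys %qw_form), ", differences: $differences\n";
```

Run as written, this should report four pairs and no differences.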

You can also add each key/value pair individually. The following line adds a new key/value pair to our original hash.

[code] $hash{peach} = 3; [/code]

If the original hash did not exist, this line would have created a new hash and inserted the first key/value pair as defined. The process by which a variable can spring into life like this is called autovivification.

This is useful if you need to loop through a data file and would like to insert data from the file to a hash.

[code] open FILE, "fruits.txt" or die $!;
while (<FILE>){
chomp;
my @line = split(/ /);
$hash{$line[0]} = $line[1];
}
close FILE or die $!; [/code]
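The same loop can be sketched with a lexical filehandle and the three-argument form of open. To keep the example self-contained it reads from an in-memory string rather than fruits.txt; the contents of $data (one "name count" pair per line) are an assumption about what such a file might hold.

```perl
use strict;
use warnings;

# Stand-in for the contents of fruits.txt (assumed format: "name count").
my $data = "apples 6\noranges 5\npears 3\n";

my %hash;

# Opening a reference to a string gives an in-memory filehandle.
open my $fh, '<', \$data or die "Cannot open data: $!";
while (my $line = <$fh>) {
    chomp $line;
    my ($fruit, $count) = split / /, $line;
    $hash{$fruit} = $count;
}
close $fh or die "Cannot close data: $!";

print "$_ = $hash{$_}\n" for sort keys %hash;
```

To read a real file, replace \$data with the file name.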

Removing Pairs from Hashes
Now that we know how to add pairs to hashes, we need to know how to get rid of them. Deleting a pair is as easy as knowing the key of the pair you want deleted:

[code] delete $hash{peach}; [/code]

Now, the pair whose key is peach is gone. But what if you wanted to delete the entire hash? You can either loop through the entire hash and delete each key (inefficient) or you can undef it:

[code] undef %hash; [/code]

Do not use:

[code] %hash = undef; [/code]

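This will not obliterate the hash. Instead, it assigns a single key/value pair whose key is the empty string and whose value is undef (and Perl warns about an odd number of elements if warnings are on). If you want to remove all keys from the hash, but still keep %hash as an “active” variable, use: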

[code] %hash = (); [/code]
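Here is a small sketch of deleting one pair and then emptying the hash, using exists and a key count to confirm each step. The sample data and the plum key are only for illustration.

```perl
use strict;
use warnings;

my %hash = (apples => 6, peach => 3);

# delete returns the value that was removed.
my $removed = delete $hash{peach};
my $status  = exists $hash{peach} ? "still there" : "gone";
print "removed $removed, peach is $status\n";

# Assigning the empty list removes every pair but keeps the variable usable.
%hash = ();
my $left = keys %hash;
print "pairs left: $left\n";

# The emptied hash can be filled again.
$hash{plum} = 1;
my $now = keys %hash;
print "pairs now: $now\n";
```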

Looking inside Hashes
We now know how to add and remove pairs from hashes, but how do we see what pairs are there? As with nearly everything in Perl, TIMTOWTDI (there is more than one way to do it). Here, we’ll look at a few examples of how to loop through hashes and take a peek at what’s inside. These examples assume you’re already familiar with loops.

Using foreach
[code] foreach my $key (keys %hash) {
print "$key = $hash{$key}\n";
} [/code]

The my $key limits the scalar to this loop (and prevents the “Global symbol requires explicit package name” error when running under use strict).

Using map
[code] print map "$_ = $hash{$_}\n", keys %hash; [/code]


Using while/each
[code] while (my ($key, $value) = each %hash){
print "$key = $value\n";
} [/code]

Sorting Hashes
If you’ve actually tested the samples above, you’ll have noticed that the pairs printed out in seemingly random order. This is because Perl stores hashes in an order determined by its internal hashing function, not alphabetically or numerically. But take heart, it’s easy to sort hashes.

There are 3 ways to sort in Perl: ASCIIbetically, numerically or alphabetically.

Every character (number, letter or metacharacter) has an ASCII code associated with it. Upper and lower case letters have separate ASCII codes. For example, the letter A is 065 and the letter a is 097, so A is “less than” a (065 < 097). With this in mind, let's create a simple hash that uses both upper and lower case in its keys:

[code] %hash = (
Apples => 1,
apples => 4,
artichokes => 3,
Beets => 9,
);

foreach my $key (sort keys %hash) {
print "$key = $hash{$key}\n";
} [/code]

The above code will print:

Apples = 1
Beets = 9
apples = 4
artichokes = 3

Because the letter B is 066 in ASCII code, it is “less than” 097, the code for the letter a. This yields some strange results, but you may wish to use it one day :)

“To sort strings without regard to case, run $a and $b through lc before comparing.” The comparison itself uses the cmp operator, so Perl sorts the lower-cased letters and ignores case.

[code] foreach my $key (sort {lc($a) cmp lc($b)} keys %hash) {
print "$key = $hash{$key}\n";
} [/code]

This correctly prints:

Apples = 1
apples = 4
artichokes = 3
Beets = 9
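The two orderings can be compared side by side with the sketch below. One detail is an addition of mine rather than part of the example above: Apples and apples compare as equal once lower-cased, so the second sort falls back to a plain cmp to make their relative order predictable. The array names are hypothetical.

```perl
use strict;
use warnings;

my %hash = (Apples => 1, apples => 4, artichokes => 3, Beets => 9);

# Plain sort compares character codes, so capitals come first.
my @by_code = sort keys %hash;

# Case-insensitive comparison, with a plain cmp as tie-breaker so that
# "Apples" and "apples" always come out in the same order.
my @by_letter = sort { lc($a) cmp lc($b) or $a cmp $b } keys %hash;

print "by code:   @by_code\n";
print "by letter: @by_letter\n";
```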

Hash Slices
“A slice is a section or number of elements from a list, array or hash.” Essentially, you can add or delete key/value pairs en masse using slices, which are written with the @ (at) symbol. To give an example of slices, consider the following:

[code] @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

@nums{@months} = 1..$#months+1; [/code]

Here, we’ve just created a hash named %nums using a hash slice. Each element of the @months array becomes a key, and the values 1 through 12 are assigned to those keys in order. Because @months is in calendar order, each key gets the correct month number.

So you’ve done the hash slice and now want to print out the results to make sure it’s correct.

[code] foreach my $key (sort {$nums{$a} <=> $nums{$b}} keys %nums){
print "$key = $nums{$key}\n";
}; [/code]

Prints:

Jan = 1
Feb = 2
Mar = 3
Apr = 4
May = 5
Jun = 6
Jul = 7
Aug = 8
Sep = 9
Oct = 10
Nov = 11
Dec = 12

We’ve already seen how to sort hashes in Sorting Hashes above, and we’ve just added to it.

[code] sort {$nums{$a} <=> $nums{$b}} [/code]

sorts the hash numerically based on the values of the hash instead of the keys. This way, our months appear in the correct year order.
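Slices work for reading and deleting as well as assigning. The following is a minimal sketch using the same month data; the @summer array and the particular months chosen are just for illustration.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

my %nums;
@nums{@months} = 1 .. scalar @months;    # assign twelve pairs at once

my @summer = @nums{qw(Jun Jul Aug)};     # read several values at once
print "summer months: @summer\n";

delete @nums{qw(Jan Feb)};               # delete several pairs at once
my $left = keys %nums;
print "pairs left: $left\n";
```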

Subbing Out Sorting
Worried that retyping the construct for alphabetical sorting will introduce too many typos? Then “sub” it! Let’s say you use sorting very frequently in your programming and find it cumbersome to type in the above operations each time.

Let’s handle this by example:

[code] foreach my $key (sort ascend_alpha keys %hash){
print "$key = $hash{$key}\n";
} [/code]

You can easily see that the lc($a) cmp lc($b) has been replaced by a subroutine call. Now, let’s consider the following:

[code] sub ascend_num {$a <=> $b}
sub descend_num {$b <=> $a}
sub ascend_alpha {lc($a) cmp lc($b)}
sub descend_alpha {lc($b) cmp lc($a)}
sub ascend_alphanum {$a <=> $b || lc($a) cmp lc($b)}
sub descend_alphanum {$b <=> $a || lc($b) cmp lc($a)} [/code]

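The ascend_alphanum and descend_alphanum routines sort both numerically and alphabetically, so if you added “5 Spice Seasoning”, “10 Star Flour”, and “911 Hot Sauce” to %hash, the keys that begin with a number are sorted numerically by that number, and the rest (which Perl treats as 0 in a numeric comparison) are sorted alphabetically ahead of them. Under warnings, Perl will also complain that those keys are not numeric.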

Apples = 1
apples = 4
artichokes = 3
Beets = 9
canadian = 9
5 Spice Seasoning = 1
10 Star Flour = 1
911 Hot Sauce = 1
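To see the named routines at work without any ties or mixed keys, here is a short sketch that defines three of them and applies each to a small hash. The sample data and the @up, @down and @counts arrays are assumptions made for the example.

```perl
use strict;
use warnings;

# sort sets $a and $b for us, so the subs need no parameters.
sub ascend_alpha  { lc($a) cmp lc($b) }
sub descend_alpha { lc($b) cmp lc($a) }
sub ascend_num    { $a <=> $b }

my %hash = (apples => 4, Beets => 9, artichokes => 3);

my @up     = sort ascend_alpha  keys %hash;
my @down   = sort descend_alpha keys %hash;
my @counts = sort ascend_num    values %hash;

print "up:     @up\n";
print "down:   @down\n";
print "counts: @counts\n";
```

Note that the sub name is given to sort without a comma after it, just like an inline block.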

Working with Hash References
Have a hash reference and don’t want to duplicate the subroutines to deal with them? It’s easy… just pass the dereferenced keys to the sort routines:

[code] $hashref = \%hash;

foreach my $key (sort ascend_alpha keys %{$hashref}){
print "$key = $hashref->{$key}\n";
} [/code]

Notice the hash dereference %{$hashref} and the arrow dereferencer for the value $hashref->{$key}.
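One point worth seeing in action is that a reference points at the original hash rather than a copy. This sketch adds a pair through the reference and then counts the pairs in the hash itself; the sample data is assumed for the illustration.

```perl
use strict;
use warnings;

my %hash    = (apples => 4, Beets => 9);
my $hashref = \%hash;            # a reference to the same hash, not a copy

$hashref->{pears} = 3;           # adds a pair to %hash itself

my @sorted = sort { lc($a) cmp lc($b) } keys %{$hashref};
for my $key (@sorted) {
    print "$key = $hashref->{$key}\n";
}

my $count = keys %hash;
print "pairs in the original hash: $count\n";
```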

Sorting by Hash Values
What if you wanted to sort the values instead of the keys? Consider the following:

[code] foreach my $key (sort {$hash{$a} <=> $hash{$b}} keys %hash){
print "$key = $hash{$key}\n";
} [/code]

This will print out:

Apples = 1
5 Spice Seasoning = 1
10 Star Flour = 1
APples = 2
artichokes = 3
apples = 4
911 Hot Sauce = 4
canadian = 9
Beets = 9
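When several keys share a value, as in the list above, their relative order is not something the comparison controls. A possible refinement, sketched here under assumed sample data, is to fall back to the key when the values are equal. This version also sorts the values from highest to lowest to show the reversed comparison.

```perl
use strict;
use warnings;

my %hash = (apples => 4, artichokes => 3, Beets => 9, canadian => 9);

# Highest value first; keys with equal values are ordered alphabetically.
my @order = sort {
    $hash{$b} <=> $hash{$a}
        or
    lc($a) cmp lc($b)
} keys %hash;

print "$_ = $hash{$_}\n" for @order;
```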

But let’s say you also had text values. Let’s change $hash{'Beets'} to “cans: 4 – 8.oz.” and $hash{'Apples'} to “Delicious Red – 4 medium sized”. You can have the hash sorted numerically and alphabetically by using the following:

[code] foreach my $key (sort {$hash{$a} <=> $hash{$b} || lc($hash{$a}) cmp lc($hash{$b})} keys %hash){
print "$key = $hash{$key}\n";
} [/code]

This will correctly print out:

Beets = cans: 4 – 8.oz.
Apples = Delicious Red – 4 medium sized
5 Spice Seasoning = 1
10 Star Flour = 1
APples = 2
artichokes = 3
apples = 4
911 Hot Sauce = 4
canadian = 9

Multidimensional Hashes
Using key/value pairs is great, but what if you wanted to associate more than one value with a key? Using a slightly different construct, you can store a reference to an array as a key’s value. The square brackets below create those array references:

[code] %hash = (
Apples => [4, "Delicious red", "medium"],
"Canadian Bacon" => [1, "package", "1/2 pound"],
artichokes => [3, "cans", "8.oz."],
Beets => [4, "cans", "8.oz."],
"5 Spice Seasoning" => [1, "bottle", "3 oz."],
"10 Star Flour" => [1, "package", "16 oz."],
"911 Hot Sauce" => [1, "bottle", "8 oz."],
); [/code]

Now, to extract the values, you can index into the array stored under a key:

[code] print $hash{"Canadian Bacon"}[1]; [/code]

…will print package, because package is the second element of Canadian Bacon’s “array”. You can also add a predefined array to a hash value (the square brackets store a copy of the array):

[code] @garlicstuff = (4, "cloves", "medium");
$hash{"Garlic"} = [@garlicstuff];
print $hash{"Garlic"}[1]; # prints cloves [/code]

But what if @garlicstuff had more elements than others? Let’s say that @garlicstuff is

[code] @garlicstuff = (4, "cloves", "medium", "chopped"); [/code]

instead? How do we print out all values for a key if one key can have 3 values, and another has 4 (or more) values?

[code] foreach my $key (sort ascend_alpha keys %hash){
print "$key:\n";
foreach my $val (@{$hash{$key}}){
print " $val\n";
}
print "\n";
} [/code]

Because a multidimensional hash’s values are really references to arrays, a key’s group of values can be dereferenced by using @{$hash{$key}}. The above code prints:

10 Star Flour:
1
package
16 oz.

5 Spice Seasoning:
1
bottle
3 oz.

911 Hot Sauce:
1
bottle
8 oz.

Apples:
4
Delicious red
medium

artichokes:
3
cans
8.oz.

Beets:
4
cans
8.oz.

Canadian Bacon:
1
package
1/2 pound

Garlic:
4
cloves
medium
chopped
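Lists stored this way can also grow after the hash is built. The sketch below appends to an existing list with push and lets Perl create a new list on first use (autovivification again). The Onions entry and its values are invented for the example.

```perl
use strict;
use warnings;

my %hash = (
    Beets  => [4, "cans", "8.oz."],
    Garlic => [4, "cloves", "medium"],
);

push @{ $hash{Garlic} }, "chopped";     # append to an existing list
push @{ $hash{Onions} }, 2, "large";    # autovivifies a new list

for my $key (sort keys %hash) {
    my @values = @{ $hash{$key} };
    my $count  = @values;               # array in scalar context gives its length
    print "$key ($count): ", join(", ", @values), "\n";
}
```

Each key can hold a different number of values, and the loop handles them all the same way.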