This time, Perl 101 visits some of Perl's more useful in-built functions, and teaches you the basics of pattern matching and substitution. Also included is a list of common string and math functions, together with examples of how to use them.
The example you just saw was pretty cut-and-dried - decide search term, decide replacement term, shove both into Perl script, bang bang and Bob's your uncle. But what happens if you need to replace a class of words, rather than a single word? If, for example, you need to replace not just the letter "a", but words beginning with one or more "a" and ending with a "k"...like "aardvark", for example?
The special characters you'll use to modify your pattern are called "metacharacters", which is another of those words that sounds impressive but means absolutely nothing useful. We'll explain some of them here, and point you to a resource for more information a little further down.
Two of the more useful metacharacters in Perl are the "+" and "*" characters, which match "one or more" instances and "zero or more" instances of the preceding pattern respectively.
For example,
/boo+/
would match "book" and "booo", but not "bottle", while
/bo*/
would match all of "bottle", "bog", "book", and "booo".
One common use of Perl pattern-matching is to remove unnecessary blank spaces from a block of text. This is accomplished using the "\s" metacharacter, which matches any whitespace character: spaces, tab stops and newlines. The simple substitution
s/\s+/ /g
would "replace every run of one or more whitespace characters with a single space". The trailing "g" applies the replacement to every match in the string, rather than just the first.
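If you'd like to see these patterns in action, here's a minimal sketch you can run. The sample words come from the examples above; the variable names and the sample sentence are made up purely for illustration.

```perl
use strict;
use warnings;

# "+" needs at least one more "o" after "bo"; "*" is happy with none at all.
for my $word (qw(book booo bottle bog)) {
    my $plus = $word =~ /boo+/ ? "yes" : "no";
    my $star = $word =~ /bo*/  ? "yes" : "no";
    print "$word: /boo+/ $plus, /bo*/ $star\n";
}

# Collapse every run of whitespace (spaces, tabs, newlines) into one space.
my $text = "too   many\t\tblank \n spaces";
(my $clean = $text) =~ s/\s+/ /g;    # copy $text first, then edit the copy
print "$clean\n";                    # too many blank spaces
```

Copying the string into $clean before substituting leaves the original $text untouched, which is handy if you want to compare the two afterwards.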
Perl also has two important "anchor" characters, which allow you to build patterns which specify which characters come at the beginning or end of a string. The ^ character is used to match the beginning of a string, while the $ character is used to match the end. So,
/^h/
would match "hello" and "house", but not "shop",
while
/g$/
would match "dig" and "bog", but not "gold" or "eagle".
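Here's a short sketch that tries both anchors against the words above, and then takes a stab at the "aardvark" question from earlier. The array names are hypothetical, and the last pattern borrows ".*" (meaning "any run of characters"), which this article hasn't covered, so treat it as one possible answer rather than the only one.

```perl
use strict;
use warnings;

# ^ pins the pattern to the start of the string, $ pins it to the end.
my @starts_with_h = grep { /^h/ } qw(hello house shop);
my @ends_with_g   = grep { /g$/ } qw(dig bog gold eagle);

print "@starts_with_h\n";    # hello house
print "@ends_with_g\n";      # dig bog

# Anchors combine with "+": one or more "a" at the start, a "k" at the end.
# The ".*" in the middle (any characters) is an extra not explained above.
my @a_to_k = grep { /^a+.*k$/ } qw(aardvark ask bank apple);
print "@a_to_k\n";           # aardvark ask
```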
And you can even specify more than one pattern to match with the | operator.
/(the|a)/
would return true if the string contained either "the" or "a".
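One thing worth trying for yourself: since this pattern has no anchors, it matches anywhere in the string, so a word like "cat" counts too, because it contains the letter "a". The sketch below uses made-up test phrases and a hypothetical %result hash to show which alternative was actually found.

```perl
use strict;
use warnings;

my %result;
for my $phrase ("the end", "a start", "cat", "word") {
    # The brackets capture whichever alternative matched into $1.
    $result{$phrase} = $phrase =~ /(the|a)/ ? "matched '$1'" : "no match";
    print "$phrase: $result{$phrase}\n";
}
```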
Obviously, regular expressions are a topic in themselves. If you're interested, we've put together a well-written, comprehensive guide to the topic at "So What's A $#!%% Regular Expression, Anyway?!"
This article copyright Melonfire 2000. All rights reserved.