
Posix Character Classes - Perl

In this second part of a four-part series on parsing and regular expression basics in Perl, you'll learn about quantifiers, modifiers, and more. This article is excerpted from chapter one of the book Pro Perl Parsing, written by Christopher M. Frenz (Apress; ISBN: 1590595041).

TABLE OF CONTENTS:
  1. Quantifiers and Other Regular Expression Basics
  2. Predefined Subpatterns
  3. Posix Character Classes
  4. Modifiers
By: Apress Publishing
May 27, 2010


In the previous section, you saw the classic predefined Perl patterns, but more recent versions of Perl also support some predefined subpattern types through a set of Posix character classes. Table 1-3 summarizes these classes, and I outline their usage after the table.

Table 1-3. Posix Character Classes

Posix Class   Pattern
[:alnum:]     Any letter or digit
[:alpha:]     Any letter
[:ascii:]     Any character with a numeric encoding from 0 to 127
[:cntrl:]     Any control character (in ASCII, a numeric encoding less than 32, or 127)
[:digit:]     Any digit from 0 to 9 (\d)
[:graph:]     Any letter, digit, or punctuation character
[:lower:]     Any lowercase letter
[:print:]     Any letter, digit, punctuation, or space character
[:punct:]     Any punctuation character
[:space:]     Any space character (\s)
[:upper:]     Any uppercase letter
[:word:]      Underscore or any letter or digit (\w)
[:xdigit:]    Any hexadecimal digit (that is, 0-9, a-f, or A-F)

 


Note  You can use Posix character classes in conjunction with Unicode text. When doing this, however, keep in mind that a class such as [:alpha:] may match more characters than you expect, since Unicode defines many more letters than ASCII does. The same holds true for the other classes that match letters and digits.
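To make the note concrete, here is a minimal sketch that counts [:alpha:] matches in a string mixing Greek and ASCII letters. The sample string and variable names are assumptions made for illustration, and the /a modifier used for comparison requires Perl 5.14 or later.

```perl
use 5.014;          # the /a modifier used below needs Perl 5.14 or later
use strict;
use warnings;

# Hypothetical sample: Greek alpha and beta followed by an ASCII "c".
my $text = "\x{03B1}\x{03B2}c";

# Default matching on this string uses Unicode rules, so all three
# characters count as letters.
my $unicode_count = () = $text =~ /[[:alpha:]]/g;

# With /a, Posix classes are restricted to the ASCII range, so only
# "c" matches.
my $ascii_count = () = $text =~ /[[:alpha:]]/ag;

print "Unicode rules: $unicode_count\n";
print "ASCII rules:   $ascii_count\n";
```

If you only want ASCII letters, restricting the class this way (or spelling out [A-Za-z]) avoids the surprise the note describes.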


Using a Posix character class is similar to the earlier examples that defined a range of characters, such as [A-F], in that the class must be enclosed in brackets. This is sometimes a point of confusion for people who are new to Posix character classes because, as you saw in Table 1-3, every class name already has brackets. Those brackets are part of the class name, not part of the Perl regex bracket syntax. Thus, you need a second set, as in the following regular expression, which will match zero or more digits:

/[[:digit:]]*/
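As a rough sketch of the double-bracket rule in use, the script below applies Posix classes to a few sample strings. The sample data and variable names are illustrative assumptions and do not come from the book.

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only for illustration.
my $text = "Order 66 shipped on day 7";

# The inner [:digit:] is the class name; the outer [] is the regex
# bracket syntax. The + quantifier requires at least one digit.
my @numbers = $text =~ /([[:digit:]]+)/g;
print "Found: @numbers\n";    # Found: 66 7

# With *, the pattern can match the empty string, so it succeeds even
# on input that contains no digits at all.
my $star_matches = ("abc" =~ /[[:digit:]]*/) ? 1 : 0;
print "Star matches 'abc': $star_matches\n";

# A Posix class can share the outer brackets with other characters,
# here an underscore.
my $is_identifier = ("my_var1" =~ /^[[:alpha:]_][[:alnum:]_]*$/) ? 1 : 0;
print "Identifier: $is_identifier\n";
```

Because * also accepts zero digits, + is usually the safer quantifier when you want to confirm that a digit is actually present.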



 
 