HomeMySQL Page 6 - Optimizing Queries with Operators, Branching and Functions, continued
Pattern-Matching and Regular Expressions - MySQL
This article will give you a good grounding in operators, branching and functions in MySQL, so you can make the database, instead of your own code, do the bulk of the work. It is the second of three parts, and excerpted from chapter four of Beginning MySQL Database Design and Optimization: From Novice to Professional, written by Jon Stephens and Chad Russell (Apress; ISBN: 1590593324).
An often-overlooked feature of MySQL is its ability to perform pattern-matching, including the use of using regular expressions. For simple pattern-matching, the LIKE operator is used, with the following syntax:
string LIKE pattern
This operator yields either a 1 (TRUE) or 0 (FALSE), depending on whether or not a match for pattern is found in string. The pattern argument supports two wildcards:
_ (underscore character), for any single character
% (percent sign), for any group of characters
These wildcards can be used at the beginning, end, or anywhere inside of pattern. Here are some examples:
NOTEUnlike some other databases, MySQL does not support the grouping([ ])or negation(^)operators for use with the LIKE operator.
MySQL also supports full-fledged regular expressions, similar to that implemented in a number of programming languages (using the POSIX 1003.2 standard syntax). A regular expression is a sequence of characters and/or special characters (known as pattern modifiers) that forms a pattern for which a match is sought in a string. In MySQL, you test for a match using the REGEXP operator and this syntax:
string REGEXP pattern
This will evaluate as true if a match is found, and false if it is not.
Regular expressions as used in MySQL accept the pattern modifiers shown in Table 4-2.
Table 4-2. MySQL Regular Expression Pattern Modifiers
*
MODIFIER*
DESCRIPTION
^
Outside any group, marks the beginning of the string to be searched. Within a group, negates the pattern; for example, [^a-e] matches any character except the letters a, b, c, d, or e
$
End of the string to be searched
.
Matches any single character
*
Zero or more occurrences of the preceding group (equivalent to {0, })
Table 4-2. MySQL Regular Expression Pattern Modifiers (Continued)
MODIFIER*
DESCRIPTION
+
One or more occurrences of the preceding group (equivalent to {1, })
?
Zero or one occurrence of the preceding group (equivalent to {0, 1})
|
Branch operator (OR); for example, dog|cat matches either dog or cat
()
Encloses an expression; for example, (aeiou) matches the sequence aeiou
[]
Encloses a set or range; for example, [1-4] matches any one of the digits 1, 2, 3, or 4; [c-g] matches any one of the letters c, d, e, f, or g; [aeiou] matches any one of the letters a, e, i, o, or u. It is possible to combine sets and ranges in a single group; for example, [0-4g] matches any one of the following: 0g, 1g, 2g, 3g, or 4g.
{}
Quantifies the preceding group; {N} indicates that the group must be repeated N times; {N, M} indicates that the group may be repeated from N to M times.
* To match a literal occurrence of any of the characters shown in the Modifier column, escape it with a backslash; for example, [why\?] matches the string “why?” (including the question mark). To match a backslash, use a double backslash: \\ .
The following are some examples of regular expressions.
MySQL doesn’t provide any direct means to return matched expressions (only that a match was or wasn’t found), but it’s still possible to use REGEXP in WHERE clauses to achieve the same effect, as in the next example.
The RLIKE operator is also supported as an alias to REGEXP , and it works in the same way.
TIP For complete details on using regular expressions, see the Unix regexp(7) man page. If you’re a Windows user, you can find this on the Web at http://www.unusualresearch.com/regex/ regexmanpage.htm.
Please check back next week for the conclusion of this article.