Another way that you can save on programming logic in MySQL-based applications, and gain optimization benefits, is by performing string manipulations as part of your queries. MySQL provides a large number of string operations and supports regular expressions as well. First, let’s look at some of the string functions.
MySQL has functions for joining strings, finding and extracting substrings, and performing other useful string operations. To obtain the length of a string (that is, the number of characters it contains), you can use the LENGTH() function for single-byte character sets. For multibyte characters, you must use CHAR_LENGTH() . If internationalization is likely to be a concern in your application, you should definitely use CHAR_LENGTH() .
For joining strings together, you might be tempted to use the + operator. Several programming languages allow you to do this, but when we try it in MySQL, here’s what happens:
In MySQL, the + operator can be used only for adding numbers, and (as we discussed earlier) any strings occurring in an expression involving addition are automatically converted to zero. Instead, use either CONCAT() or CONCAT_WS() .
Both of these functions concatenate strings, but in slightly different ways. CONCAT() takes two or more quoted strings separated by commas and returns a single string containing all of them joined together.
This is a bit better, but is probably not exactly what we had in mind in this case. One possible solution is to insert a new string containing a space in between the strings we’re already using, as shown in this example:
This is a little more like it. But there’s a better way: using the CONCAT_WS() function. The “WS” stands for “with separator” and indicates that the first argument is to be inserted as a separator or delimiter between each of the strings to be joined.
You can see another example of using CONCAT() in the “improved” PHP 4 and Python coding examples in the “Replacing Program Logic with SQL Logic: A Demonstration” section at the beginning of this chapter.
To extract substrings from strings, use SUBSTRING(), whose syntax can take either of the forms shown here:
SUBSTRING(string, position[, length])
This function returns the next length characters of string beginning with the character at position, with the first character in string at position 1. If length is
There are no differences in behavior between the forms using commas and those using FROM and (optionally) FOR. However, the latter syntax is SQL-92 standard and thus more portable between different databases. In the SQL-92 form, the FROM clause is required.
You can use a variation on this function, SUBSTRING_INDEX(), to indicate a starting point for a substring in terms of where a given delimiter occurs within the string. Although it’s not standard and not found in any other widely used database, it can be extremely handy on occasion. The syntax for calling this function is as follows:
SUBSTRING_INDEX(string, delimiter, index)
Here, string represents the string from which you wish to extract a substring, delimiter the delimiter string, and index how many occurrences of the delimiter to count. If index is positive, the delimiter count is from the left, and everything to the left of the final delimiter is returned; if it is negative, the delimiter count is from the right, and the substring returned is taken from the right of this delimiter, as you can see in the following examples.
Another way to extract a substring is by using the RIGHT() function. It takes two arguments—a string and an integer length—and returns the rightmost length characters from string.
For locating a substring within a given string, you can use LOCATE() , INSTR() , and POSITION() . LOCATE() has this form:
LOCATE(substring, string[, position])
The function returns the position at which the first occurrence of substring is found in string. If position is specified, the function returns the position at which the first occurrence of substring is found in string following position.
INSTR() works like LOCATE() with two arguments, except that the order of the arguments is reversed.
The POSITION() function works similarly, except that it has the following syntax:
POSITION(substring IN string)
Prior to MySQL 4.0, the functions LOCATE(), INSTR(), and POSITION() were case-sensitive. Since then, they are case-sensitive only if at least one of the string or substring arguments is a binary string.
To change the case of a string to uppercase, use UCASE() or UPPER() . To convert it to lowercase, use LCASE() or LOWER() .
TRIM([[LEADING | TRAILING | BOTH] [substring] FROM] string)
The best way to explain how this function works is simply to provide a few examples. If substring is not specified, a space character is assumed:
If LEADING, TRAILING, or BOTH is not specified, BOTH is assumed:
The substring may be any group of one or more characters:
LPAD() and RPAD() do more or less the opposite of LTRIM() and RTRIM(): they pad out a string to a given length with one or more padding characters. LPAD() and RPAD() have the same syntax:
LPAD(string, length, padchars)
Both functions pad the string to length using the padchars. The difference is that LPAD() adds characters to the left of the original string before returning the padded string, and RPAD() adds the padding characters to the right. The following are some examples using LPAD(); RPAD() works essentially the same way.
If length is less than the length of string, then string is truncated to length and returned.
To replace all occurrences of one substring with another within a given string, you can use the REPLACE() function, which has the following syntax:
REPLACE(string, substring_from, substring_to)
Here’s an example:
Finally, here are a few miscellaneous string functions you may find useful from time to time:
QUOTE(string): Returns a copy of string in which all quotation marks have been escaped.
REPEAT(string, count): Returns a new string containing string repeated count times. (Returns an empty string if count is less than 1.)
SPACE(count): Returns a string consisting of count spaces. (Returns an empty string if count is less than 1.)
STRCMP(string1, string2): Similar to the C language function of the same name, this returns –1 if string1 sorts before string2, 1 if the reverse is true, and 0 if the two strings are the same. It uses the current character set. This function is case-sensitive before MySQL 4.0; in MySQL 4.0 and later, it is case-insensitive unless at least one of the two strings is binary.
REVERSE(string): Returns a string with all of the characters in string in reverse order. For example, REVERSE ('tram') returns the string 'mart' .
We have not provided an exhaustive listing of all of the string functions supported in MySQL, but instead covered those we feel are likely to be useful in helping you move the processing of strings from your application code into your MySQL queries. For a complete list, see the MySQL Manual.
blog comments powered by Disqus