When you’re processing large amounts of information, the regular expression functions can slow matters dramatically. Use them only when you need to parse relatively complicated strings that genuinely require regular expressions. If you are instead parsing for simple expressions, a variety of predefined functions speed up the process considerably. Each of these functions is described in this section.

Tokenizing a String Based on Predefined Characters

The strtok() function parses a string based on a predefined list of characters. Its prototype follows:

string strtok(string str, string tokens)

One oddity about strtok() is that it must be called repeatedly in order to completely tokenize a string; each call returns only the next piece of the string. However, the str parameter needs to be specified only once, because the function keeps track of its position in str until it either completely tokenizes str or a new str parameter is specified. Its behavior is best explained via an example:

<?php
// delimiters include colon (:), vertical bar (|), and comma (,)
// print out each element in the $tokenized array

This returns the following:

--------------------------------------------

Exploding a String Based on a Predefined Delimiter

The explode() function divides the string str into an array of substrings. Its prototype follows:

array explode(string separator, string str [, int limit])

The original string is divided into distinct elements by splitting it at each occurrence of the string specified by separator. The number of elements can be limited with the optional limit parameter. Let’s use explode() in conjunction with sizeof() and strip_tags() to determine the total number of words in a given block of text:

<?php

This returns the following:

--------------------------------------------

The explode() function will always be considerably faster than preg_split(), split(), and spliti() (the latter two were deprecated in PHP 5.3 and removed in PHP 7.0).
Therefore, always use it instead of the others when a regular expression isn’t necessary.

Note: You might be wondering why the previous code is indented in an inconsistent manner. The multiple-line string was delimited using heredoc syntax, which (prior to PHP 7.3) requires that the closing identifier not be indented by even a single space. Why this restriction is in place is somewhat of a mystery, although one would presume it makes the PHP engine’s job a tad easier when parsing the multiple-line string. See Chapter 3 for more information about heredoc.

Please check back next week for the next part of this article.
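
The strtok() and explode() behavior described in this section can be illustrated with a minimal sketch. This is not the article's original listing: the helper names tokenize() and countWords() and the sample strings are assumptions made purely for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the article): collect every token
// that strtok() yields for $str, given a list of delimiter characters.
function tokenize(string $str, string $delimiters): array
{
    $tokens = [];
    // The first call supplies the string; later calls pass only the
    // delimiters, because strtok() remembers its position in the string.
    $token = strtok($str, $delimiters);
    while ($token !== false) {
        $tokens[] = $token;
        $token = strtok($delimiters);
    }
    return $tokens;
}

// Hypothetical helper (not from the article): count space-separated
// words after stripping HTML tags. Consecutive spaces would produce
// empty elements, so this assumes single spaces between words.
function countWords(string $html): int
{
    $text = trim(strip_tags($html));
    return $text === '' ? 0 : sizeof(explode(' ', $text));
}

// Illustrative input only (an assumption, not data from the article).
// Delimiters include colon (:), vertical bar (|), and comma (,).
$info = "J. Gilmore:jason@example.com|Columbus, Ohio";
foreach (tokenize($info, ":|,") as $element) {
    echo "Element = $element\n";
}

echo "Total words: ", countWords("<p>The <b>quick</b> brown fox</p>"), "\n";
```

Note that strtok() splits only on the listed characters, so the final token here keeps the space that followed the comma. The strict comparison against false matters as well: a token such as "0" would otherwise end the loop early.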