Parsing Strings and Regular Expressions

In this fourth part of a five-part series on strings and regular expressions in PHP, you’ll learn how to perform complex string parsing, find the last occurrence of a string, and more. This article is excerpted from chapter nine of the book Beginning PHP and Oracle: From Novice to Professional, written by W. Jason Gilmore and Bob Bryla (Apress; ISBN: 1590597702).

Converting an Array into a String

Just as you can use the explode() function to divide a delimited string into array elements, you can concatenate array elements to form a single delimited string using the implode() function. Its prototype follows:

string implode(string delimiter, array pieces)

This example forms a string out of the elements of an array:

<?php
    $cities = array("Columbus", "Akron", "Cleveland", "Cincinnati");
    echo implode("|", $cities);
?>

This returns the following:

——————————————–
Columbus|Akron|Cleveland|Cincinnati
——————————————–
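Because implode() is the inverse of explode(), the two are often used together to change a delimiter. The following is a minimal sketch of that round trip; the variable names $line and $list are illustrative only:

```php
<?php
    // Split a pipe-delimited string into an array
    $line = "Columbus|Akron|Cleveland|Cincinnati";
    $cities = explode("|", $line);

    // Join the elements again, this time with a comma and a space
    $list = implode(", ", $cities);

    echo $list;
```

Joining $cities with the original "|" delimiter reproduces $line exactly.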

Performing Complex String Parsing

The strpos() function finds the position of the first case-sensitive occurrence of substr in a string. Its prototype follows:

int strpos(string str, string substr [, int offset])

The optional input parameter offset specifies the position at which to begin the search. If substr is not in str, strpos() will return FALSE. The following example determines the timestamp of the first time index.html is accessed:

<?php
    $substr = "index.html";
    $log = <<< logfile
    192.168.1.11:/www/htdocs/index.html:[2006/02/10:20:36:50]
    192.168.1.13:/www/htdocs/about.html:[2006/02/11:04:15:23]
    192.168.1.15:/www/htdocs/index.html:[2006/02/15:17:25]
logfile;
    // Find the position of the first occurrence of $substr in the log
    $pos = strpos($log, $substr);

    // Find the numerical position of the end of the line
    $pos2 = strpos($log, "\n", $pos);

    // Calculate the beginning of the timestamp
    $pos = $pos + strlen($substr) + 1;

    // Retrieve the timestamp
    $timestamp = substr($log,$pos,$pos2-$pos);

    echo "The file $substr was first accessed on: $timestamp";
?>

This returns the timestamp at which the file index.html was first accessed:

——————————————–
The file index.html was first accessed on: [2006/02/10:20:36:50]

——————————————–

The function stripos() operates identically to strpos(), except that it executes its search case insensitively.
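One detail deserves care when testing the return value of either function: a match at the very start of the string returns position 0, which loosely compares as equal to FALSE. The sketch below, using an illustrative $path variable, shows the strict comparison that avoids this trap, along with the difference between the two functions:

```php
<?php
    $path = "index.html";

    // "index" starts at position 0, which loosely equals FALSE,
    // so test the result with the strict !== operator
    $pos = strpos($path, "index");
    $found = ($pos !== FALSE);

    // stripos() ignores case, so "HTML" matches "html" at position 6
    $ipos = stripos($path, "HTML");

    // strpos() is case sensitive, so the same search fails
    $missing = strpos($path, "HTML");

    echo $found ? "Found at position $pos" : "Not found";
```

Writing the test as if (strpos($path, "index")) would wrongly report that the substring is absent.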

Finding the Last Occurrence of a String

The strrpos() function finds the last occurrence of a string, returning its numerical position. Its prototype follows:

int strrpos(string str, string substr [, int offset])

The optional parameter offset determines the position from which strrpos() will begin searching. Suppose you wanted to pare down lengthy news summaries, truncating the summary and replacing the truncated component with an ellipsis. However, rather than simply cut off the summary explicitly at the desired length, you want it to operate in a user-friendly fashion, truncating at the end of the word closest to the truncation length. This function is ideal for such a task. Consider this example:

<?php
    // Limit $summary to how many characters?
    $limit = 100;

    $summary = <<< summary
    In the latest installment of the ongoing Developer.com PHP series,
    I discuss the many improvements and additions to
    <a href="http://www.php.net">PHP 5's</a> object-oriented
    architecture.
summary;

    if (strlen($summary) > $limit)
        $summary = substr($summary, 0, strrpos(substr($summary, 0, $limit),
               ' ')) . '…';
    echo $summary;
?>

This returns the following:

——————————————–
In the latest installment of the ongoing Developer.com PHP series,
I discuss the many…
——————————————–
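The same technique can be wrapped in a reusable function. The following is only a sketch: truncate_at_word() is a hypothetical helper, not a PHP built-in, and it assumes that a hard cut is acceptable when the text contains no space within the limit (in which case strrpos() returns FALSE):

```php
<?php
    // Hypothetical helper: shorten $text by cutting at the last space
    // found within its first $limit characters, then append "..."
    function truncate_at_word($text, $limit)
    {
        // Short enough already: return the text unchanged
        if (strlen($text) <= $limit) {
            return $text;
        }

        $cut = strrpos(substr($text, 0, $limit), ' ');

        // No space within the limit: fall back to a hard cut
        if ($cut === FALSE) {
            $cut = $limit;
        }

        return substr($text, 0, $cut) . '...';
    }

    echo truncate_at_word("Beginning PHP and Oracle", 15);
```

Here the fifteenth character falls inside the word "and", so the function backs up to the preceding space and prints Beginning PHP... rather than splitting the word.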

Replacing All Instances of a String with Another String

The str_replace() function case sensitively replaces all instances of a string with another. Its prototype follows:

mixed str_replace(string occurrence, mixed replacement, mixed str [, int count])

If occurrence is not found in str, the original string is returned unmodified. If the optional parameter count is supplied, it is passed by reference and set to the number of replacements performed; it does not limit how many occurrences are replaced.

This function is ideal for hiding e-mail addresses from automated e-mail address retrieval programs:

<?php
    $author = "jason@example.com";
    $author = str_replace("@","(at)",$author);
    echo "Contact the author of this article at $author.";
?>

This returns the following:

——————————————–
Contact the author of this article at jason(at)example.com.
——————————————–

The function str_ireplace() operates identically to str_replace(), except that it performs a case-insensitive search.
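The sketch below contrasts the two functions on the same input and shows the optional fourth argument, which PHP fills in with the number of replacements it made. The sample text and variable names are illustrative only:

```php
<?php
    $text = "PHP is fun. php is fast.";

    // The fourth argument is passed by reference and receives
    // the number of replacements performed
    $sensitive = str_replace("PHP", "Perl", $text, $count);

    // str_ireplace() matches regardless of case
    $insensitive = str_ireplace("PHP", "Perl", $text, $icount);

    echo "$sensitive ($count)<br />";
    echo "$insensitive ($icount)";
```

The case-sensitive call replaces only the uppercase occurrence and sets $count to 1, while the case-insensitive call replaces both and sets $icount to 2.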

Retrieving Part of a String

The strstr() function returns the remainder of a string beginning with the first occurrence of a predefined string. Its prototype follows:

string strstr(string str, string occurrence)

This example uses the function in conjunction with the ltrim() function to retrieve the domain name of an e-mail address:

<?php
    $url = "sales@example.com";
    echo ltrim(strstr($url, "@"),"@");
?>

This returns the following:

——————————————–
example.com
——————————————–
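When the search string does not occur at all, strstr() returns FALSE rather than a string, so input that may be malformed should be checked before use. A minimal sketch follows; email_domain() is a hypothetical helper written for this illustration:

```php
<?php
    // Hypothetical helper: return the part of $email after the first "@",
    // or FALSE when the address contains no "@"
    function email_domain($email)
    {
        $rest = strstr($email, "@");

        // strstr() returns FALSE when "@" is not found
        if ($rest === FALSE) {
            return FALSE;
        }

        // Drop the leading "@"
        return substr($rest, 1);
    }

    echo email_domain("sales@example.com");
```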

Returning Part of a String Based on Predefined Offsets

The substr() function returns the part of a string that begins at a predefined starting offset and spans a predefined length. Its prototype follows:

string substr(string str, int start [, int length])

If the optional length parameter is not specified, the substring is considered to be the string starting at start and ending at the end of str. There are four points to keep in mind when using this function:

  1. If start is positive, the returned string will begin at the start position of the string.

  2. If start is negative, the returned string will begin that many characters before the end of the string; for example, a start of -4 begins at the fourth character from the end.

  3. If length is provided and is positive, the returned string will consist of the characters between start and start + length. If this distance surpasses the total string length, only the string between start and the string's end will be returned.

  4. If length is provided and is negative, the returned string will end that many characters before the end of str.

Keep in mind that start is the offset from the first character of str; therefore, the returned string will actually start at character position start + 1. Consider a basic example:

<?php
    $car = "1944 Ford";
    echo substr($car, 5);
?>

This returns the following:

——————————————–
Ford
——————————————–

The following example uses the length parameter:

<?php
    $car = "1944 Ford";
    echo substr($car, 0, 4);
?>

This returns the following:

——————————————–
1944
——————————————–

The final example uses a negative length parameter:

<?php
    $car = "1944 Ford";
    echo substr($car, 2, -5);
?>

This returns the following:

——————————————–
44
——————————————–
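None of the examples above uses a negative start value, so the following sketch fills that gap using the same string. The variable names are illustrative only:

```php
<?php
    $car = "1944 Ford";

    // Negative start: begin four characters before the end
    $model = substr($car, -4);

    // Negative start combined with a positive length
    $first = substr($car, -4, 1);

    // Positive start combined with a negative length:
    // skip the first character and stop one character before the end
    $middle = substr($car, 1, -1);

    echo "$model / $first / $middle";
```

This prints Ford / F / 944 For.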

Determining the Frequency of a String's Appearance

The substr_count() function returns the number of times one string occurs within another. Its prototype follows:

int substr_count(string str, string substring)

The following example determines the number of times an IT consultant uses various buzzwords in his presentation:

<?php
    $buzzwords = array("mindshare", "synergy", "space");

    $talk = <<< talk
    I’m certain that we could dominate mindshare in this space with
    our new product, establishing a true synergy between the marketing
    and product development teams. We’ll own this space in three months.
talk;

    foreach($buzzwords as $bw) {
        echo "The word $bw appears ".substr_count($talk,$bw)." time(s).<br />";
    }
?>

This returns the following:

——————————————–
The word mindshare appears 1 time(s).
The word synergy appears 1 time(s).
The word space appears 2 time(s).
——————————————–
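Note that substr_count() is case sensitive, so a buzzword capitalized at the start of a sentence would not be counted. One possible workaround, sketched here with an illustrative sample string, is to lowercase the text before counting:

```php
<?php
    $talk = "Synergy drives results. Without synergy, teams stall.";

    // Case-sensitive count: the capitalized "Synergy" is not matched
    $exact = substr_count($talk, "synergy");

    // Lowercasing the text first counts both spellings
    $any = substr_count(strtolower($talk), "synergy");

    echo "Exact: $exact, any case: $any";
```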

Replacing a Portion of a String with Another String

The substr_replace() function replaces a portion of a string with a replacement string, beginning the substitution at a specified starting position and ending at a predefined replacement length. Its prototype follows:

string substr_replace(string str, string replacement, int start [, int length])

If length is omitted, everything from start to the end of str is replaced. There are several behaviors you should keep in mind regarding the values of start and length:

  1. If start is positive, replacement will begin at character start.

  2. If start is negative, replacement will begin that many characters before the end of str.

  3. If length is provided and is positive, the replaced portion of str will be length characters long.

  4. If length is provided and is negative, the replaced portion will end that many characters before the end of str.

Suppose you built an e-commerce site and within the user profile interface you want to show just the last four digits of the provided credit card number. This function is ideal for such a task:

<?php
    $ccnumber = "1234567899991111";
    echo substr_replace($ccnumber,"************",0,12);
?>

This returns the following:

——————————————–
************1111
——————————————–
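The negative forms of start and length are easier to follow with concrete values. The following sketch reuses the same card number; the variable names $masked, $hideLast, and $spaced are illustrative only:

```php
<?php
    $ccnumber = "1234567899991111";

    // Negative length: stop replacing four characters before the end
    $masked = substr_replace($ccnumber, str_repeat("*", 12), 0, -4);

    // Negative start with no length: replace the last four characters
    $hideLast = substr_replace($ccnumber, "XXXX", -4);

    // Length of 0: insert the replacement without removing anything
    $spaced = substr_replace($ccnumber, " ", 4, 0);

    echo "$masked<br />$hideLast<br />$spaced";
```

The first call produces the same masked number as the example above, the second yields 123456789999XXXX, and the third yields 1234 567899991111.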

Please check back next week for the conclusion to this article.
