Converting Strings and Regular Expressions

In this third part of a five-part series on strings and regular expressions in PHP, you will learn how to convert strings to and from HTML, and more. This article is excerpted from chapter nine of the book Beginning PHP and Oracle: From Novice to Professional, written by W. Jason Gilmore and Bob Bryla (Apress; ISBN: 1590597702).

TABLE OF CONTENTS:
  1. Converting Strings and Regular Expressions
  2. Using Special HTML Characters for Other Purposes
  3. Creating a Customized Conversion List
  4. Alternatives for Regular Expression Functions
By: Apress Publishing
July 01, 2010

Converting Strings to and from HTML

Converting a string or an entire file into a form suitable for viewing on the Web (and vice versa) is easier than you would think. Several functions are suited for such tasks, all of which are introduced in this section.

Converting Newline Characters to HTML Break Tags

The nl2br() function inserts the XHTML-compliant break tag, <br />, before every newline (\n) character in a string; the newline characters themselves are left in place. Its prototype follows:

string nl2br(string str)

The newline characters can come from literal line breaks typed into the string (by pressing Enter), or they can be written explicitly into the string as \n. The following example translates a text string to HTML format:

<?php
    $recipe = "3 tablespoons Dijon mustard
    1/3 cup Caesar salad dressing
    8 ounces grilled chicken breast
    3 cups romaine lettuce";

    // convert the newlines to <br />'s.
    echo nl2br($recipe);
?>

Executing this example results in the following output:

--------------------------------------------
3 tablespoons Dijon mustard<br />
1/3 cup Caesar salad dressing<br />
8 ounces grilled chicken breast<br />
3 cups romaine lettuce
--------------------------------------------
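
As a complementary sketch, the newlines can also be written explicitly as escape sequences rather than typed as literal line breaks. The sample text and the variable names $note and $html below are made up for illustration. The snippet also shows that nl2br() keeps the original newline characters and handles Windows-style \r\n pairs as a single line ending:

```php
<?php
// Hypothetical sample text: newlines written explicitly as \n and \r\n.
$note = "Line one\nLine two\r\nLine three";

// nl2br() inserts <br /> before each line ending and keeps the line ending.
$html = nl2br($note);

echo $html;
// Output (the original newlines are still present after each <br />):
// Line one<br />
// Line two<br />
// Line three
```

Because the newlines survive, the generated markup stays readable when you view the page source.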

Converting Special Characters to their HTML Equivalents

During the general course of communication, you may come across many characters that are not included in a document’s text encoding, or that are not readily available on the keyboard. Examples of such characters include the copyright symbol (©), the cent sign (¢), and the letter e with a grave accent (è). To work around such shortcomings, a set of universal key codes was devised, known as character entity references. When these entities are parsed by the browser, they are converted into their recognizable counterparts. For example, the three aforementioned characters would be represented as &copy;, &cent;, and &egrave;, respectively.

To perform these conversions, you can use the htmlentities() function. Its prototype follows:

string htmlentities(string str [, int quote_style [, string charset]])

Because of the special nature of quote marks within markup, the optional quote_style parameter offers the opportunity to choose how they will be handled. Three values are accepted:

ENT_COMPAT: Convert double quotes and ignore single quotes. This is the default.

ENT_NOQUOTES: Ignore both double and single quotes.

ENT_QUOTES: Convert both double and single quotes.
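
The three settings can be compared side by side. This is a minimal sketch: the sample sentence and the variable names ($sample, $compat, $noquotes, $quotes) are invented for illustration, and the quote_style is passed explicitly each time so the result does not depend on the default of the PHP version in use:

```php
<?php
// Hypothetical sample containing double quotes, a single quote, and an ampersand.
$sample = "She said \"hi\" & it's fine";

// Convert double quotes only.
$compat = htmlentities($sample, ENT_COMPAT);
// She said &quot;hi&quot; &amp; it's fine

// Leave both kinds of quotes untouched.
$noquotes = htmlentities($sample, ENT_NOQUOTES);
// She said "hi" &amp; it's fine

// Convert both double and single quotes.
$quotes = htmlentities($sample, ENT_QUOTES);
// She said &quot;hi&quot; &amp; it&#039;s fine

echo $compat, "\n", $noquotes, "\n", $quotes, "\n";
```

Note that the ampersand is converted in all three cases; the setting affects only the quote marks.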

A second optional parameter, charset, determines the character set used for the conversion. Table 9-2 offers the list of supported character sets. If charset is omitted, it defaults to ISO-8859-1 in the PHP versions covered here; PHP 5.4 and later default to UTF-8.

Table 9-2. htmlentities()’s Supported Character Sets

Character Set   Description
BIG5            Traditional Chinese
BIG5-HKSCS      BIG5 with additional Hong Kong extensions, traditional Chinese
cp866           DOS-specific Cyrillic character set
cp1251          Windows-specific Cyrillic character set
cp1252          Windows-specific character set for Western Europe
EUC-JP          Japanese
GB2312          Simplified Chinese
ISO-8859-1      Western European, Latin-1
ISO-8859-15     Western European, Latin-9
KOI8-R          Russian
Shift-JIS       Japanese
UTF-8           ASCII-compatible multibyte 8-bit Unicode encoding
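
The effect of the charset parameter is easiest to see when it does not match the actual encoding of the string. The following is a rough sketch rather than a recommended pattern: the variable names are illustrative, and the accented character is written as explicit byte escapes so the result does not depend on how the script file itself is saved:

```php
<?php
// "Café" encoded two ways; byte escapes make the encoding explicit.
$utf8   = "Caf\xC3\xA9"; // é as two UTF-8 bytes
$latin1 = "Caf\xE9";     // é as one ISO-8859-1 byte

// Matching charset: both produce the same entity.
$fromUtf8   = htmlentities($utf8, ENT_COMPAT, 'UTF-8');        // Caf&eacute;
$fromLatin1 = htmlentities($latin1, ENT_COMPAT, 'ISO-8859-1'); // Caf&eacute;

// Mismatched charset: each UTF-8 byte is treated as a separate Latin-1 character.
$mismatch = htmlentities($utf8, ENT_COMPAT, 'ISO-8859-1');     // Caf&Atilde;&copy;

echo $fromUtf8, "\n", $fromLatin1, "\n", $mismatch, "\n";
```

If converted output contains unexpected pairs such as &Atilde;&copy;, a charset mismatch of this kind is a likely cause.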

The following example converts the necessary characters for Web display:

<?php
    $advertisement = "Coffee at 'Cafè Française' costs $2.25.";
    echo htmlentities($advertisement);
?>

This returns the following:

--------------------------------------------
Coffee at 'Caf&egrave; Fran&ccedil;aise' costs $2.25.
--------------------------------------------

Two characters are converted: the e with a grave accent (è) and the c with a cedilla (ç). The single quotes are left untouched because of the default quote_style setting, ENT_COMPAT.



 
 