
Testing the Word Class - PHP

In this final part of a three-part series on unit testing, we discuss the use of graphical interfaces, unit testing in a web environment, and more. The article is excerpted from chapter six of the book Advanced PHP Programming, written by George Schlossnagle (Sams; ISBN: 0672325616).

TABLE OF CONTENTS:
  1. Graphical Interfaces and Unit Testing
  2. Testing the Word Class
  3. Bug Report 1
  4. Unit Testing in a Web Environment
By: Sams Publishing
November 02, 2006


Let's start by writing a test to count the number of syllables in a word:

<?php
require "PHPUnit/Framework/TestSuite.php";
require "PHPUnit/TextUI/TestRunner.php";
require "Text/Word.inc";

class Text_WordTestCase extends PHPUnit_Framework_TestCase {
  public $known_words = array('the' => 1,
                              'late' => 1,
                              'frantic' => 2,
                              'programmer' => 3);
  public function __construct($name) {
    parent::__construct($name);
  }
  public function testKnownWords() {
    foreach ($this->known_words as $word => $syllables) {
      $obj = new Text_Word($word);
      $this->assertEquals($syllables, $obj->numSyllables());
    }
  }
}
$suite = new PHPUnit_Framework_TestSuite('Text_WordTestCase');
PHPUnit_TextUI_TestRunner::run($suite);
?>

Of course this test immediately fails because you don't even have a Word class, but you will take care of that shortly. The interface used for Word is just what seemed obvious. If it ends up being insufficient to count syllables, you can expand it.

The next step is to implement the class Word that will pass the test:

<?php
class Text_Word {
  public $word;
  public function __construct($name) {
    $this->word = $name;
  }
  protected function mungeWord($scratch) {
    // lower case for simplicity
    $scratch = strtolower($scratch);
    return $scratch;
  }
  // public, because the test case calls this method from outside the class
  public function numSyllables() {
    $scratch = $this->mungeWord($this->word);
    // Split the word on the vowels. a e i o u, and for us always y
    $fragments = preg_split("/[^aeiouy]+/", $scratch);
    // Clean up both ends of our array if they have null elements
    if(!$fragments[0]) {
      array_shift($fragments);
    }
    if(!$fragments[count($fragments) - 1]) {
      array_pop($fragments);
    }
    return count($fragments);
  }
}
?>
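
To see what the split actually produces, here is a minimal standalone sketch, assuming PHP 7 or later. The helper name vowelGroups() is hypothetical and not part of Text_Word. It passes the PREG_SPLIT_NO_EMPTY flag to preg_split() as a substitute for the manual cleanup of empty elements at both ends; without the flag, splitting frantic yields an empty string, a, i, and another empty string.

```php
<?php
// Hypothetical helper (not in the article): split a word on runs of
// non-vowels, so that what remains are its vowel groups.
function vowelGroups(string $word): array
{
    // PREG_SPLIT_NO_EMPTY drops the empty strings that preg_split() returns
    // when the word starts or ends with consonants.
    return preg_split('/[^aeiouy]+/', strtolower($word), -1, PREG_SPLIT_NO_EMPTY);
}

foreach (['the', 'late', 'frantic', 'programmer'] as $word) {
    $groups = vowelGroups($word);
    printf("%-10s %d (%s)\n", $word, count($groups), implode(', ', $groups));
}
```

Counting vowel groups this way gives 1 for the, 2 for frantic, and 3 for programmer, but 2 for late, because the trailing e is counted as its own group.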

This set of rules breaks for late. When an English word ends in an e alone, it rarely counts as a syllable of its own (in contrast to, say, y, or ie). You can correct this by removing a trailing e if it exists. Here's the code for that:

function mungeWord($scratch) { 
$scratch = strtolower($scratch);
$scratch = preg_replace("/e$/", "", $scratch);
return $scratch; 
}

The test now breaks for the, which has no vowels left when you drop the trailing e. You can handle this by ensuring that numSyllables() always returns at least one syllable. Here's how:

function numSyllables() {
  $scratch = $this->mungeWord($this->word);
  // Split the word on the vowels. a e i o u, and for us always y
  $fragments = preg_split("/[^aeiouy]+/", $scratch);
  // Clean up both ends of our array if they have null elements
  if(!$fragments[0]) {
    array_shift($fragments);
  }
  if(!$fragments[count($fragments) - 1]) {
    array_pop($fragments);
  }
  // A word with no vowel groups left still counts as one syllable
  if(count($fragments)) {
    return count($fragments);
  }
  else {
    return 1;
  }
}

When you expand the word list a bit, you see that you have some bugs still, especially with nondiphthong multivowel sounds (such as ie in alien and io in biography). You can easily add tests for these rules:

<?php
require_once "Text/Word.inc";
require_once "PHPUnit/Framework/TestSuite.php";

class Text_WordTestCase extends PHPUnit_Framework_TestCase {
  public $known_words = array('the' => 1,
                              'late' => '1',
                              'hello' => '2',
                              'frantic' => '2',
                              'programmer' => '3');
  public $special_words = array('absolutely' => 4,
                                'alien' => 3,
                                'ion' => 2,
                                'tortion' => 2,
                                'gracious' => 2,
                                'lien' => 1,
                                'syllable' => 3);
  function __construct($name) {
    parent::__construct($name);
  }
  public function testKnownWords() {
    foreach ($this->known_words as $word => $syllables) {
      $obj = new Text_Word($word);
      $this->assertEquals($syllables, $obj->numSyllables(),
                          "$word has incorrect syllable count");
    }
  }
  public function testSpecialWords() {
    foreach ($this->special_words as $word => $syllables) {
      $obj = new Text_Word($word);
      $this->assertEquals($syllables, $obj->numSyllables(),
                          "$word has incorrect syllable count");
    }
  }
}
if(realpath($_SERVER['PHP_SELF']) == __FILE__) {
  require_once "PHPUnit/TextUI/TestRunner.php";
  $suite = new PHPUnit_Framework_TestSuite('Text_WordTestCase');
  PHPUnit_TextUI_TestRunner::run($suite);
}
?>

This is what the test yields now:

PHPUnit 1.0.0-dev by Sebastian Bergmann.

..F

Time: 0.00660002231598
There was 1 failure:
1) TestCase text_wordtestcase->testspecialwords() failed: absolutely has incorrect syllable count
expected 4, actual 5

FAILURES!!!
Tests run: 2, Failures: 1, Errors: 0.

To fix this error, you start by adding an additional check to numSyllables() that adds a syllable for the io and ie sounds, adds a syllable for the two-syllable able, and deducts a syllable for the silent e in absolutely. Here's how you do this:

<?php
function countSpecialSyllables($scratch) {
  $additionalSyllables = array(
    '/\wlien/', // alien but not lien
    '/bl$/',    // syllable
    '/io/',     // biography
  );
  $silentSyllables = array(
    '/\wely$/', // absolutely but not ely
  );
  $mod = 0;
  foreach( $silentSyllables as $pat ) {
    if(preg_match($pat, $scratch)) {
      $mod--;
    }
  }
  foreach( $additionalSyllables as $pat ) {
    if(preg_match($pat, $scratch)) {
      $mod++;
    }
  }
  return $mod;
}
function numSyllables() {
  if($this->_numSyllables) {
    return $this->_numSyllables;
  }
  $scratch = $this->mungeWord($this->word);
  // Split the word on the vowels. a e i o u, and for us always y
  $fragments = preg_split("/[^aeiouy]+/", $scratch);
  if(!$fragments[0]) {
    array_shift($fragments);
  }
  if(!$fragments[count($fragments) - 1]) {
    array_pop($fragments);
  }
  $this->_numSyllables += $this->countSpecialSyllables($scratch);
  if(count($fragments)) {
    $this->_numSyllables += count($fragments);
  }
  else {
    $this->_numSyllables = 1;
  }
  return $this->_numSyllables;
}
?>

The test is close to finished now, but tortion and gracious are both two-syllable words. The check for io was too aggressive. You can counterbalance this by adding -ion and -iou to the list of silent syllables:

function countSpecialSyllables($scratch) {
  $additionalSyllables = array(
    '/\wlien/', // alien but not lien
    '/bl$/',    // syllable
    '/io/',     // biography
  );
  $silentSyllables = array(
    '/\wely$/', // absolutely but not ely
    '/\wion/',  // to counter the io match
    '/iou/',
  );
  $mod = 0;
  foreach( $silentSyllables as $pat ) {
    if(preg_match($pat, $scratch)) {
      $mod--;
    }
  }
  foreach( $additionalSyllables as $pat ) {
    if(preg_match($pat, $scratch)) {
      $mod++;
    }
  }
  return $mod;
}
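
As a quick sanity check outside PHPUnit, the rules so far can be collapsed into one standalone function and run against the article's two word lists. This is only a sketch under stated assumptions: it targets PHP 7 or later, the name countSyllables() is hypothetical, and it uses PREG_SPLIT_NO_EMPTY and max() in place of the class's manual array cleanup and if/else floor.

```php
<?php
// Hypothetical standalone version of the syllable rules described above.
function countSyllables(string $word): int
{
    // Lower-case the word and drop a trailing silent e.
    $scratch = preg_replace('/e$/', '', strtolower($word));

    // Count vowel groups (a e i o u, and always y).
    $count = count(preg_split('/[^aeiouy]+/', $scratch, -1, PREG_SPLIT_NO_EMPTY));

    // Patterns that add a syllable and patterns that remove one.
    $additionalSyllables = ['/\wlien/', '/bl$/', '/io/'];
    $silentSyllables     = ['/\wely$/', '/\wion/', '/iou/'];

    foreach ($additionalSyllables as $pat) {
        if (preg_match($pat, $scratch)) {
            $count++;
        }
    }
    foreach ($silentSyllables as $pat) {
        if (preg_match($pat, $scratch)) {
            $count--;
        }
    }

    // Every word has at least one syllable.
    return max(1, $count);
}

$expected = [
    'the' => 1, 'late' => 1, 'hello' => 2, 'frantic' => 2, 'programmer' => 3,
    'absolutely' => 4, 'alien' => 3, 'ion' => 2, 'tortion' => 2,
    'gracious' => 2, 'lien' => 1, 'syllable' => 3,
];
foreach ($expected as $word => $syllables) {
    $actual = countSyllables($word);
    printf("%-12s expected %d, actual %d\n", $word, $syllables, $actual);
}
```

Each pattern is applied at most once per word, so a word that contains the same special sound twice would still be adjusted by only one syllable.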

The Word class passes the tests, so you can proceed with the rest of the implementation and calculate the number of words and sentences. Again, you start with a test case:

<?php
require_once "PHPUnit/Framework/TestCase.php";
require_once "Text/Statistics.inc";

class TextTestCase extends PHPUnit_Framework_TestCase {
  public $sample;
  public $object;
  public $numSentences;
  public $numWords;
  public $numSyllables;
  public function setUp() {
    $this->sample = " Returns the number of words in the analyzed text
file or block. A word must consist of letters a-z with at least
one vowel sound, and optionally an apostrophe or a hyphen.";
    $this->numSentences = 2;
    $this->numWords = 31;
    $this->numSyllables = 45;
    $this->object = new Text_Statistics($this->sample);
  }
  function __construct($name) {
    parent::__construct($name);
  }
  function testNumSentences() {
    $this->assertEquals($this->numSentences, $this->object->numSentences);
  }
  function testNumWords() {
    $this->assertEquals($this->numWords, $this->object->numWords);
  }
  function testNumSyllables() {
    $this->assertEquals($this->numSyllables, $this->object->numSyllables);
  }
}
if(realpath($_SERVER['PHP_SELF']) == __FILE__) {
  require_once "PHPUnit/Framework/TestSuite.php";
  require_once "PHPUnit/TextUI/TestRunner.php";
  $suite = new PHPUnit_Framework_TestSuite('TextTestCase');
  PHPUnit_TextUI_TestRunner::run($suite);
}
?>

You've chosen tests that implement exactly the statistics you need to be able to calculate the Flesch score of a text block. You manually calculate the "correct" values, for comparison against the soon-to-be class. Especially with functionality such as collecting statistics on a text document, it is easy to get lost in feature creep. With a tight set of tests to code to, you should be able to stay on track more easily.
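
For reference, the Flesch score used in this article combines those three statistics as 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words). The following sketch, assuming PHP 7 or later, plugs in the expected values from the test case above. The function name fleschScore() is hypothetical, and the guard against zero counts is an addition for the sketch only.

```php
<?php
// Hypothetical helper: the Flesch formula as a pure function.
function fleschScore(int $numWords, int $numSentences, int $numSyllables): float
{
    // Both ratios divide by a count, so neither count may be zero.
    if ($numWords === 0 || $numSentences === 0) {
        throw new InvalidArgumentException('Need at least one word and one sentence.');
    }
    return 206.835
        - 1.015 * ($numWords / $numSentences)
        - 84.6 * ($numSyllables / $numWords);
}

// Expected values from the test case: 31 words, 2 sentences, 45 syllables.
printf("%.2f\n", fleschScore(31, 2, 45)); // prints 68.30
```

Longer sentences and longer words both lower the score, which is why the three counts are all the class needs to collect.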

Now let's take a first shot at implementing the Text_Statistics class:

<?php
require_once "Text/Word.inc";
class Text_Statistics {
  public $text = '';
  public $numSyllables = 0;
  public $numWords = 0;
  public $uniqWords = 0;
  public $numSentences = 0;
  public $flesch = 0;
  // words already seen, keyed by the lower-cased word
  protected $_uniques = array();
  public function __construct($block) {
    $this->text = $block;
    $this->analyze();
  }
  protected function analyze() {
    $lines = explode("\n", $this->text);
    foreach($lines as $line) {
      $this->analyze_line($line);
    }
    $this->flesch = 206.835 -
                    (1.015 * ($this->numWords / $this->numSentences)) -
                    (84.6 * ($this->numSyllables / $this->numWords));
  }
  protected function analyze_line($line) {
    preg_match_all("/\b(\w[\w'-]*)\b/", $line, $words);
    foreach($words[1] as $word) {
      $word = strtolower($word);
      $w_obj = new Text_Word($word);
      $this->numSyllables += $w_obj->numSyllables();
      $this->numWords++;
      // count each distinct word once, the first time it is seen
      if(!isset($this->_uniques[$word])) {
        $this->_uniques[$word] = 1;
        $this->uniqWords++;
      }
    }
    preg_match_all("/[.!?]/", $line, $matches);
    $this->numSentences += count($matches[0]);
  }
}
?>

How does this all work? First, you feed the text block to the analyze() method. analyze() calls the explode() function to split the document on newlines, creating an array, $lines, of all the individual lines in the document. Then you call analyze_line() on each of those lines. analyze_line() uses the regular expression /\b(\w[\w'-]*)\b/ to break the line into words. This regular expression matches the following:

\b       # a zero-space word break
(        # start capture
\w       # a single letter or number
[\w'-]*  # zero or more alphanumeric characters plus 's or -s
         # (to allow for hyphenations and contractions)
)        # end capture, now $words[1] is our captured word
\b       # a zero-space word break

For each of the words that you capture via this method, you create a Word object and extract its syllable count. After you have processed all the words in the line, you count the number of sentence-terminating punctuation characters by counting the number of matches for the regular expression /[.!?]/.
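
The two regular expressions can be tried on a single line in isolation. This is a minimal sketch, assuming PHP 7 or later; the function name lineStats() and the sample sentence are made up for illustration and are not part of Text_Statistics.

```php
<?php
// Hypothetical helper: apply the word and sentence patterns to one line.
function lineStats(string $line): array
{
    // $words[1] holds the captured words, including contractions and hyphenations.
    preg_match_all("/\b(\w[\w'-]*)\b/", $line, $words);
    // $marks[0] holds every sentence-terminating punctuation character.
    preg_match_all('/[.!?]/', $line, $marks);

    return [
        'words'        => array_map('strtolower', $words[1]),
        'numSentences' => count($marks[0]),
    ];
}

$stats = lineStats("Don't panic. The well-known test passes!");
print_r($stats);
```

Here Don't and well-known each come back as a single word, giving six words and two sentences. Because every period counts, an abbreviation such as e.g. would be tallied as two sentence endings.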

When all your tests pass, you're ready to push the code to an application testing phase. Before you roll up the code to hand off for quality assurance, you need to bundle all the testing classes into a single harness. With PHPUnit::TestHarness, which you wrote earlier, this is a simple task:

<?php
require_once "TestHarness.php";
require_once "PHPUnit/TextUI/TestRunner.php";
$suite = new TestHarness();
$suite->register("Text/Word.inc");
$suite->register("Text/Statistics.phpt");
PHPUnit_TextUI_TestRunner::run($suite);
?>

In an ideal world, you would now ship your code off to a quality assurance team that would put it through its paces to look for bugs. In a less perfect world, you might be saddled with testing it yourself. Either way, any project of even this low level of complexity will likely have bugs.



 
 