
Improve PHP Captcha with Optical Character Recognition Tests

If you're working on a captcha system for your PHP-based website, you may be faced with an interesting challenge. How do you make your system too hard for spam bots to read, but not too hard for humans? This is especially worrying in the wake of bots that can harness OCR for reading captchas. This article explains how to increase the difficulty of a captcha system and test it to make sure it meets your requirements.

TABLE OF CONTENTS:
  1. Improve PHP Captcha with Optical Character Recognition Tests
  2. Adding background noise to captcha
  3. Increasing captcha difficulty
  4. The final script
By: Codex-M
August 02, 2010


There are many standard captcha solutions available that will work with PHP, including reCaptcha and Asirra captcha.

However, some PHP developers may need to build their own captcha solution. Doing so gives them full flexibility and control, and it teaches the operating principle behind a captcha (Completely Automated Public Turing test to tell Computers and Humans Apart). With that understanding, if the captcha produces an undesirable result (such as being too difficult for humans), the developer can adjust it without defeating its objective of separating bots from humans.

This article focuses on increasing the difficulty of a PHP-based captcha system and checking it with optical character recognition (OCR) tests. Most spammers and anti-captcha spam bots use OCR technology to crack captchas. A captcha that OCR technology can read easily is effectively worthless.

Presentation of the Problem

Consider the existing PHP captcha script (antibot.php) below:

//Start PHP session
session_start();

//Generate random number

//Store the generated random number in a session

//Create image 50 x 50 pixels

//Initial background and text color of the captcha image

//Write the string on the image

//Output the image
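
The steps listed in the comments above can be sketched as follows. This is a minimal illustration, not the author's original code. It assumes the GD extension is enabled; the five-digit code range, the black-on-white colors, the built-in font, the session key antibot_code, and the function names are all assumptions made for the example.

```php
<?php
// Sketch of antibot.php. Requires the GD extension.

// Generate a random number and return it as a string.
// The five-digit range is an assumption made for this example.
function antibot_generate_code(): string
{
    return (string) random_int(10000, 99999);
}

// Draw the code on a 50 x 50 image and return the raw PNG bytes.
function antibot_render_png(string $code): string
{
    $image = imagecreatetruecolor(50, 50);

    // Initial background and text color (assumed: black on white).
    $background = imagecolorallocate($image, 255, 255, 255);
    $textColor  = imagecolorallocate($image, 0, 0, 0);
    imagefilledrectangle($image, 0, 0, 49, 49, $background);

    // Write the string with GD's built-in font 5 (9 x 15 pixels per character).
    imagestring($image, 5, 2, 17, $code, $textColor);

    // Capture the PNG output in a buffer instead of printing it directly.
    ob_start();
    imagepng($image);
    return (string) ob_get_clean();
}

// Serve the image only for web requests, so the functions above can also
// be loaded from the command line without sending any output.
if (PHP_SAPI !== 'cli') {
    session_start();
    $code = antibot_generate_code();
    $_SESSION['antibot_code'] = $code;   // hypothetical session key
    header('Content-Type: image/png');
    echo antibot_render_png($code);
}
```

Plain digits in a fixed font on a clean background are the easiest possible input for an OCR engine, which is why a script of this shape makes a useful baseline.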

 

 

 

<form>
<!--Display the captcha image on the browser-->
<img src="antibot.php" />
<br />
Type the anti-bot code above:
<br /> <br />
<input type="text" name="captcha" size="10">
</form>
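
When the form is submitted, the typed value has to be compared with the number stored in the session. Here is a minimal sketch of that check. It assumes the form is sent with method="post" to a handler script and that the captcha script stored its code under a session key named antibot_code. Both are assumptions, since the form above specifies neither, and the function name is hypothetical.

```php
<?php
// Hypothetical helper: true only when the submitted text matches the stored code.
function antibot_is_valid(?string $submitted, ?string $expected): bool
{
    if ($submitted === null || $expected === null || $expected === '') {
        return false;
    }
    // hash_equals() compares two strings in constant time.
    return hash_equals($expected, trim($submitted));
}

// Run the check only for POST requests served over HTTP.
if (PHP_SAPI !== 'cli' && ($_SERVER['REQUEST_METHOD'] ?? '') === 'POST') {
    session_start();
    $expected = $_SESSION['antibot_code'] ?? null;   // assumed session key
    unset($_SESSION['antibot_code']);                // each code is valid for one attempt
    $submitted = $_POST['captcha'] ?? null;          // matches name="captcha" in the form

    $ok = antibot_is_valid(
        is_string($submitted) ? $submitted : null,
        is_string($expected) ? $expected : null
    );
    echo $ok ? 'Code accepted.' : 'Wrong code, please try again.';
}
```

Discarding the stored code after each attempt keeps a bot from guessing repeatedly against the same image.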

 

 

 

How easy is this captcha for a bot to defeat? I tested it with Tesseract, an excellent open source optical character recognition engine that also powers an online OCR tool. I took three image samples of this captcha's output and uploaded each one to the online OCR tool. I obtained the following result:

 

A good OCR engine can break this captcha perfectly. If you use this captcha on your website, you risk being compromised by a spam bot that uses such an engine.
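
The same kind of test can be repeated locally instead of through an online tool. The sketch below assumes the Tesseract command-line program is installed and on the PATH, and that captcha samples were saved to files such as sample1.png (a hypothetical name). The function names are illustrative. The --psm 7 option (treat the image as a single line of text) is the spelling used by recent Tesseract releases; older ones spell it -psm.

```php
<?php
// Build the shell command that asks Tesseract to print the recognized text.
function ocr_command(string $imagePath): string
{
    return 'tesseract ' . escapeshellarg($imagePath) . ' stdout --psm 7';
}

// Compare OCR output with the real code, ignoring whitespace and newlines.
function ocr_reads_code(string $ocrOutput, string $expectedCode): bool
{
    return preg_replace('/\s+/', '', $ocrOutput) === $expectedCode;
}

// Run Tesseract on one saved sample; true means the OCR engine broke the captcha.
function ocr_breaks_captcha(string $imagePath, string $expectedCode): bool
{
    $output = shell_exec(ocr_command($imagePath));
    return is_string($output) && ocr_reads_code($output, $expectedCode);
}

// Example use (file name and code are placeholders):
// var_dump(ocr_breaks_captcha('sample1.png', '12345'));
```

Running this over a batch of samples after each change to the captcha gives a rough measure of how often the OCR engine still reads it correctly.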



 
 