Using Relevance Rankings for Full Text and Boolean Searches with MySQL
By: Alejandro Gervasio
    2007-06-13


    Table of Contents:
  • Using Relevance Rankings for Full Text and Boolean Searches with MySQL
  • Developing a basic MySQL-driven search engine
  • Determining the 50 percent threshold
  • Building an additional example



    Using Relevance Rankings for Full Text and Boolean Searches with MySQL - Determining the 50 percent threshold
    ( Page 3 of 4 )

    As I stated in the previous section, it's important to know how MySQL handles relevance rankings. This leads me straight to a feature called the 50% threshold.

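    Basically, this means that if a search word is present in 50 percent or more (hence the name) of the rows in the searched table, MySQL considers it too common to be useful. The word matches nothing, so it contributes a relevance of 0, and the rows that contain it drop out of any result set filtered on the match. The threshold applies to natural-language full-text searches; Boolean-mode searches do not use it.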
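
    To make the rule concrete, here is a minimal sketch in plain PHP that mimics the check on an in-memory list of texts. It is only an illustration, not MySQL's actual implementation: the exceedsHalfThreshold() function and the sample rows are hypothetical, and the comparison follows the MySQL manual's wording of "50% or more of the rows".

```php
<?php
// Hypothetical helper (not part of MySQL or of the article's classes):
// reports whether $term appears in 50% or more of the given row texts.
function exceedsHalfThreshold(array $rows, $term)
{
    if (count($rows) === 0) {
        return false;
    }
    $term = strtolower($term);
    $matches = 0;
    foreach ($rows as $text) {
        // split the text into lowercase words
        $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array($term, $words, true)) {
            $matches++;
        }
    }
    // "50% or more of the rows" counts as too common
    return ($matches / count($rows)) >= 0.5;
}

// Made-up sample data: 'mysql' appears in 2 of 4 rows, 'java' in 1 of 4
$rows = array(
    'I love MySQL and PHP',
    'MySQL is my favorite database',
    'I prefer working with Java',
    'Python is a great language',
);

var_dump(exceedsHalfThreshold($rows, 'mysql')); // bool(true)
var_dump(exceedsHalfThreshold($rows, 'java'));  // bool(false)
```

    With four rows, a word only needs to appear in two of them to be treated as too common, which is why small test tables often produce surprising results.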

    So, if you combine the 50% threshold with the elimination of noisy words (stopwords), you'll have a clear idea of how MySQL discards search terms with low relevance from the very beginning, which also speeds up the execution of search queries noticeably.
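
    The elimination of noisy words can be pictured with a short sketch as well. The code below is an assumption-laden illustration, not MySQL's own logic: usableSearchTerms() is a hypothetical function, the stopword array is a tiny sample rather than MySQL's much longer built-in list, and the minimum length of 4 mirrors the MyISAM default of the ft_min_word_len setting.

```php
<?php
// Tiny sample list for illustration only; MySQL's built-in list is far longer.
$sampleStopwords = array('about', 'from', 'have', 'that', 'this', 'with');

// Assumed minimum word length (4 is the MyISAM default for ft_min_word_len).
$minWordLength = 4;

// Hypothetical helper: returns the words of $input that would survive
// the length check and the stopword check.
function usableSearchTerms($input, array $stopwords, $minLength)
{
    $terms = preg_split('/\W+/', strtolower($input), -1, PREG_SPLIT_NO_EMPTY);
    $kept = array();
    foreach ($terms as $term) {
        if (strlen($term) < $minLength) {
            continue; // too short to be indexed
        }
        if (in_array($term, $stopwords, true)) {
            continue; // noisy word
        }
        $kept[] = $term;
    }
    return $kept;
}

// 'about' and 'with' are in the sample stopword list, 'php' is too short
print_r(usableSearchTerms('Tips about PHP with MySQL', $sampleStopwords, $minWordLength));
// keeps 'tips' and 'mysql'
```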

    Now that you have learned a bit of the theory behind the 50% threshold, let me show you a concrete example in which a search term that is present in at least 50% of the table rows is automatically disregarded by MySQL.

    To illustrate how this works, I'm going to use the same source files that were shown in the previous section, so this specific example is easier to follow.

    That being said, here are the source files in question:

    (definition of form.htm file)

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title>Testing the MySQL 50% threshold</title>
    <style type="text/css">
    body{
      padding: 0;
      margin: 0;
      background: #fff;
    }

    h1{
      font: bold 16px Arial, Helvetica, sans-serif;
      color: #000;
      text-align: center;
    }

    p{
      font: bold 11px Tahoma, Arial, Helvetica, sans-serif;
      color: #000;
    }

    #formcontainer{
      width: 40%;
      padding: 10px;
      margin-left: auto;
      margin-right: auto;
      background: #6cf;
    }
    </style>
    </head>
    <body>
      <h1>Testing the MySQL 50% threshold</h1>
      <div id="formcontainer">
        <form action="search.php" method="get">
          <p>Enter search term here : <input type="text" name="searchterm" title="Enter search term here" /><input type="submit" name="search" value="Search Now!" /></p>
        </form>
      </div>
    </body>
    </html>

    (definition of search.php file)

    <?php
    // NOTE: this code relies on the legacy mysql_* functions (ext/mysql),
    // which were removed in PHP 7.0, so it runs as written only on PHP 5.

    // define 'MySQL' class
    class MySQL{
      private $conId;
      private $host;
      private $user;
      private $password;
      private $database;
      private $result;
      const OPTIONS=4;
      public function __construct($options=array()){
        if(count($options)!=self::OPTIONS){
          throw new Exception('Invalid number of connection parameters');
        }
        foreach($options as $parameter=>$value){
          if(!$value){
            throw new Exception('Invalid parameter '.$parameter);
          }
          $this->{$parameter}=$value;
        }
        $this->connectDB();
      }
      // connect to MySQL
      private function connectDB(){
        if(!$this->conId=mysql_connect($this->host,$this->user,$this->password)){
          throw new Exception('Error connecting to the server');
        }
        if(!mysql_select_db($this->database,$this->conId)){
          throw new Exception('Error selecting database');
        }
      }
      // expose the connection identifier to 'Result' objects
      // (the property itself is private)
      public function getConId(){
        return $this->conId;
      }
      // run query
      public function query($query){
        if(!$this->result=mysql_query($query,$this->conId)){
          throw new Exception('Error performing query '.$query);
        }
        return new Result($this,$this->result);
      }
      // escape a value using the character set of the open connection
      public function escapeString($value){
        return mysql_real_escape_string($value,$this->conId);
      }
    }

    // define 'Result' class
    class Result {
      private $mysql;
      private $result;
      public function __construct($mysql,$result){
        $this->mysql=$mysql;
        $this->result=$result;
      }
      // fetch row
      public function fetchRow(){
        return mysql_fetch_assoc($this->result);
      }
      // count rows
      public function countRows(){
        if(!$rows=mysql_num_rows($this->result)){
          return false;
        }
        return $rows;
      }
      // count affected rows
      public function countAffectedRows(){
        if(!$rows=mysql_affected_rows($this->mysql->getConId())){
          throw new Exception('Error counting affected rows');
        }
        return $rows;
      }
      // get ID from last-inserted row
      public function getInsertID(){
        if(!$id=mysql_insert_id($this->mysql->getConId())){
          throw new Exception('Error getting ID');
        }
        return $id;
      }
      // seek row
      public function seekRow($row=0){
        if(!is_int($row)||$row<0){
          throw new Exception('Invalid result set offset');
        }
        if(!mysql_data_seek($this->result,$row)){
          throw new Exception('Error seeking data');
        }
      }
    }

    try{
      // connect to MySQL
      $db=new MySQL(array('host'=>'host','user'=>'user','password'=>'password','database'=>'database'));
      // escape the submitted search term (empty string if none was sent)
      $searchterm=$db->escapeString(isset($_GET['searchterm'])?$_GET['searchterm']:'');
      // natural-language full-text search; there is no WHERE clause,
      // so every row is returned together with its relevance
      $result=$db->query("SELECT firstname, MATCH(firstname,lastname,comments) AGAINST('$searchterm') AS relevance FROM users");
      if(!$result->countRows()){
        echo 'No results were found.';
      }
      else{
        echo '<h2>Users returned are the following:</h2>';
        while($row=$result->fetchRow()){
          echo '<p>Name: '.$row['firstname'].' Relevance: '.$row['relevance'].'</p>';
        }
      }
    }

    catch(Exception $e){
      echo $e->getMessage();
      exit();
    }
    ?>

    So far, so good. Since the above source files should be very familiar to you, pay close attention to the results that the previous search query outputs when the search term "mysql" is entered in the web form.

    // PHP file displays the following output
    /*
    Users returned are the following:

    Name: Alejandro Relevance: 0

    Name: John Relevance: 0

    Name: Susan Relevance: 0

    Name: Julie Relevance: 0
    */

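    As you can see, MySQL has disregarded the search term, since it is present in two of the four table rows, which is enough to trigger the 50% threshold. Every row therefore gets a relevance of 0; the rows are still listed only because the query has no WHERE clause to filter them out. Now, are you starting to grasp the logic behind the 50% threshold? I bet you are!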
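
    For comparison, the sketch below builds two variations of the query used in search.php. It is only an illustration under stated assumptions: buildSearchQuery() is a hypothetical helper that is not part of the classes shown earlier, and it expects a search term that has already been escaped. The first variation adds a WHERE clause so that rows with no match are left out; the second adds IN BOOLEAN MODE, which, according to the MySQL manual, does not apply the 50% threshold.

```php
<?php
// Hypothetical helper: builds the full-text query against the 'users' table.
// $escapedTerm must already be escaped for use inside an SQL string.
function buildSearchQuery($escapedTerm, $booleanMode = false)
{
    $mode  = $booleanMode ? ' IN BOOLEAN MODE' : '';
    $match = "MATCH(firstname,lastname,comments) AGAINST('" . $escapedTerm . "'" . $mode . ")";
    // the WHERE clause keeps only matching rows; ORDER BY sorts them explicitly
    return "SELECT firstname, $match AS relevance FROM users WHERE $match ORDER BY relevance DESC";
}

// natural-language mode: a term at the 50% threshold matches no rows
echo buildSearchQuery('mysql'), "\n";

// Boolean mode: the 50% threshold is not applied
echo buildSearchQuery('mysql', true), "\n";
```

    With the four-row table used above, the first query should return no rows for "mysql", while the Boolean-mode query should return the rows that contain it.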

    All right, at this point I think you understand how MySQL disregards search terms based on the 50% threshold. Thus, it's time to move on to the last section of this tutorial, where I'm going to set up an additional example to further clarify how the threshold works.

    To see how this final example will be built, click on the link below and keep reading.



     
     