Administration
  Home arrow Administration arrow Page 2 - Site Search with HTDIG
Dev Shed Forums  
Administration  
AJAX  
Apache  
BrainDump  
DHTML  
Flash  
Java  
JavaScript  
Multimedia  
MySQL  
Oracle  
Perl  
PHP  
Practices  
Python  
Reviews  
Security  
Smartphone Development  
Style-Sheets  
Web Services  
XML  
Zend  
Zope  
Mobile Linux  
App Generation ROI  
IBM® developerWorks  
Forums Sitemap  
E-Commerce Hosting  
Linux Web Hosting  
Managed Hosting  
Small Business Hosting  
VPS Hosting  
Weekly Newsletter

 
Developer Updates  
Free Website Content 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us Get Paid  
Request Media Kit
Contact Us  
Site Map  
Privacy Policy  
Support  
 USERNAME
 
 PASSWORD
 
 
  >>> SIGN UP!  
  Lost Password? 
Google.com  
ADMINISTRATION

Site Search with HTDIG
By: icarus, (c) Melonfire
  • Search For More Articles!
  • Disclaimer
  • Author Terms
  • Rating: starstarstarstarstar / 20
    2004-04-12


    Table of Contents:
  • Site Search with HTDIG
  • Digging Deep
  • Source Control
  • Script Barf
  • Variable Control
  • A Well-Formed Plan
  • What You See
  • Custom Job
  • Out With The Old
  • Caveat Emptor
  • Ending The Dig

  • Rate this Article: Poor Best 
      ADD THIS ARTICLE TO:
      error-file:tidyout.log Del.ici.ous error-file:tidyout.log Digg
      error-file:tidyout.log Blink error-file:tidyout.log Simpy
      error-file:tidyout.log Google error-file:tidyout.log Spurl
      error-file:tidyout.log Y! MyWeb error-file:tidyout.log Furl
    Email Me Similar Content When Posted
    Add Developer Shed Article Feed To Your Site
    Email Article To Friend
    Print Version Of Article
    PDF Version Of Article

     
     
    ADVERTISEMENT


    Site Search with HTDIG - Digging Deep
    ( Page 2 of 11 )

    In the words of its official website ht://Dig is "a complete world wide web indexing and searching system for a domain or intranet...meant to cover the search needs for a single company, campus, or even a particular sub section of a web site." ht://Dig was originally developed at San Diego State University, and is today very popular amongst developers looking to quickly add search engine capabilities to a Web site.

    ht://Dig works by traversing a Web site and creating a database of all the unique words it finds as it follows hyperlinks from one page to another. This database, together with information on the URL associated with each document, is created every time you request a re-indexing of the site, and is merged with the results of previous index runs to create the foundation for the search engine.

    Every time a search is executed, this database is scanned for matches to the search string and a list of results retrieved. The matches are further ranked according to an internal scoring system to filter down to the most relevant, and the results returned to the user, together with links to the pages on which the matches occurred. The process, though somewhat complicated, is nonetheless extremely fast and -- thanks to intelligent search algorithms and scoring systems -- also very accurate.

    ht://Dig also supports Boolean searches, which make it possible to selectively widen or close a search; fuzzy searching, in which the search is automatically expanded to include similar-sounding words, synonyms and plurals; depth-limited searching, in which only documents which are at a particular depth from the tree root are searched; and META-tag indexing for more accurate search results. Both search and result pages can be extensively customized in the ht://Dig system, and -- since the source code is freely available under the GPL --developers can even modify and enhance the application to their own specific needs.

    Now that you have the background - let's get to work, by installing and configuring ht://Dig.



     
     
    >>> More Administration Articles          >>> More By icarus, (c) Melonfire
     

       

    ADMINISTRATION ARTICLES

    - Network Booting via PXE: the Basics
    - Scalix: Linux Administrator`s Guide
    - Network Administration with FreeBSD 7
    - Components of an Information Architecture
    - The Anatomy of an Information Architecture
    - Configuring Load-Balanced Clusters
    - Load-Balanced Clusters
    - UNIX Time Format Demystified
    - Making Changes in the CVS
    - Building Your First CVS Repository
    - CVS Quickstart Guide
    - Authorizing Users in Samba
    - Handling User Accounts in Samba
    - Authentication in Samba
    - Accounts, Authentication, and Authorization





    © 2003-2009 by Developer Shed. All rights reserved. DS Cluster 5 Hosted by Hostway
    For more Enterprise Application Development news, visit eWeek