Before ht://Dig can search your site, it has to index it.
ht://Dig retrieves HTML documents over HTTP and gathers information from them, which it later uses to answer searches. In this way, it works very much like a search robot or web spider.
[Note: ht://Dig can also operate locally, indexing a site through the local file system. While this is a faster way of indexing a site, it doesn't work very well for dynamic pages. Why? Well, it would index the source of a PHP file, not the output that file generates. And that's not what you want.]
Every time you make a change to your site, you'll want to re-index it. So it's probably a good idea to write a little shell script that indexes your site for you, and add it to your crontab. Here is rundig.sh, a script that does just that for the SummerWorks site, and emails me the details. The changes you need to make for your site should be obvious.
#! /bin/sh
if [ "$1" = "-v" ]; then
verbose="-v"
fi
# This is the directory where htdig lives
BASEDIR=/usr/local/htdig
# This is the db dir
DBDIR=$BASEDIR/db/sw98
# This is the directory htdig will use for temporary sort files
TMPDIR=/tmp
export TMPDIR
# This is the name of a temporary report file
REPORT=$TMPDIR/htdig.sw98
# This is who gets the report
REPORT_DEST="you@your-email-address.com"
export REPORT_DEST
# This is the subject line of the report
SUBJECT="ht://Dig Report for SW98"
# This is the name of the conf file to use
CONF=sw98.conf
# This is the PATH used by this script. Change it if you have problems
# with not finding wc or grep.
PATH=/usr/local/bin:/usr/bin:/bin
##### Dig phase
STARTTIME=`date`
echo Start time: $STARTTIME
echo rundig: Start time: $STARTTIME > $REPORT
$BASEDIR/bin/htdig $verbose -s -a -c $BASEDIR/conf/$CONF >> $REPORT
TIME=`date`
echo Done Digging: $TIME
echo rundig: Done Digging: $TIME >> $REPORT
##### Merge Phase
$BASEDIR/bin/htmerge $verbose -s -a -c $BASEDIR/conf/$CONF >> $REPORT
TIME=`date`
echo Done Merging: $TIME
echo rundig: Done Merging: $TIME >> $REPORT
##### Cleanup Phase
# To enable htnotify, uncomment the following line
# $BASEDIR/bin/htnotify $verbose >>$REPORT
# Build the endings fuzzy-search database (comment out the following line to skip it)
$BASEDIR/bin/htfuzzy $verbose -c $BASEDIR/conf/$CONF endings
# Move the work files
mv $DBDIR/db.wordlist.work $DBDIR/db.wordlist
mv $DBDIR/db.docdb.work $DBDIR/db.docdb
mv $DBDIR/db.docs.index.work $DBDIR/db.docs.index
mv $DBDIR/db.words.db.work $DBDIR/db.words.db
END=`date`
echo End time: $END
echo rundig: End time: $END >> $REPORT
echo
# Grab the important statistics from the report file
# All lines begin with htdig: or htmerge: or rundig:
fgrep "htdig:" $REPORT
echo
fgrep "htmerge:" $REPORT
echo
fgrep "rundig:" $REPORT
echo
# Read the file on stdin so wc prints only the count, not the file name
WC=`wc -l < $REPORT`
echo Total lines in $REPORT: $WC
# Send out the report ...
mail -s "$SUBJECT - $STARTTIME" $REPORT_DEST < $REPORT
# ... and clean up
rm $REPORT
Run this from the command line with the -v switch and you can watch as your site is indexed! You'll need to run this as root (or the same user you installed ht://Dig as) so that it can create the necessary files in /usr/local/htdig/db.
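Once the script runs cleanly by hand, scheduling it is a one-line crontab entry. Here is a minimal sketch, assuming you saved the script as /usr/local/htdig/rundig.sh (a hypothetical location, use your own) and that a nightly run at 3:15 a.m. suits your site. Standard output is discarded because the script already mails its own report. The line that actually installs the entry is left commented out so you can check the result first.

```shell
#!/bin/sh
# Hypothetical location of the script; change this to wherever you saved rundig.sh.
RUNDIG=/usr/local/htdig/rundig.sh

# Crontab fields: minute hour day-of-month month day-of-week command.
# This runs the script at 03:15 every day (the schedule is an assumption).
CRON_LINE="15 3 * * * $RUNDIG > /dev/null 2>&1"

# Show the entry that would be installed.
echo "$CRON_LINE"

# To install it, append the entry to the current user's existing crontab:
# ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```

Install the entry in the crontab of the same user you run the script as by hand, so the job has permission to write to the db directory.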