Tutorial on how to install and configure htDig search for your web site. Htdig retrieves HTML documents using the HTTP protocol and gathers information from them, which can later be used to search these documents.

Frequently Asked Questions

We’ve heard all the arguments anyway. For other causes of segmentation faults, or in other programs, getting a stack backtrace after the fault can be useful in narrowing down the problem. It also converts various PDF encodings to the Latin 1 set.

If you change the search.

This bug is fixed in version 3. If you’d like to make a feature request, you can do so through the ht: This should be fixed in versions 3. In addition to installing doc2html. You can apply the patch by entering into the main source directory for htdig. The next step is to configure ht: This is done by setting the locale attribute; see question 5.

You will also need to redefine the synonyms file if you wish to use the synonyms search algorithm. Of course this will require more memory to read the larger file.


Thus far, the previous examples have assumed a Web site consisting of static HTML pages as the base for ht: The documentation for the most recent stable release is always posted at www.

The most recent exception to this was version 3. In any case, you must figure out the reason htdig keeps revisiting the same documents using different URLs, as explained in question 4. Note that this is only necessary for CGI input parameters, not for the corresponding configuration attributes in your htdig.
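As a rough aid for working out why the same documents show up under different URLs, the sketch below groups a list of URLs by a simplified form so that likely aliases stand out. The function names `normalize` and `find_aliases`, the sample URLs, and the normalization rules (lower-casing the host, dropping the query string, and stripping a trailing `index.html`) are all assumptions made for illustration; they do not describe how htdig itself compares URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url, default_doc="index.html"):
    """Reduce a URL to a simplified form (illustrative rules, not htdig's own)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Treat ".../index.html" and ".../" as the same document (an assumption).
    if path.endswith("/" + default_doc):
        path = path[: -len(default_doc)]
    # Drop the query string and fragment, lower-case the scheme and host.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def find_aliases(urls):
    """Return {simplified URL: [original URLs]} for groups with more than one member."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(vals) for key, vals in groups.items() if len(vals) > 1}


if __name__ == "__main__":
    # Hypothetical URLs, for example collected by hand from a verbose htdig run.
    seen = [
        "http://Example.com/docs/index.html",
        "http://example.com/docs/",
        "http://example.com/docs/?sid=1",
        "http://example.com/other.html",
    ]
    for key, originals in find_aliases(seen).items():
        print(key, "<-", originals)
```

Each group that comes back points at one kind of variation (a session parameter, a default document name, mixed-case host names) that may be worth checking against your own site and configuration.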


For the definitive reference on this issue, please refer to section B. For help with troubleshooting, see questions 5.

Note also that some UNIX systems and libc5-based Linux systems just don’t have a working implementation of locales, so you may not be able to get locales working at all on certain systems. Development is in progress to improve cache performance. Since we all have other jobs, it may take a while before someone gets back to you.

So you are free to use ht: A large number of users insist on ignoring that last point and try to make do with just one definition, either for htdig or htsearch, or sometimes for both.

The config file is selected by the config input field in the search form. Check your search form. Yes, though you may find it easier to have one larger database and use restrict or exclude fields on searches. You will need to take a close look at the htdig -vvv or -vvvv output to see what htdig is finding, in and around the areas where the desired links are supposed to be found in your HTML code, to see if it’s actually finding them.
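Verbose output from htdig -vvv can be long, so it may help to save it to a file and pull out only the lines near the link you expected htdig to find. The following is a minimal sketch of that idea; the function name `lines_with_context` and the sample text are invented for illustration and do not reproduce htdig's real output format.

```python
def lines_with_context(log_text, needle, context=2):
    """Return the lines that mention `needle`, plus `context` lines on each side."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if needle in line:
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    # Preserve the original order of the selected lines.
    return [lines[i] for i in sorted(keep)]


if __name__ == "__main__":
    # Made-up text standing in for saved verbose output.
    sample = "first line\nsecond line\nsomething about products.html\nfourth line\nfifth line"
    for line in lines_with_context(sample, "products.html", context=1):
        print(line)
```

In practice you would read the saved output from a file and pass part of the missing link's URL as `needle`. If nothing comes back, htdig may never have seen the link at all in the HTML it parsed.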


HtDig will provide an on-site web search capability. There’s a compile-time macro you can set in htsearch. Changing configuration variables can also help cut down on disk usage. This usually has to do with the default document size limit. This is usually an indication of a problem with the database.

A collection of these is available from Geoff Kuenning’s International Ispell Dictionaries page, and we’re slowly building a collection of word lists on our web site. When running from the command-line, try “-vvv” in addition to any other flags. The default value for this attribute is “index.

What you’re seeing are problems related to the Berkeley DB library. You can find out the version number of an installed ht: This is a known bug in 3.