Ask Doctor Search: Using a Thesaurus as a Knowledge Base
Volume 2 Number 4 - January 2005
Using a custom thesaurus in Verity K2 is a powerful way to provide relevant results for your users, especially when your organization has a specialized vocabulary that your end users may not know well. The problem is that you either need to count on your users knowing enough to use the <THESAURUS> operator in their queries, or you need to do your own query tweaking to expand user queries. The former is not likely, and the latter adds to your query processing.
Dr. Search points out that there are other options, including topic sets, but even these need an operator at query time. Using a little-known trick from the old days, Dr. Search will show you how to apply a thesaurus to every query, in a way that is compatible with term highlighting and never requires you to post-process user queries. The trick? A knowledge base can reference a thesaurus.
Let's say that your control file, listed in Figure 1, is in D:\Data\kb:
$control:1
synonyms:
{
list: "founders,abe,phil,john,michael,dave,cliff"
list: "ceo,anthony,philippe,mike"
}
Figure 1: people.ctl
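If your synonym lists live somewhere else (a spreadsheet, a database), you may prefer to generate the control file rather than edit it by hand. The following is a minimal Python sketch that renders synonym groups in the layout shown in Figure 1. The function name build_ctl and the lowercasing of terms are assumptions made for illustration; they are not part of Verity K2.

```python
def build_ctl(groups):
    """Render synonym groups in the control-file layout shown in Figure 1.

    `groups` is a list of lists of terms. Lowercasing and trimming the
    terms is an assumption of this sketch, not a documented requirement.
    """
    lines = ["$control:1", "synonyms:", "{"]
    for terms in groups:
        # Each group becomes one comma-separated list: entry.
        joined = ",".join(t.strip().lower() for t in terms)
        lines.append('list: "{}"'.format(joined))
    lines.append("}")
    return "\n".join(lines) + "\n"


groups = [
    ["founders", "abe", "phil", "john", "michael", "dave", "cliff"],
    ["ceo", "anthony", "philippe", "mike"],
]
ctl_text = build_ctl(groups)
print(ctl_text)
```

Saving ctl_text as people.ctl should give you a file equivalent to Figure 1.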
Compile the file from the same directory:
D:\Data\kb> mksyd -f people.ctl -syd people.syd
You'll find you now have a compiled thesaurus file, D:\Data\kb\people.syd. Normally, you would replace vdk30.syd with this file, but not if you want to use the thesaurus terms in every query! Instead, create a knowledge base control file that points to the compiled thesaurus, as listed in Figure 2:
$control:1
kbases:
{
kb: "kb1"
/kb-path = "D:\\Data\\kb\\people.syd"
}
Figure 2: kb1.kbm
It is critical that you use your path separators very carefully. On both Windows and Unix
platforms, you can use a single forward slash "/" character. However, if you use the backslash
separator "\" standard on Windows, be sure to use two of them as illustrated
in Figure 2! Using a single "\" will mean your knowledge base is not used.
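Because the doubled backslash is easy to get wrong, one option is to generate the file. The sketch below doubles each backslash in a Windows path and leaves forward slashes untouched, then renders the layout from Figure 2. The helper names kbm_path_literal and build_kbm are hypothetical, and the sketch assumes the path you pass in uses single separators.

```python
def kbm_path_literal(path):
    """Double every backslash so the path is safe inside the kbm file.

    Assumes `path` uses single separators; forward slashes pass through
    unchanged, as they are accepted on both Windows and Unix.
    """
    return path.replace("\\", "\\\\")


def build_kbm(kb_name, syd_path):
    """Render a knowledge base control file in the layout of Figure 2."""
    lines = [
        "$control:1",
        "kbases:",
        "{",
        'kb: "{}"'.format(kb_name),
        '/kb-path = "{}"'.format(kbm_path_literal(syd_path)),
        "}",
    ]
    return "\n".join(lines) + "\n"


# The raw string keeps the single backslashes of the real Windows path.
kbm_text = build_kbm("kb1", r"D:\Data\kb\people.syd")
print(kbm_text)
```

The printed text should match Figure 2, with each "\" in the path written twice.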
Start the K2 dashboard and sign in as an administrator. If you want to apply the thesaurus to all searches on a given K2 server instance, select that server name at the main dashboard menu. From the Action pull-down, select "Expert Settings" and locate the Knowledge Base Path entry. Enter the fully qualified name of the knowledge base control file we created earlier - D:\Data\kb1.kbm. Click Modify, then return to the server instance and do a full restart.
Now you're ready to test! In our example here, any document that contains the words Abe, Phil, John, Michael, Dave or Cliff will be returned when you use the query "founders". If you are displaying documents with highlighting, the terms will be displayed just as if you had entered a search for the synonym, and you didn't need to use the awkward <THESAURUS> operator.
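To sanity-check what you expect to see before running real queries, a toy model can help. The Python sketch below does not call Verity K2 at all. It simply parses the list entries from Figure 1, treats each list as a group of interchangeable terms (an assumption of this sketch), and reports which sample documents would match an expanded query. The sample documents and all function names are hypothetical.

```python
import re

# Control file text copied from Figure 1.
CTL = '''$control:1
synonyms:
{
list: "founders,abe,phil,john,michael,dave,cliff"
list: "ceo,anthony,philippe,mike"
}
'''


def parse_synonym_groups(ctl_text):
    """Extract each list: "a,b,c" entry as a list of lowercase terms."""
    groups = []
    for match in re.finditer(r'list:\s*"([^"]*)"', ctl_text):
        terms = [t.strip().lower() for t in match.group(1).split(",")]
        groups.append([t for t in terms if t])
    return groups


def expand(term, groups):
    """Return the term plus every term that shares a list with it."""
    term = term.lower()
    expanded = {term}
    for group in groups:
        if term in group:
            expanded.update(group)
    return expanded


def matches(doc, term, groups):
    """True if the document contains the term or any of its synonyms."""
    words = set(re.findall(r"[a-z]+", doc.lower()))
    return bool(words & expand(term, groups))


groups = parse_synonym_groups(CTL)
docs = [
    "Abe signed the charter",
    "Quarterly sales report",
    "Mike met the board",
]
hits = [d for d in docs if matches(d, "founders", groups)]
print(hits)
```

Comparing a list like hits against what K2 actually returns is a quick way to spot a knowledge base that was not picked up, for example because of a path separator mistake.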