An old support adage says: "When you're sure everything is right, and it still doesn't work, something you are sure of is wrong."
Sometimes you'll find that a new program you are writing using the VSearch object in K2 is just not working the way you expect. Either it's not finding your Java classes, or the server isn't responding, and you'd really like to know what parameters K2 is using when it initiates the VSearch object.
The Verity K2 Dashboard is a great way to manage your K2 installation, but sometimes you just need to know one small detail and you don't need to go to the length of starting the dashboard, giving the password, and poking through menus to find the nugget you needed to know.
This month's question comes from a customer who switched from a Unix platform to Windows 2000 for a Verity K2 5.5 installation and ran into an unexpected problem: the scripts that had worked fine with Vspider on Unix no longer worked on Windows!
We search customer support problem reports with Verity K2. When we recently upgraded to K2 6.1 we could no longer find some of the status messages in the call logs. These messages are often hexadecimal numbers of the form 0xFF0305. What's up?
The rcvdk and rck2 utilities are great tools.
rcvdk is the older tool, and is used to open collections directly. It lets you search collections and browse results to confirm that collections are being built the way you think they are. If you are writing code to access Verity collections programmatically, rcvdk lets you cross-check the results you see with your code so you can ensure your code is working right. You run rcvdk from the Verity ~bin directory on the machine where the collections reside.
Do you ever wonder what data is actually stored in your Verity K2 collections? Oh, sure, you probably have a list of the documents you submitted for indexing; and if you're like a few of the customers we work with, you even analyze your index logs to determine which documents did not index successfully. And you may even know why they didn't index. But seriously: do you know exactly what content you have in all of the fields of your collections?
What is query cooking and why would you want to use it?
Simply stated, query cooking is rewriting a query the user entered before running it. There are many reasons why you may want to do this. You could modify the query so that hits in more relevant fields are displayed first in the returned results. Security could be added to the query. You may want to optimize the query so it runs faster and uses fewer resources. You could also tune the query so that the results are what the user really wanted.
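As a rough illustration of the idea, the sketch below cooks a query in two of the ways just mentioned: it adds a clause intended to favor hits in a more relevant zone, and it appends a security filter. The zone name `title`, the field name `acl_group`, and the function itself are hypothetical, and the operator spellings (`<AND>`, `<OR>`, `<IN>`, `<CONTAINS>`) are written from memory of the Verity query language, so check them against your query language reference. Whether the extra zone clause really moves those hits higher depends on how your engine scores results.

```python
def cook_query(user_query, user_groups):
    """Rewrite a raw user query before it is sent to the search engine.

    Assumptions (not from the article): a zone named "title", a field
    named "acl_group", and the operator spellings used below.
    """
    terms = user_query.split()
    if not terms:
        return ""

    # Require every word the user typed.
    body = " <AND> ".join(terms)

    # Add a clause that matches the same words inside the title zone,
    # so documents matching there have a second way to score.
    boosted = f"(({body}) <IN> title) <OR> ({body})"

    # With no groups supplied, skip the security clause entirely.
    if not user_groups:
        return boosted

    # Restrict results to documents visible to one of the user's groups.
    security = " <OR> ".join(f"acl_group <CONTAINS> {g}" for g in user_groups)
    return f"({boosted}) <AND> ({security})"


if __name__ == "__main__":
    print(cook_query("printer error", ["support"]))
```

Keeping the rewrite in one small function makes it easy to log both the original and the cooked query, which helps when a user reports results they did not expect.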
Verity has released its Extractor product to provide some really interesting capabilities such as identifying and extracting content into specific fields and zones. Some old timers with Verity remember the 'TDE' capability that provided similar although less sophisticated capabilities years ago. For those of you who have forgotten about TDE, and for those of you who have started using Verity in the last few years and have not used it, you'll be glad to hear the TDE capability is still included in K2; the Collection Reference Guide still documents the capabilities even in K2 6.0.
The bad news is that none of the sample style sets use TDE at all. The good news is that if you want to try a "poor man's entity extraction", this article will show you how.
If you’ve been monitoring the search activity on your web site, you’ve probably noticed that there are two kinds of searchers: those who use one or two words to locate a particular document, and those who use very long queries to try to home in on exactly the document they want. Unfortunately, neither type of query is very useful in really locating answers.
Luckily, most enterprise search technology allows you to pre-process user queries before they get to your search engine, and by intelligently processing user queries, you can greatly improve the chances that the right document will show up near the top of the result list.
The Verity K2 engine supports a user-defined thesaurus capability that lets you define synonyms for specific terms. Once it is enabled, a search that uses the thesaurus operator expands to include documents that contain the synonyms defined in the custom thesaurus.
We removed the Windows domain user name that had been managing the Verity K2 console and now we cannot access the console. How can I reset the password without reinstalling K2?
In previous versions of K2, we could rebuild a collection from scratch simply by taking the collection off-line, running mkvdk with the "-purge" option, and bringing the collection back online again.
We're now using the K2 Spider; and when we use our old scripts, the collection does come up empty; but the K2 spider seems to have its own database of indexed documents. When we restart the spidering job, no new documents are indexed.
How can we reset the spider to tell it to spider everything again?
Using a custom thesaurus in Verity K2 is a powerful way to provide relevant results for your users, especially when your organization has a specialized vocabulary that your end users may not know well. The problem is that you either need to count on your users knowing enough to use the thesaurus operator in their queries, or you need to do your own query tweaking to expand your user queries. The former is not likely, and the latter adds to your query processing.
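To make the "do your own query tweaking" option concrete, here is a minimal sketch that expands each word of a user query with synonyms from a lookup table before the query is submitted. The synonym table, the function name, and the `<OR>`/`<AND>` operator spellings are assumptions for illustration only; this is not how K2 applies its own thesaurus.

```python
# Hypothetical in-house vocabulary; in practice this would be loaded
# from wherever you maintain your custom synonym list.
SYNONYMS = {
    "laptop": ["notebook", "portable"],
    "error": ["fault"],
}


def expand_query(user_query, synonyms=SYNONYMS):
    """Expand each query word with its synonyms, if any are defined."""
    clauses = []
    for term in user_query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        if len(alternatives) == 1:
            # No synonyms known: pass the word through unchanged.
            clauses.append(term)
        else:
            # Any one of the alternatives may satisfy this part of the query.
            clauses.append("(" + " <OR> ".join(alternatives) + ")")
    # Every original word (or one of its synonyms) must be present.
    return " <AND> ".join(clauses)


if __name__ == "__main__":
    print(expand_query("Laptop battery"))
```

The extra processing is a dictionary lookup per word, so the cost on the application side is small; the larger cost is usually the longer query the engine then has to evaluate.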
Our web content includes links to other parts of our company as well as to external partner web sites. To get full control over what gets indexed for search on our site, we have decided to crawl our file system rather than try to set spider rules and depths that vary by which site we index. How can we assign a display URL in K2 so that clicking on a link jumps to the right page, even though the Verity K2 VdkVgwKey is a fully qualified file name?
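One way to picture the mapping the question asks for is the sketch below, which turns a fully qualified file name into a display URL by swapping the document root for the site's base URL. It only illustrates the path arithmetic; the function name, the example paths, and the base URL are hypothetical, and it says nothing about how K2 itself stores or populates a URL field.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def display_url(file_key, doc_root, base_url):
    """Map a fully qualified file name to the URL that serves that file.

    Raises ValueError if file_key is not located under doc_root.
    """
    # Path of the file relative to the web server's document root.
    relative = PurePosixPath(file_key).relative_to(doc_root)
    # Percent-encode characters such as spaces, keeping the slashes.
    return base_url.rstrip("/") + "/" + quote(relative.as_posix())


if __name__ == "__main__":
    print(display_url("/var/www/site/products/index.html",
                      "/var/www/site",
                      "http://www.example.com/"))
```

On Windows, where the key would use drive letters and backslashes, `PureWindowsPath` could be substituted for `PurePosixPath`; `as_posix()` still yields forward slashes for the URL.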
When we work with new customers, one of the first tools we tend to use is the powerful K2 utility rcvdk. Sometimes customers are amazed that in this day of graphical dashboards, Java, and JSP we continue to rely on a command line retrieval client. Sure enough, over time, our clients start to rely on rcvdk as well. Why?
The rcvdk tool provides direct low level command line access to a K2 collection at the file system level. It opens the collection directly, not through brokers, servers, or ticket servers, and it offers the best way to confirm that, at the lowest possible level, a collection is valid.
The Verity K2 API supports both ASP and JSP scripting because the low-level Java calls are wrapped as COM objects. This lets developers access the various objects - VSearch being among the most useful - to perform searches and display results...
NIE Enterprise Search: Issue 3 - July, 2003
If your company owns Verity K2 enterprise search, one of the features for high availability and load-balancing situations is called collection mirroring, which enables identical collections to be accessed on two Verity K2 servers simultaneously. To mirror collections you need a minimum of two K2 servers in your search engine architecture. The collections must also be named identically on both servers in order to enable mirroring.
We're always looking for ways to make life easier for our customers who are often responsible for managing and maintaining large Verity installations with command line utilities and with web management interfaces that, while easier to use, don't always provide the data that IT staffs require.
It's good to use the browser based Dashboard when you can, but if you are a hard core command line person, help is on the way.
This month a reader wanted to know if it was possible to use Java Ant, the Apache Project open source build tool, to manage Verity K2 collections. A full production-quality Java Ant script is beyond the scope of Dr. Search, although perhaps in the near future one of his associates will provide a script that you would be proud to put into production. However, in the meantime this should serve as a starting point for many of you.
This month Clinton gives us a Quick Start for building a K2 collection with the ODBC driver and Vspider. K2 is currently sold by Autonomy, though some people still refer to it as Verity K2.
Improving results list relevancy can greatly reduce visitor frustration and improve overall visitor retention. A well-tuned search engine can also reduce calls to customer service and support, as visitors are able to find answers to questions on their own.