Command Line Verity K2: rcvdk

Last Updated Mar 2009

By: Mark Bennett & Miles Kehoe - Volume 3 Number 4 - Summer 2006

When we work with new customers, one of the first tools we tend to use is the powerful K2 utility rcvdk. Sometimes customers are amazed that, in this day of graphical dashboards, Java and JSP, we continue to rely on a command line retrieval client. Sure enough, over time, our clients start to rely on rcvdk as well. Why?

The rcvdk tool provides direct, low-level command line access to a K2 collection at the file system level. It opens the collection directly, not through brokers, servers, or ticket servers, and it offers the best way to confirm that a collection is valid at the lowest possible level.

You can also use rcvdk as a tool to understand how query tuning will impact your results. You can enter any valid VQL query, and rcvdk will show you the relevance scores that K2 will create for you when you move upstream into the server/broker environment. And of course, until you actually ask to view the documents, any ticket server security will not hinder you as you experiment to create the best possible results ranking algorithms for your site.

Besides being a great tool for verifying collection contents and testing relevance, rcvdk is incredibly easy to script and to use in automated collection 'sanity checks', whether you have updated existing content or added new content to your collection. You can redirect input from a file into rcvdk and redirect output to a file or, even better, into a Perl script that parses it for success.
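A sanity check of this kind might be sketched as follows. The sketch uses Python rather than Perl, purely as an illustration. It assumes rcvdk is on your PATH and reads its commands from standard input, as it does when you redirect a file into it. The function names are ours, and the success string is the attach message rcvdk prints when it opens a collection.

```python
import subprocess


def build_session(queries):
    """Return the text to feed rcvdk on stdin: one 's' search per query, then 'quit'."""
    lines = ["s " + query for query in queries]
    lines.append("quit")
    return "\n".join(lines) + "\n"


def attached_ok(output):
    """Return True if the captured output contains rcvdk's attach success message."""
    return "Successfully attached" in output


def run_rcvdk(collection, commands, exe="rcvdk"):
    """Run rcvdk against a collection and return everything it printed.

    Assumption: rcvdk is on PATH and takes its commands from standard input.
    """
    proc = subprocess.run(
        [exe, collection],
        input=commands,
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.stdout
```

Calling run_rcvdk("niedocs", build_session(["search track"])) from the directory that holds the collection would return the session output, which attached_ok can then check before you look at any search counts.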

Just Enough rcvdk

In this article, we want to show you the most useful rcvdk options and commands so you can begin using this great tool in your collection management. You can find a simple example of rcvdk in a previous article, Rediscover the Poor Man's Entity Extraction in K2, in the Fall 2005 issue of Enterprise Search.

Starting rcvdk

Because rcvdk is a command line utility, start by opening a telnet/ssh session to your Unix/Linux system, or a CMD command window on Windows. In both environments, the K2 binary directory must be in your PATH; on Unix/Linux, your LD_LIBRARY_PATH must include the binary directory as well.

Change to the directory that has a K2 collection, and start rcvdk:

D:\colls>rcvdk niedocs
rcvdk Verity, Inc. Version 5.5.0
Attaching to collection: niedocs
Successfully attached to 1 collection.
Type 'help' for a list of commands.
RC>

Depending on how your collection was built, you may find you need to specify the locale. Fortunately, rcvdk tells you which locale to specify in its error message:

D:\colls>rcvdk niedocs
rcvdk Verity, Inc. Version 5.5.0
Attaching to collection: niedocs
>> Error E3-0036 (VDK): Collection's locale(englishx) not compatible with
Session's locale(uni)
Error attaching to collection: niedocs
Type 'help' for a list of commands.
RC> quit
D:\colls>rcvdk -locale englishx niedocs
rcvdk Verity, Inc. Version 5.5.0
Attaching to collection: niedocs
Successfully attached to 1 collection.
Type 'help' for a list of commands.
RC>
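If you script rcvdk, the same recovery can be automated. The following is a minimal Python sketch, assuming the error text has the form shown above; the function names are hypothetical, and the only rcvdk option used is the -locale option from the session above.

```python
import re

# Matches the locale named in rcvdk's "not compatible" error message.
LOCALE_ERROR = re.compile(r"Collection's locale\((\w+)\)\s+not compatible")


def required_locale(output):
    """Return the collection's locale named in the error, or None if no error is present."""
    match = LOCALE_ERROR.search(output)
    return match.group(1) if match else None


def retry_command(collection, output, exe="rcvdk"):
    """Build the argument list for a retry with -locale, or None if no retry is needed."""
    locale = required_locale(output)
    if locale is None:
        return None
    return [exe, "-locale", locale, collection]
```

The returned list is suitable for passing to subprocess.run, so a wrapper script can attach once, inspect the output, and attach again with the right locale.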

Once you've started the program, you can enter 'help' or '?' for a list of valid commands. Note that there are two modes of rcvdk - Novice (the default) and Expert. The commands you see in the help screen depend on which mode you are in. Personally, I almost always use Expert mode because I invariably need one or two of the expert commands and I find it less of a hassle to turn it on at the beginning of my sessions. Use the 'x' command to toggle between Expert and Novice modes:

Novice Mode Commands
Available commands:
search     s  Search documents.
results    r  Display search results.
clusters   c  Display clustered search results.
view       v  View document.
summarize  z  Summarize documents.
attach     a  Attach to one or more collections.
detach     d  Detach from one or more collections.
quit       q  Leave application.
about         Display VDK 'About' info.
help       ?  Display help text; 'help help' for details.
expert     x  Toggle expert mode on/off.
user       u  Set user. username[:password][:domain][:mailbox]

Expert Mode Adds the following commands:
source        Set default source.
sort          Set default sort.
disable       Disable/enable collections.
debug         Toggle internal debug flag.
fields        Set fields to display.
highlight     Set highlight display.
hlmode        Toggle index-based/stream-based highlighting.
markup        Toggle markup display on/off.
qparser       Select/List K2 Query Parsers.
history       Show query history.
precision     Set score precision.
time       t  Toggle display of search execution time.
checkid       Check document access with a list of VdkDocIDs.
checkkey      Check document access with single or a list of VdkDocKeys.
pbs           Configure passage-based summary.

There are a number of these commands you may never use; let's look at the ones you'll want to know from the start.

Searching and Viewing Results

Typically you will use rcvdk to confirm the number of results. Use the 's' command to search; and the 'r' command to review the results:

RC> s
Search update: finished (100%). Retrieved: 500(4735)/4735.
RC> s search track
Search update: finished (100%). Retrieved: 47(47)/4735.
RC> r
Retrieved: 47(47)/4735
Number SCORE VdkVgwKey
1: 0.9771 /Newidea/Thumb/NIE/pdf/Search_Tracking.pdf
2: 0.9771 /Newidea/Marketing/Web Site/NIE/pdf/Search_Tracking.pdf
3: 0.9771 /Newidea/Marketing/Web Site/Arc/2005_Dec_31/NIE/pdf/Search_T
4: 0.8169 /Newidea/Dev/niesrv126/README.txt
5: 0.7967 /Newidea/Thumb/NIE/pdf/Search_Tuning.pdf
6: 0.7967 /Newidea/Marketing/Web Site/NIE/pdf/Search_Tuning.pdf

The numbers rcvdk reports in this second search tell us the search is complete (100%); and that rcvdk has retrieved 47 documents out of 47 that meet the search criteria; and that there are 4735 documents indexed in the collection. The first null search illustrates that rcvdk never returns more than 500 documents no matter how many the collection has.
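For scripted checks, those three numbers can be pulled out of the status line. Here is a minimal Python sketch, assuming the line always has the Retrieved: N(M)/T form shown above; the names Counts, parse_counts and was_truncated are ours.

```python
import re
from typing import NamedTuple, Optional


class Counts(NamedTuple):
    retrieved: int  # documents rcvdk actually returned
    matched: int    # documents that meet the search criteria
    indexed: int    # documents indexed in the collection


RETRIEVED = re.compile(r"Retrieved:\s*(\d+)\((\d+)\)/(\d+)")


def parse_counts(line: str) -> Optional[Counts]:
    """Extract the three counts from an rcvdk status line, or return None."""
    match = RETRIEVED.search(line)
    if match is None:
        return None
    return Counts(*(int(group) for group in match.groups()))


def was_truncated(counts: Counts) -> bool:
    """True when more documents matched than rcvdk returned."""
    return counts.retrieved < counts.matched
```

A sanity check can then compare the indexed count against the number of documents you expect after an update, or flag queries whose matched count drops to zero.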

Seeing the VdkVgwKey is nice; but sometimes you want to see other field values. Enter Expert Mode and use the 'fields' command. For each field you want to see, you need to specify the K2 field name and the width of the display column:

RC> fields title 30 author 10
RC> r
Retrieved: 47(47)/4735
Number title author
1: Search Tracking Scarlet
2: Search Tracking Scarlet
3: Search Tracking Scarlet
4: Upcoming newsletter articles miles
5: Search Tuning Scarlet
6: Search Tuning Scarlet
7: Search Tuning Scarlet
8: datasheet.PDF Administra
9: Complete the behavioral pictur Theresa Md
10: PowerPoint Presentation Mark

Note that rcvdk will let you specify undefined fields, so before you decide your fields are not being populated correctly, be sure to check your spelling! The 'fields' setting applies to the current result list, so you generally do not need to perform a search again just to see different fields; simply enter 'r' again.

If you have more than 25 results, you can view results beyond the initial page by following the 'r' command with a numeric value to specify the (new) starting result to view.

RC> r 40
Retrieved: 500(4735)/4735
Number title author
40: Microsoft Word - PM Q3_final_e geramac
41: Fact Sheet Q3 Internet.xls RFlohr
42: FASTTaxExp.book dambrosio
43: If youÆre looking for a way to Kevin
44: SS-Price list.xls Kevin
45: Project_History.PDF Miles
46: Searchbutton Features Carl Grimm
47: PRIVACY STATEMENT? carol
48: Case Study Format Tracey
49: SEARCHBUTTON darshini
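When scripting, you can generate the sequence of 'r' commands needed to walk an entire result list. The Python sketch below assumes a page of 25 results, based on the figure mentioned above; treat page_size as a parameter to adjust if your session pages differently. The function name is hypothetical.

```python
def paging_commands(retrieved, page_size=25):
    """Return the 'r' commands that step through `retrieved` results.

    The first page uses a bare 'r'; later pages use 'r N', where N is the
    starting result number. page_size=25 is an assumption, not a documented value.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    commands = ["r"]
    start = page_size + 1
    while start <= retrieved:
        commands.append(f"r {start}")
        start += page_size
    return commands
```

For the 47-document result list shown earlier, this yields a bare 'r' followed by 'r 26'.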

Checking Search Syntax

One thing we use rcvdk for is to test the VQL queries we want to use when we work to improve results with 'query cooking'. The program accepts any valid VQL statement, so you can test the queries you want to use to normalize your relevancy curve for top queries. If you are going to modify the display fields, be sure to use the field name Score to display the relevance for each document.

RC> fields score 5 title 30 author 10
RC> s query tuning
Search update: finished (100%). Retrieved: 25(25)/4735.
RC> r
Retrieved: 25(25)/4735
Number Score Title Author
1: 0.835 Proposal for Stanford GSB Miles Kehoe
2: 0.816 Executive Summary Miles Kehoe
3: 0.796 Microsoft Word - Response_2006

RC> s ( [0.85]titletuning, [0.90] r
Retrieved: 12(12)/4735
Number Score Title Author
1: 0.900 miles
2: 0.850 Search Tuning Scarlet
3: 0.850 This article discusses relevan Miles Kehoe
4: 0.850 This article discusses relevan Miles Kehoe

You can see from the above results that you can fine-tune your weighting until you get a reasonable relevancy distribution, which generally provides better discrimination between the results.
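One rough way to compare score distributions between query variants is to collect the scores from the result listing and summarize them. The Python sketch below assumes that Score is the first displayed field, as in the listings above. The summary it computes (count, distinct values, spread) is our own illustration and not a K2 metric.

```python
import re

# A result row starts with its rank and a colon, followed by the score.
ROW = re.compile(r"^\s*(\d+):\s+(\d+\.\d+)\b")


def scores(lines):
    """Collect the score from each result row; header and status lines are skipped."""
    found = []
    for line in lines:
        match = ROW.match(line)
        if match:
            found.append(float(match.group(2)))
    return found


def summarize(values):
    """Summarize a score list: how many, how many distinct, and max minus min."""
    if not values:
        return {"count": 0, "distinct": 0, "spread": 0.0}
    return {
        "count": len(values),
        "distinct": len(set(values)),
        "spread": round(max(values) - min(values), 4),
    }
```

Many rows sharing only a few distinct scores suggests the query is not separating documents well, which may be a cue to adjust the weights and search again.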

In Summary

You have seen a few of the features of rcvdk, a very useful tool for verifying the contents of a K2 collection. You can go far beyond the simple capabilities we've shown here, including viewing with highlights, viewing dynamic summaries, and other powerful K2 capabilities - all from the comfort of your command line!

As always, if you have any questions about rcvdk, or any other search technical tasks, feel free to mail us any time!