Checking K2 Collection Health using WSH

Last Updated Mar 2009

By: Mark Bennett & Miles Kehoe. - Volume 3 Number 3 - Spring 2006

We're always looking for ways to make life easier for our customers, who are often responsible for managing and maintaining large Verity installations. They have command line utilities and web management interfaces to work with, but the web interfaces, while easier to use, don't always provide the data that IT staffs require.

In the past we have published Windows Script Host (WSH) scripts to perform simple K2 searches and to list K2 collections, but the operations staff at CaseShare Systems has built an entire set of K2 management scripts based on WSH. They've allowed us to write about one of their scripts, which reports on K2 collection health, and we expect to write about more of their script tools in the coming months.

CaseShare has done more with K2 than many of our customers, and is successfully indexing and searching tens of millions of documents as a key component of its business. As a longtime Verity customer, CaseShare decided to develop these administrative scripts, first used with K2 Version 4.5, for a few main reasons:

  • Growth in their customer base triggered:
    • A maintenance window that shrank to the point that they could not easily schedule traditional backup systems
    • Collections that grew far beyond original expectations
  • CaseShare needed to automate nearly all routine Verity administration so they could focus their efforts on running the facility

An Introduction to the Code

We took a script CaseShare calls verity-health and customized it to work in any arbitrary K2 installation. To do this, we effectively modified some of the subroutines and functions that were written to fit into the overall CaseShare system. They have some of the best developers we've worked with, so be assured that anything that looks odd is probably from our attempts to sanitize the system.

The report that the script generates is shown in Figure 1. The report provides useful information that is not easily available in one place using the standard Verity utilities.

Working ctl_ender.txt
+ Setting local drive d:\colls
+ Reading in control file d:\colls\tools\ctl_ender.txt
+ Parsing configuration from d:\colls\tools\ctl_ender.txt
+ Working in d:\colls\niedocs
0.23 GB
Server ender
2 ddd files
0 mrg files
2 did files
4 total files
calling mkvdk
+ Reading MKVDK Results
Last Squeeze Date: 0000000000
Number of Documents: 4735

Figure 1: Sample Report
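The file counts and size in Figure 1 come from the collection directory itself. As a rough illustration only, here is a Python sketch (not the CaseShare VBScript) of how such numbers could be gathered. The function names, the recursive walk, the 1024-based GB conversion, and the assumption that "total files" is the sum of the three listed extensions are ours, inferred from the sample report rather than taken from the original script.

```python
import os
from collections import Counter

# Extensions tallied in the sample report, in report order.
REPORT_EXTENSIONS = ("ddd", "mrg", "did")

def collection_stats(coll_dir):
    """Walk coll_dir and return (size_in_gb, counts per report extension)."""
    total_bytes = 0
    counts = Counter({ext: 0 for ext in REPORT_EXTENSIONS})
    for root, _dirs, files in os.walk(coll_dir):
        for name in files:
            # Every file contributes to the size (an assumption on our part).
            total_bytes += os.path.getsize(os.path.join(root, name))
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            if ext in counts:
                counts[ext] += 1
    return total_bytes / (1024 ** 3), counts

def format_stats(size_gb, counts):
    """Render the stats in the same shape as the Figure 1 report lines."""
    lines = ["%.2f GB" % size_gb]
    lines += ["%d %s files" % (counts[ext], ext) for ext in REPORT_EXTENSIONS]
    lines.append("%d total files" % sum(counts.values()))
    return lines
```

Calling format_stats(*collection_stats(r"d:\colls\niedocs")) would produce lines shaped like the size and file-count portion of the report, assuming the collection lives at that path.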

The report above combines information extracted directly from the collection directory with output from the standard mkvdk utility, run with the little-known -about flag. Figure 2 shows the invocation and its output, both of which the CaseShare script handles programmatically:

mkvdk.exe -collection niedocs -about -noindex -nooptimize 

mkvdk.exe - Verity, Inc. Version 5.5.0 (_nti40, Jan 21 2005)
Collection about resources:
Creation Date: 15-Aug-2005 08:59:47 pm
Modification Date: 11-May-2006 12:48:14 am
Last Purge Date: 0000000000
Last Squeeze Date: 0000000000
Number of Documents: 4735
Collection Creator: NEW IDEA ENGINEERING
Collection Name: niedocs
Collection Type:
Collection Description:
Locale Name: englishx
Charset: 1252
Country: US
Language: en
Dialect: US
Supplier:
Major Version: 1
Minor Version: 0
Copyright Notice: InXight linguistX wrapper © 1997 Verity, Inc. All rights reserved.
mkvdk.exe done

Figure 2: MKVDK 'About' Output
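The output in Figure 2 is a series of "Name: value" lines, which makes it easy to pick out fields such as Number of Documents. The following is a minimal Python sketch of that parsing step, not the CaseShare code; the function name parse_about is hypothetical, and it assumes the output always follows the layout shown in Figure 2.

```python
def parse_about(text):
    """Return a dict of the 'Name: value' lines in mkvdk -about output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so times like 08:59:47 stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields
```

Lines without a colon, such as the version banner and the closing "mkvdk.exe done", are skipped. Fields that are empty in the output, such as Collection Type, come back as empty strings.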

This script, like all of the tools CaseShare uses, uses control files to define key collection and server parameters. The production control files use several additional fields, but for the purpose of this example, we are using simplified files as shown in Figure 3.

ender;niedocs
ender;doc_master

Figure 3: Sample Control File ctl_ender.txt

The general format here is servername;collection. A single control file can contain one or more collections; the verity-health script as written supports a maximum of three control files, since CaseShare uses only three primary servers in this application. By convention, the script looks for and processes any file in the primary working directory that starts with ctl_.
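To make the control file convention concrete, here is a short Python sketch (again, not the CaseShare VBScript) that finds the ctl_ files in a working directory and reads the servername;collection pairs. The function names are hypothetical, and the sketch simply ignores any extra fields after the first two; it also does not enforce the three-file limit of the original script.

```python
import glob
import os

def parse_control_lines(lines):
    """Yield (server, collection) pairs from 'servername;collection' lines."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = [p.strip() for p in line.split(";")]
        # Extra fields, as used in production control files, are ignored here.
        if len(parts) >= 2 and parts[0] and parts[1]:
            yield parts[0], parts[1]

def read_control_files(work_dir):
    """Read every file in work_dir whose name starts with ctl_, in name order."""
    entries = []
    for path in sorted(glob.glob(os.path.join(work_dir, "ctl_*"))):
        with open(path) as handle:
            entries.extend(parse_control_lines(handle))
    return entries
```

With the control file from Figure 3 in the working directory, read_control_files would return the pairs ("ender", "niedocs") and ("ender", "doc_master").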

The Real Work

Enough of the preliminaries: the actual code is provided here.

The script uses an external file, mkvdk-about.cmd, to actually run mkvdk. You could probably call the program directly from within verity-health, but it was easier to pass parameters and let the operating system invoke the program. mkvdk-about.cmd is listed in Figure 4.

d:\verity\k2\_nti40\bin\mkvdk.exe -collection %1 -about -noindex -nooptimize > %2

Figure 4: mkvdk-about.cmd Script

The script, called from verity-health, accepts the collection name as its first parameter and the name of a logfile as its second, and redirects the mkvdk output to that logfile.
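For readers who would rather call mkvdk directly than go through a .cmd wrapper, the same invocation can be sketched in Python as follows. This is an illustration under our own assumptions, not part of the CaseShare tools: the function names are hypothetical, and the flags are exactly those in the mkvdk-about.cmd listing.

```python
import subprocess

def build_about_command(mkvdk_exe, collection):
    """Argument list equivalent to the mkvdk line in mkvdk-about.cmd."""
    return [mkvdk_exe, "-collection", collection,
            "-about", "-noindex", "-nooptimize"]

def run_about(mkvdk_exe, collection, log_path):
    """Run mkvdk and send its standard output to log_path, like '> %2'."""
    with open(log_path, "w") as log:
        result = subprocess.run(build_about_command(mkvdk_exe, collection),
                                stdout=log)
    return result.returncode
```

Passing the arguments as a list avoids quoting problems when a path or collection name contains spaces.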

Making it Work

To use the above tool, create a directory for the scripts and control files. Then:

  • Create a control file
  • Edit mkvdk-about.cmd to point to your Verity mkvdk executable
  • Edit the source code to map the correct collection drive/directory to the server names (Note: see the Mapped drive Assignment section near the top of the code listing)
  • Define the correct location for sMKVDKAbout for your system
  • Run the script and pass in the directory where the control files are located.

On our test system, we are using D:\colls\tools to store the following files:

ctl_bean.txt
ctl_ender.txt
mkvdk-about.cmd
verity-health.vbs

The logfiles directory will also be created there.

To run the script, enter the following command:

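cscript //nologo verity-health.vbs d:\colls\tools

You should see output like that in Figure 1 above. The //nologo option is optional; it suppresses the Microsoft header information. Note that cscript host options take a double slash; an argument with a single slash is passed through to the script instead.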

Note: You may see a 'Malicious Script Alert' depending on the security settings for your system. To be safe, confirm that the operation the script is performing is one you are comfortable with, then either allow the activity (once, or for the entire script) or disallow it until you have confirmed the operation is benign.

The logfile directory has a more complete dump of data than the script as shown actually uses. Browsing those logfiles may give you additional ideas for content to include in your report.

Summary

Windows Script Host is a tool many IT professionals use for administrative scripting in their corporate environments. The scripts listed here, thanks to CaseShare, may provide some information you want to review regularly in your environment.

As always, if you have any questions about the script, or about any other technical search tasks, feel free to email us any time!