Friday, July 30, 2010
Still using Verity K2? - You Are Not Alone!

Sure, there are newer engines on the market.  When Autonomy bought Verity, it was assumed most K2 customers would upgrade to the IDOL platform, and many have.  Autonomy even offers a hybrid version of K2 which includes the IDOL engine underneath.

But many sites are still using good old K2.  They have a perpetual license and the system is generally working fine.  With tight budgets, some are not so keen to change engines, not only because of license / upgrade fees, but also because of the time and expense of re-implementing a system that is already working!

The good news is that we understand!  We're vendor neutral and work with virtually every engine on the market today.  We can help you with K2 now, and any of the other engines you currently have.  And should you ever decide to upgrade, we could help with that as well.

NIE K2 Advantages:

  • Three of the senior NIE engineers used to work at Verity during the K2 days, in the 1990s and early 2000s.
  • We've helped dozens of clients with system upgrades, relevancy tuning and sticky K2 problems.
  • We have utilities and software libraries that read and write K2 bulk insert files (what we call "bif tweaking")
  • We've got wrappers for command line tools like mkvdk, vspider, rcvdk and rck2
  • We can also wrap rcadmin commands to automate deployment and check the results.
  • Knowledge of system tools including extract, didump and testqp.
  • In-depth understanding of VQL (the Verity Query Language)
  • UltraSpider / K2 integration and customization of Ultraseek and patches.py
  • Deep understanding of style files and their options.
  • We've written Topic OTL files (Verity's taxonomy format) from scratch, and even written conversion utilities.
  • We know K2 !!!

If you're still using K2 and need help, give us a call!

Archive of our Verity K2 / Autonomy K2 Articles
An old support adage says: "When you're sure everything is right, and it still doesn't work, something you are sure of is wrong." Sometimes you'll find that a new program you are writing using the VSearch object in K2 is just not working the way you expect. Either it's not finding your Java classes, or the server isn't responding, and you'd really like to know what parameters K2 is using when it initiates the VSearch object.
The Verity K2 Dashboard is a great way to manage your K2 installation, but sometimes you just need to know one small detail and you don't need to go to the length of starting the dashboard, giving the password, and poking through menus to find the nugget you needed to know.
This month's question comes from a customer who switched from a Unix platform to Windows 2000 for a K2 5.5 Verity installation and ran into an unexpected problem: the scripts that had worked fine with Vspider on Unix no longer worked on Windows!
We search customer support problem reports with Verity K2. When we recently upgraded to K2 6.1 we could no longer find some of the status messages in the call logs. These messages are often hexadecimal numbers of the form 0xFF0305. What's up?
The rcvdk and rck2 utilities are great tools. rcvdk is the older tool, and is used to directly open collections. It lets you search collections and browse results to confirm that collections are being built the way you think they are. If you are writing code to programmatically access Verity collections, rcvdk lets you cross-check the results you see with your code so you can ensure your code is working right. You run rcvdk from the Verity ~bin directory on the machine where the collections reside.
Do you ever wonder what data is actually stored in your Verity K2 collections? Oh, sure, you probably have a list of the documents which you submitted for indexing; and if you're like a few of the customers we work with, you even analyze your index logs to determine which documents did not successfully index. And you may even know why they didn't index. But seriously: do you know exactly what content you have in all of the fields of your collections?
What is query cooking and why would you want to use it? Simply stated, query cooking is rewriting a query the user entered before running it. There are many reasons why you may want to do this. You could modify the query so that hits in more relevant fields are displayed first in the returned results. Security could be added to the query. You may want to optimize the query so it runs faster and uses fewer resources. You could also tune the query so that the results are what the user really wanted.
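As a rough illustration of query cooking, here is a minimal Python sketch that rewrites a user's query before it is run. It prefers title hits and appends a security clause. The function name `cook_query`, the `title` zone, and the `acl` field are hypothetical, and the operator syntax is only modeled loosely on VQL's angle-bracket style, so check the exact operators and how they affect ranking against your own K2 documentation.

```python
import re


def cook_query(user_query, groups, title_zone="title", acl_field="acl"):
    """Rewrite a raw user query before it is sent to the search engine."""
    # Keep only word characters so user input cannot smuggle in operators.
    terms = re.findall(r"\w+", user_query)
    if not terms:
        return ""
    all_terms = " <AND> ".join(f'"{term}"' for term in terms)
    # First branch matches the title zone (hypothetical name); the second
    # branch still accepts matches anywhere in the document.
    relevance = f"(({all_terms}) <IN> {title_zone}) <OR> ({all_terms})"
    if not groups:
        return relevance
    # Security clause (hypothetical field): the document must be visible
    # to at least one of the groups the user belongs to.
    acl = " <OR> ".join(f'{acl_field} <CONTAINS> "{group}"' for group in groups)
    return f"({relevance}) <AND> ({acl})"
```

Calling `cook_query("budget report", ["sales"])` would return a single query string that combines the user's terms, the title preference, and the group restriction. The group names are assumed to come from your own authentication layer, never from user input.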
Verity has released its Extractor product to provide some really interesting capabilities such as identifying and extracting content into specific fields and zones. Some old timers with Verity remember the 'TDE' capability that provided similar although less sophisticated capabilities years ago. For those of you who have forgotten about TDE, and for those of you who have started using Verity in the last few years and have not used it, you'll be glad to hear the TDE capability is still included in K2; the Collection Reference Guide still documents the capabilities even in K2 6.0. The bad news is that none of the sample style sets use TDE at all. The good news is that if you want to try a "poor man's entity extraction", this article will show you how.
If you’ve been monitoring the search activity on your web site, you’ve probably noticed that there are two kinds of searcher: those that use one or two words to locate a particular document; and those that use very long queries to try to home in on exactly the document they want. Unfortunately, neither type of query is very useful in really locating answers. Luckily, most enterprise search technology allows you to pre-process user queries before they get to your search engine, and by intelligently processing user queries, you can greatly enhance the chances that the right document will show up near the top of the result list.
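One possible shape for that pre-processing step is sketched below in Python. It classifies a query as short or long and, for long queries, drops common filler words and caps the number of terms. The stop-word list, the two-word threshold, and the `max_terms` limit are illustrative assumptions, not values taken from any Verity product.

```python
import re

# Illustrative stop-word list; a real deployment would tune this.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "on", "for",
     "how", "do", "i", "is", "my", "can", "what"}
)


def preprocess(query, max_terms=6):
    """Classify a query as short or long and trim long ones."""
    words = re.findall(r"\w+", query.lower())
    if len(words) <= 2:
        # Short queries are passed through; the caller might broaden them.
        return {"kind": "short", "terms": words}
    # Long queries: drop filler words, but never end up with nothing.
    kept = [word for word in words if word not in STOP_WORDS] or words
    # Remove duplicates while preserving the original order.
    unique = list(dict.fromkeys(kept))
    return {"kind": "long", "terms": unique[:max_terms]}
```

With this sketch, a question such as "How do I reset the password for the K2 console" is reduced to the terms `reset`, `password`, `k2`, and `console` before it reaches the engine.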
The Verity K2 engine supports a user-defined thesaurus capability that allows users to define synonyms for specific terms. Once enabled, a search that uses the <THESAURUS> operator will expand to include documents that contain the synonyms defined in the custom thesaurus.
We removed the Windows domain user name that had been managing the Verity K2 console and now we cannot access the console. How can I reset the password without reinstalling K2?
In previous versions of K2, we could rebuild a collection from scratch simply by taking the collection off-line, running mkvdk with the "-purge" option, and bringing the collection back online again. We're now using the K2 Spider; and when we use our old scripts, the collection does come up empty; but the K2 spider seems to have its own database of indexed documents. When we restart the spidering job, no new documents are indexed. How can we reset the spider to tell it to spider everything again?
Using a custom thesaurus in Verity K2 is a powerful way to provide relevant results for your users, especially when you have a specialized vocabulary in your organization that your end users may not know well. The problem is that you either need to count on your users knowing enough to use the <THESAURUS> operator on their queries; or you need to do your own query tweaking to expand your user queries. The former is not likely, and the latter adds to your query processing.
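To make the "do your own query tweaking" option concrete, here is a minimal Python sketch that expands each query word with synonyms from an application-side dictionary. The `expand_query` function and the in-memory `thesaurus` mapping are hypothetical, and this approach does the expansion in your own code rather than through the engine's thesaurus. The output uses VQL-style `<OR>` and `<AND>` operators as an assumption about the target syntax.

```python
def expand_query(query, thesaurus):
    """Expand each query word with synonyms from an in-memory mapping."""
    parts = []
    for word in query.split():
        synonyms = thesaurus.get(word.lower(), [])
        if synonyms:
            # Any of the original word or its synonyms may match.
            alternatives = " <OR> ".join(f'"{w}"' for w in [word, *synonyms])
            parts.append(f"({alternatives})")
        else:
            parts.append(f'"{word}"')
    # Every original word (or one of its synonyms) must be present.
    return " <AND> ".join(parts)
```

For example, with `{"laptop": ["notebook"]}` as the mapping, the query `laptop battery` becomes `("laptop" <OR> "notebook") <AND> "battery"`. This is the extra query processing the paragraph above refers to.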
Our web content includes links to other parts of our company as well as to external partner web sites. To get full control over what gets indexed for search on our site, we have decided to crawl our file system rather than try to set spider rules and depths that vary by which site we index. How can we assign a display URL in K2 so clicking on a link jumps to the right page, even though the Verity K2 VdkVgwKey is a fully qualified file name?
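The underlying mapping from a file-system key to a web address can be sketched in a few lines of Python. This is only an illustration of the idea: the `display_url` function, the document root, and the base URL are hypothetical, forward-slash paths are assumed, and how the resulting URL is stored in the collection or applied at display time is not covered here.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def display_url(file_key, doc_root, base_url):
    """Map a fully qualified file name to the URL that serves it."""
    # relative_to raises ValueError if the file is outside the document root.
    relative = PurePosixPath(file_key).relative_to(doc_root)
    # Percent-encode spaces and other unsafe characters, keeping slashes.
    return base_url.rstrip("/") + "/" + quote(relative.as_posix())
```

With a document root of `/var/www/site` and a base URL of `http://www.example.com/`, the key `/var/www/site/products/index.html` maps to `http://www.example.com/products/index.html`.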
When we work with new customers, one of the first tools we tend to use is the powerful K2 utility rcvdk. Sometimes, customers are amazed that in this day of graphical dashboards, Java, and JSP we continue to rely on a command line retrieval client. Sure enough, over time, our clients start to rely on rcvdk as well. Why? The rcvdk tool provides direct low level command line access to a K2 collection at the file system level. It opens the collection directly, not through brokers, servers, or ticket servers, and it offers the best way to confirm that, at the lowest possible level, a collection is valid.
The Verity K2 API supports both ASP and JSP scripting because the low-level Java calls are wrapped as COM objects. This lets developers access the various objects - VSearch being among the most useful - to perform searches and display results...
NIE Enterprise Search, Issue 3 (July 2003): If your company owns Verity K2 enterprise search, one of the features for high availability and load-balancing situations is called collection mirroring, which enables identical collections to be accessed on two Verity K2 servers simultaneously. To mirror collections you need a minimum of two K2 servers in your search engine architecture. The collections must also be named identically on both servers in order to enable mirroring.
We're always looking for ways to make life easier for our customers who are often responsible for managing and maintaining large Verity installations with command line utilities and with web management interfaces that, while easier to use, don't always provide the data that IT staffs require.
It's good to use the browser-based Dashboard when you can, but if you are a hard-core command line person, help is on the way.
This month a reader wanted to know if it was possible to use Java Ant, the Apache Project open source build tool, to manage Verity K2 collections. A full production-quality Java Ant script is beyond the scope of Dr. Search, although perhaps in the near future one of his associates will provide a script that you would be proud to put into production. However, in the meantime this should serve as a starting point for many of you.
This month Clinton gives us a Quick Start for building a K2 collection with the ODBC driver and Vspider. K2 is currently sold by Autonomy, though some people still refer to it as Verity K2.
Improving results list relevancy can greatly reduce visitor frustration and improve overall visitor retention. A well-tuned search engine can also reduce calls to customer service and support, as visitors are able to find answers to questions on their own.
Copyright 1996-2009 by New Idea Engineering, Inc.