Multiple Department Level Search Applications


Serving Multiple Department Level Search Applications with a Single Enterprise Search Engine

Often departments and divisions of larger companies need a custom search application for their knowledge workers or customers and partners.

If you already have an enterprise class search engine you may not realize that you don't need to buy a new search tool to easily implement these custom solutions. Each group can have a powerful, customized search solution without the need to buy and maintain additional software. This is usually preferable to deploying separate search applications, which can become a logistical nightmare to maintain for busy IT staff.

This provides the additional benefit that search activity reporting can be more precise, and help to easily identify important department or customer specific trends. And of course the data is still accessible to other business units, since it still resides in the central search repositories.

Outline of the Process

What makes a department-specific search application?

They have their own data that needs to be searchable
They want customized search forms specific to the searches their users need to perform
They need a results list that presents detailed, relevant data
They may need to tweak the search results in certain ways
They need control over their content
They want department-specific reporting
They may have special integration needs in order to access certain data repositories
They may have specific security requirements

Though this list may seem a bit long, no single item is particularly complex. Most enterprise class solutions provide tools to address most of these areas. We will examine them one at a time, and show how simple each one might be to accomplish. The end result of implementing this list can be quite striking and positive.

Search-Enabling Department-Specific Data

If the department has file-oriented content, then create a networked directory for their content to reside on. One idea is to create this directory on a central content server, perhaps on the same server as the search engine software, and give that group permission to manage that directory. Alternatively, have the group maintain their content on their own server, and then periodically replicate the files over. The Windows Resource Kit has a utility called robocopy that can be very helpful for synchronizing directories. Just leaving the data on their servers, and accessing it via UNC paths, can be problematic: servers tend to go up and down or have permissions reset. Ideally the raw content that is indexed will be on the same machine as the search engine collection that has been created from it.
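As a rough illustration of the periodic replication idea, here is a minimal Python sketch of a one-way copy that transfers only new or changed files. The function name `mirror_new_files` and its behavior are assumptions made for this example; a dedicated utility such as robocopy handles deletions, retries and permissions far more thoroughly.

```python
import shutil
from pathlib import Path


def mirror_new_files(source: Path, dest: Path) -> list[str]:
    """Copy files that are missing or older in dest; return the relative paths copied."""
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        target = dest / rel
        # Copy when the target is missing or the source was modified more recently.
        if not target.exists() or src_file.stat().st_mtime > target.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)  # copy2 also copies the modification time
            copied.append(rel.as_posix())
    return copied
```

Run from a scheduled job, a second pass over an unchanged source directory copies nothing, so the search engine only sees files that really changed.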

Then point the search engine at this new content area. Non-HTML content might be best served by indexing it with a file-system based indexer. If the content is heavily linked, then a virtual web directory can be created, and the search engine can index it via HTTP. Specify that this is a new "collection" (or "catalog" or "index", depending on your search engine's vocabulary), and give this new collection a meaningful name. If the content is not particularly sensitive, then you might want to include it by default in company-wide searches; otherwise, it may need special security settings.

By giving each department their own directory tree, they maintain control over their content. You may want to proactively give all content owners a document explaining your policies for the look and feel of documents, along with meta-tagging guidelines.

Providing Department-Specific Search Forms

Most modern search engines are Web based. As such, they provide a CGI interface for searching, using the HTML <form> tag. This means that different HTML forms can point to the same search engine! Therefore, a department can have what looks like a special search engine just for their group. Its form can have special instructions and search fields, and can link to department-specific search tips and sources of assistance.

To link department specific HTML forms to that department's specific content, hidden fields can be used to limit the scope of search to just their data. Or, for more advanced users, check boxes can be used, one check box for each section of data. By default, the check boxes for that department's data can be selected, whereas the check boxes for broader company data can be un-checked by default. Another solution is to provide radio buttons to say "(*) Search our department's data only, or ( ) Search all company data".

Sometimes check boxes will not have a one-to-one correspondence with indexed collections, so some server side scripting may be needed to map user selections into a set of specific physical collections. Most search engines provide a "pre-search filter" or have some other way for a query to be "tweaked" before it is sent to the actual underlying search engine kernel.
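To make the mapping step concrete, here is a minimal Python sketch of the kind of server-side logic a pre-search filter might contain. The check box values, the collection names and the function `resolve_collections` are all hypothetical; how such a filter is actually hooked in depends on your search engine.

```python
# Hypothetical mapping from check box values to physical collection names.
CHECKBOX_TO_COLLECTIONS = {
    "support": ["calltrack", "bugs"],
    "marketing": ["press_releases", "collateral"],
    "company": ["intranet"],
}


def resolve_collections(selected: list[str]) -> list[str]:
    """Expand check box values into an ordered list of collections without duplicates."""
    collections: list[str] = []
    for choice in selected:
        # Unknown values are ignored rather than passed through to the engine.
        for name in CHECKBOX_TO_COLLECTIONS.get(choice, []):
            if name not in collections:
                collections.append(name)
    return collections
```

With this table, a user who ticks "support" and "company" would have the query sent to the calltrack, bugs and intranet collections.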

A sample form for Tech Support, using a fictitious search engine, might look like:

<html>
<head>
<title>Tech Support Search Form</title>
</head>
<body>
<h1>Tech Support Search Form</h1>
<!-- …. Introductory text and graphics …. -->
<form method="POST" action="http://search.ourcorp.com/cgi/do_search">
Enter your search:<br>
<input type="text" name="query">
<p>
Where to look:<br>
<input type="checkbox" name="collection"
value="calltrack" checked>Call Tracking System<br>
<input type="checkbox" name="collection"
value="bugs" checked>Bug Tracking System<br>
<input type="checkbox" name="collection"
value="marketing">Press Releases<br>
<p>
<input type="submit" value=" Search ">
</form>
</body>
</html>

Notice that "Call Tracking" and "Bugs" are enabled by default, whereas Marketing's press releases are not included in the search by default.
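On the server side, a browser submits one `collection=value` pair for each checked box and omits the unchecked ones. The Python sketch below shows one way a script could read the sample form's `query` and `collection` fields from a URL-encoded request body. The function name `parse_search_form` is an assumption for illustration, and a real search engine would normally do this parsing for you.

```python
from urllib.parse import parse_qs


def parse_search_form(body: str) -> tuple[str, list[str]]:
    """Return (query, collections) from a URL-encoded form body."""
    fields = parse_qs(body)  # repeated fields become lists of values
    query = fields.get("query", [""])[0].strip()
    collections = fields.get("collection", [])
    return query, collections
```

If no box is checked, the returned collection list is empty, and the script can decide whether to fall back to the department defaults.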

More importantly, you can provide an advanced search form for Tech Support. The advanced form could have additional search fields, such as:

Previous Incident Number
Contact Name
Product
Date Range
Tech Support Rep

These fields might be very specific to Tech Support. Many of the fields might only exist in a data dump from a Call Tracking system. It would be inappropriate to put these fields on a company-wide search form.

But since Tech Support now has their own form, these fields can be added.

Of course, in order for these fields to be searchable they must be properly indexed by the search engine. Custom fields will usually require some additional configuration by the administrator in order to be searchable in this way.

Since this is such a powerful concept, I've outlined some examples of fields that other departments might like on their custom search form. Again, remember, behind the scenes we're still talking about just ONE search server.

Possible Sales Department Specific Form Fields:

Company Name
Sales Region
Sales Person
Sales Engineer
Last Purchase Date Range
Last Contact Date Range
Last Quote Date Range

Possible Engineering Specific Form Fields:

Bug Number
Submit Date Range
Last Updated
Dev Engineer
QA Engineer
Reported By
Urgency
Codebase
Priority
Status

Possible Marketing Specific Form Fields:

Release Date
Collateral Type
Source
Partner Name
Audience

The idea of mixing these types of custom fields with full text searches will be explored in more detail in a future article; this type of search is also referred to as a hybrid search.
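As a toy illustration of the hybrid idea, the Python sketch below combines exact matches on custom fields with a simple full text test over a list of in-memory records. The record layout and the function `hybrid_search` are invented for this example; a real engine would use its own index and query syntax, and would rank results rather than just filter them.

```python
def hybrid_search(records: list[dict], text_query: str, field_filters: dict) -> list[dict]:
    """Return records whose fields match exactly and whose text contains every query word."""
    words = text_query.lower().split()
    hits = []
    for record in records:
        # Field part: every requested field must match exactly.
        fields_ok = all(record.get(name) == value for name, value in field_filters.items())
        # Full text part: naive substring test for each query word.
        text = record.get("text", "").lower()
        text_ok = all(word in text for word in words)
        if fields_ok and text_ok:
            hits.append(record)
    return hits
```

A Tech Support search for "paper jam" limited to one product would pass the words as `text_query` and `{"product": ...}` as `field_filters`.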

Providing a Department Specific Results List

We normally advise clients to keep results lists and search forms fairly simple, including adequate white space. Results lists are often too cluttered and distracting. We believe the success of Google as a portal is due, in part, to its less cluttered appearance.

However, some knowledge workers at your company may do searches all day long. These power users may benefit from additional domain specific information in their results list. Since most search engines allow for the presentation of the same data in different formats at results time, different levels of detail can be shown to different classes of users.

Extending our previous example of a Tech Support engineer, if they were searching through old Tech Support calls, it might be very helpful to display the customer contact and company name in the results list. If a Tech Support person is trying to find a specific previous incident, this information will help them locate it much more quickly. In fact, many search engines even allow results lists to be re-sorted by the various fields that are displayed. So perhaps this Tech Support person could even sort by company name to locate the record they're after.

If that same Tech Support person was searching the bugs database, it might be nice to show the bug number, priority, bug status and last-updated-date in the result list. When a senior Tech Support person searches in a bugs database, they may be looking for a very specific set of bugs - this content-specific information can be very helpful.

Notice the duality of user experiences that is evolving. A casual corporate user can enter a search in a very simple one-box search form and see very basic results. A senior Tech Support engineer can compose a power search from an advanced search form, possibly setting a half dozen fields, and see very detailed results. Both users can be searching the same data, but with very different levels of detail.

The exact mechanics of presenting different result list formats from the same search engine vary from vendor to vendor. Some vendors allow a results template to be specified at search time, as a hidden field on the HTML form. Other vendors might allow different versions of an ASP search script to run, and each form would call a slightly different version of these ASP scripts, which would be set in the action attribute of the form tag. A third method might be to have a hidden field that simply states the source of the query. Subsequent code then keys off of this hidden value and renders the page differently using classic if/then/else logic. But however it is done, most engines are capable of this type of behavior.
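The third method can be sketched in a few lines of Python. This is only an illustration under assumed names: the hidden field values "techsupport" and "engineering", the hit fields, and the function `render_hit` are hypothetical, and a real implementation would produce HTML through the engine's own template mechanism.

```python
def render_hit(hit: dict, source: str) -> str:
    """Format one result line, keyed off the hidden 'source' field of the form."""
    if source == "techsupport":
        # Power users searching old calls see customer details.
        return f"{hit['title']} | {hit.get('company', '?')} | {hit.get('contact', '?')}"
    elif source == "engineering":
        # Bug searches show the bug number and status.
        return f"{hit['title']} | bug {hit.get('bug_number', '?')} | {hit.get('status', '?')}"
    else:
        # Casual users get the simple, uncluttered format.
        return hit["title"]
```

The same hit can therefore appear as a bare title on the company-wide results page and as a detailed line on the Tech Support page.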

Tweaking Search Results for Department Level Searches

Given the broad range of data indexed by an all-encompassing enterprise-wide search engine, the odds that a strictly full-text based approach applied to a one or two word query will bring the intended document to the top of the results list are slim. Don't blame your search engine vendor! To do this task "perfectly", without expert intervention, would almost require the engine to be psychic.

Below we present some strategies for tweaking your results list. Some of these are generic ideas, a few of which we covered in an earlier article; others are specific to departments. Describing each item in detail is beyond the scope of this article, and implementing it will be vendor specific, but this should at least give you some ideas. Also, please feel free to send us questions if you'd like clarification on any of these points.

Start by setting the default enabled and disabled data sources appropriately for each department. Users can always override them.
Make use of sorting. Most engines will allow you to sort by any arbitrary field. And you can even let the user re-sort as they view the results list. Some good candidate fields for sorting are data source and last-updated date. For departments that have an advanced search form with content-specific fields, consider adding those fields to the sorting options for that group.
Use weighting adjustments, instead of sorting, to provide a softer "boost" to certain documents. So perhaps a Tech Support person's search gives some preference to previous calls and bugs when ranking the documents, but does not force those documents to the top when they have only a small amount of full text query evidence.
If your search engine supports it, institute a plan for identifying domain experts for various areas of content, and allow those experts to help adjust the query results.
Consider using a third party results tuning solution. As an example, NIE's Search Tuning and Reporting Toolkit can provide very specific results list tweaking on a per department basis. It offers a wide range of presentation options, and works with virtually any search engine.
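The difference between sorting and a softer weighting boost can be shown with a short Python sketch. The boost factors, department name, hit layout and the function `rank` are assumptions for illustration only; real engines expose weighting through their own, vendor-specific settings.

```python
# Hypothetical per-department multipliers applied to the engine's relevance score.
DEPARTMENT_BOOSTS = {
    "techsupport": {"calltrack": 1.5, "bugs": 1.3},
}


def rank(hits: list[dict], department: str) -> list[dict]:
    """Order hits by relevance score, multiplied by any boost for the department."""
    boosts = DEPARTMENT_BOOSTS.get(department, {})

    def boosted_score(hit: dict) -> float:
        return hit["score"] * boosts.get(hit["collection"], 1.0)

    return sorted(hits, key=boosted_score, reverse=True)
```

A call record with a moderate score can move ahead of a press release for Tech Support, while a bug report with very little query evidence still stays near the bottom.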

Providing Department Specific Reporting

Search activity reporting is one of the most valuable and accurate types of behavior-based intelligence gathering available. It provides deep insight into exactly what users are looking for, and quickly identifies important business trends.

You say you don't currently have a search-activity specific reporting solution in place? Shame on you!!! You are throwing away some of the best data you could ever get your hands on. In all seriousness, if you do not have this type of system in place, please do contact us!

Assuming you do have a search activity reporting system in place, consider configuring it to track the origin of the search. In other words, which search form was the search submitted from? Being able to see what your business partners were searching for, vs. what Tech Support and Sales Engineers were searching for, can help identify issues in your sales and product lifecycles.

One easy way to track the source of a search is by adding a hidden field to your various search forms. If you provide a mini-search form at the top of your results lists, don't forget to tag that search form as well.
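Once each form carries such a tag, per-department reporting becomes a simple grouping exercise. The Python sketch below assumes a hypothetical log format in which each entry is a dictionary with a `query` and an optional `source` value taken from the hidden field; the function name `top_queries_by_source` is likewise invented for this example.

```python
from collections import Counter


def top_queries_by_source(log_entries: list[dict]) -> dict[str, Counter]:
    """Count normalized queries separately for each tagged search form."""
    counts: dict[str, Counter] = {}
    for entry in log_entries:
        # Searches from forms without a tag are grouped under "untagged".
        source = entry.get("source") or "untagged"
        query = entry["query"].strip().lower()
        counts.setdefault(source, Counter())[query] += 1
    return counts
```

Calling `most_common()` on one of the returned counters gives that group's most frequent searches.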

If your reporting solution does not support this direct type of form tagging, you might be able to use the HTTP "referer" field (please note the intentional misspelling of the word "referrer"; the dropped double-r is part of the legacy HTTP specification). Processing this information may take a little more effort; it will also be impacted by whether your HTML forms use the GET or POST methods.
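A combined approach might look like the Python sketch below: use the hidden field when it is present, and otherwise fall back to the path of the page named in the Referer header. The form paths, the field name `source` and the function `search_origin` are assumptions for illustration, and the Referer header may be missing or stripped, so the result can legitimately be "unknown".

```python
from urllib.parse import urlparse

# Hypothetical table of the pages that host each department's search form.
FORM_PATHS = {
    "/support/search.html": "techsupport",
    "/sales/search.html": "sales",
}


def search_origin(form_fields: dict, headers: dict) -> str:
    """Identify which search form a query came from."""
    tagged = form_fields.get("source")
    if tagged:
        return tagged  # an explicit hidden-field tag wins
    # Fall back to the page that submitted the form, ignoring host and query string.
    path = urlparse(headers.get("Referer", "")).path
    return FORM_PATHS.get(path, "unknown")
```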

A third approach might be to configure each department's search as a different "site ID". Many products allow sites to be subdivided into smaller virtual web sites, each with their own unique ID.

If your enterprise infrastructure supports consistent site-wide user logins, then tracking the user can indirectly lead to tracking the source of each search; the "source" can be attributed to the department that the user belongs to, or the sales region the customer or partner is in. This type of indirect inference may be less precise, and may be more complicated to set up.

NIE's Search Tracking and Reporting Toolkit has a number of options to track and report the origin of searches.

Department Specific Security Requirements

When data is specifically search enabled, the implied assumption is that it's useful and is intended to be found! We encourage our clients to think twice before implementing draconian and often complex security measures to lock down data in their full text database.

Do you trust your corporate firewall and user login systems? If so, might that be sufficient for securing the majority of your searchable data? If somebody can get on to your network, presumably they are a valid employee, and therefore shouldn't they be generally trusted to have access to most of the searchable data?

Of course there are some obvious exceptions. Financial and Human Resources data often require very careful access control. The extent to which data can be tightly secured to specific user logins is very vendor specific. Some vendors may not provide adequate security, and that may argue for a second search engine set up specifically for sensitive corporate data.
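Where the engine itself offers only coarse controls, one simple pattern is to restrict a few sensitive collections and leave everything else open. The Python sketch below illustrates that idea with invented collection and group names; the function `allowed_collections` is hypothetical, and it is not a substitute for whatever document-level security your vendor provides.

```python
# Hypothetical list of sensitive collections and the groups allowed to search them.
RESTRICTED = {
    "hr_records": {"hr"},
    "financials": {"finance", "executive"},
}


def allowed_collections(requested: list[str], user_groups: list[str]) -> list[str]:
    """Drop restricted collections the user may not search; keep everything else."""
    groups = set(user_groups)
    allowed = []
    for name in requested:
        required = RESTRICTED.get(name)
        # Unlisted collections are open to any logged-in user.
        if required is None or required & groups:
            allowed.append(name)
    return allowed
```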

At the other end of the spectrum, might this data be useful to customer facing employees? Senior customer facing employees may have very broad and unpredictable data needs in the pursuit of customer related tasks. And they usually have the additional burden of needing to respond quickly. Please at least consider giving these employees the broadest access possible, unless a demonstrated pattern of behavior would indicate otherwise. A good Tech Support person may have legitimate needs to occasionally look at Sales data. A good Presales Engineer will often want to see ALL relevant customer and technical data related to their accounts. We urge you to not artificially constrain or delay these power users.

There could be potential for abuse. Managers should be leveraging your search activity reports to spot potential abuse, along with other interesting department level trends.

Department Specific Integration

Some department level data does not reside in simple files that can easily be copied to a networked file system. Much customer-centric data is stored in databases, for example.

Most search vendors provide a "gateway" option to directly access, index, and search-enable database records.

With the ubiquity of web browsers, most database data is also available to users via a web browser. Since most search engines now ship with a "spider", and those spiders can index Web content by mimicking a web browser, database data can often be indexed by treating it as simple HTML content and using the regular spider. This can sometimes require special settings or tweaks to the web content engine and/or the spider settings, but is often a second viable option to consider.

Therefore, just because data is in a traditional SQL-style database, it is not precluded from being search enabled by your enterprise full-text search engine.
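To give a feel for what a database gateway does conceptually, here is a small Python sketch that turns rows from a SQL query into simple documents with both fielded data and a full text body. It uses SQLite only because it ships with Python; the document layout, the requirement that the query return an `id` column, and the function `rows_to_documents` are assumptions for illustration, not any vendor's actual gateway.

```python
import sqlite3


def rows_to_documents(conn: sqlite3.Connection, query: str) -> list[dict]:
    """Convert each row returned by the query into an indexable document."""
    conn.row_factory = sqlite3.Row  # lets us read columns by name
    documents = []
    for row in conn.execute(query):
        fields = dict(row)
        # Concatenate the column values to form a body for full text indexing.
        body = " ".join(str(value) for value in fields.values() if value is not None)
        documents.append({"id": str(fields["id"]), "fields": fields, "text": body})
    return documents
```

The `fields` part would feed the custom search fields discussed earlier, while `text` supports ordinary full text queries.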

Conclusion

A single enterprise class search engine can provide different search forms and result lists to different user groups. Each group can have a powerful, custom search solution without the need to buy and maintain additional software.

A secondary benefit is that search activity reporting can be more precise, and spot important department and customer specific trends.

I realize we've covered a lot of ground in this article, but as you look back you'll notice that no single item is particularly complex. It's when all of these individual items are combined that a very powerful design strategy emerges.

As always, if you have any questions on this material, or would like any assistance with implementing it, please don't hesitate to contact us. We love this stuff!