Log Your Raw Searches

Last Updated Dec 2008

By: Mark Bennett, New Idea Engineering, Inc., Volume 2 - Number 8 - Fall 2004

Mind Reading 101: The Search Analytics Corner

We've chosen a really easy topic for this inaugural installment of our new column. We'll start with the assumption that you are logging your search engine activity – you ARE logging it, aren't you!?

Seriously, if you are not currently capturing your search activity in a separate log, apart from your normal 'web log', please start doing so immediately. Trust us on this one; once you start doing this and running reports, you'll wonder how you ever got by without it. And NO, 'grep'ing through your regular web logs won't cut it; it doesn't give you all the statistics you'll need.

If you need help with this, or have any questions about your particular search analytics problems, please drop us an email.

OK, as we were saying, many companies have started the all-important practice of logging search activity; that's a good starting point.

Just remember that logging your activity isn't enough - you need to look at the logs and understand what they are telling you!

But we've noticed a trend: many companies store the modified or 'cooked' query in their log files instead of what the user actually typed in. They log what was eventually sent to the search engine, so the original query is lost!

Why is this important to check and fix?

The number one benefit of search analytics is the ability to peer directly into the mind of your web site visitor - your employees, your prospects, your customers. Instead of guessing what their intent was with those old click tracking methods and "how many milliseconds did they spend on this page" analytics, you have a much more accurate idea of what they were thinking, because they actually typed it for you.

But when you overwrite that with some 'transformed' query, you've obscured what they were looking for with extra gibberish added by your query transform / query cooking process. You've lost, or at least diluted, a priceless artifact.

Vocabulary: 'Cooked Query'

What is a 'cooked' query, you ask? Briefly, most search applications change the search that the user typed in before it is sent to the actual search engine. Web sites may add words or terms, filter out punctuation, or add parameters to adjust the relevancy. There are certainly valid reasons to do so; we've even written about this several times here in the newsletter. We're not telling you to stop cooking your queries; we're saying that you should also keep a copy of what was originally typed in.
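
To make the term concrete, here is a minimal Python sketch of a cooking step. Everything in it is an assumption for illustration: the function name `cook_query`, the punctuation rule, and the `AND site:` filter syntax are hypothetical, and your own search engine will use its own syntax.

```python
import re

def cook_query(raw_query: str, site_filter: str = "docs") -> str:
    """Hypothetical transform: strip punctuation, then append a filter term."""
    # Replace anything that is not a word character (letter, digit,
    # underscore) or whitespace with a space.
    cleaned = re.sub(r"[^\w\s]", " ", raw_query)
    # Collapse the runs of whitespace left behind by the substitution.
    cleaned = " ".join(cleaned.split())
    # Append an assumed filter clause; real engines use their own syntax.
    return f"({cleaned}) AND site:{site_filter}"

raw = "printer won't print!!"
cooked = cook_query(raw)
print(raw)     # what the visitor typed
print(cooked)  # what the engine receives
```

Notice how little the cooked string resembles what the visitor typed; the apostrophe and the exclamation marks, which hint at their frustration, are gone.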

How does this data get lost?

Search engines look at a particular variable name when a search is run to see what terms to look for. This variable will have a special name, which will vary depending on which search engine you have. Examples of these special names include 'q', 'qt', 'query', 'search' or 'querytext'. Most web sites grab the text in that variable, modify it, and then store it back into a variable with the same name, thus overwriting the original query.

The fix is pretty simple really: all we're suggesting is that, before you modify the query, you make a copy of it in another variable.

So for example, in pseudocode:

Get variable querytext from CGI input;
original_querytext = querytext;
querytext = query_transform_function( querytext );
Run the modified search using querytext;
Call search logging engine with original_querytext (and other results statistics);

The actual means to do all this is not only vendor specific, but also varies with your particular search framework.
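
As one possible rendering of the pseudocode above, here is a minimal Python sketch. It is not tied to any vendor: `transform_query` and `run_search` are hypothetical stand-ins for your own cooking step and search engine call, and the CSV log layout is simply an assumption.

```python
import csv
import io
from datetime import datetime, timezone

def transform_query(query: str) -> str:
    """Stand-in for your own query-cooking step (hypothetical)."""
    return f"({query.strip().lower()}) AND status:published"

def run_search(cooked_query: str) -> list:
    """Stand-in for the call to your search engine (hypothetical)."""
    return []  # a real implementation would return the engine's hits

def handle_search(params: dict, log_file) -> list:
    # Keep the user's text exactly as typed, before anything modifies it.
    original_query = params.get("querytext", "")
    # Cook a copy; never write the result back over the original.
    cooked_query = transform_query(original_query)
    results = run_search(cooked_query)
    # Log the raw query, with the cooked form and hit count beside it.
    csv.writer(log_file).writerow([
        datetime.now(timezone.utc).isoformat(),
        original_query,
        cooked_query,
        len(results),
    ])
    return results

# Usage: an in-memory buffer stands in for the real search log file.
log = io.StringIO()
handle_search({"querytext": "Vacation Policy"}, log)
row = next(csv.reader(io.StringIO(log.getvalue())))
print(row[1])  # the query as typed
```

Writing the cooked query alongside the original is a design choice of this sketch, not a requirement; the essential point is that the original gets its own column and is never overwritten.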

You don't think you have this problem?

Are you sure? Are you using the default logging facilities built in to your enterprise search engine? And do you make ANY modifications to what the user types in? If you answered 'yes' to both of these questions then you probably ARE having this problem. We'd certainly sleep better if you would at least double check.
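
One rough way to double check is to scan a sample of your logged queries for syntax that visitors rarely type themselves. The Python sketch below is a heuristic only, and the patterns in `COOKED_HINTS` are assumptions; replace them with whatever your own transform actually adds.

```python
import re

# Assumed signs of cooking; replace with the syntax your own transform emits.
COOKED_HINTS = [
    re.compile(r"\b(AND|OR|NOT)\b"),  # uppercase boolean operators
    re.compile(r"\b\w+:\S+"),         # field restrictions such as status:published
    re.compile(r"[()]"),              # grouping parentheses
]

def looks_cooked(logged_query: str) -> bool:
    """Return True if the logged query matches any assumed cooking pattern."""
    return any(p.search(logged_query) for p in COOKED_HINTS)

def cooked_fraction(logged_queries: list) -> float:
    """Fraction of logged queries that look machine-modified."""
    if not logged_queries:
        return 0.0
    flagged = sum(1 for q in logged_queries if looks_cooked(q))
    return flagged / len(logged_queries)

# Made-up sample entries, for illustration only.
sample = [
    "vacation policy",
    "(vacation policy) AND status:published",
    "(expense report) AND status:published",
]
fraction = cooked_fraction(sample)
print(f"{fraction:.0%} of sampled queries look cooked")
```

A high fraction suggests the log holds cooked queries. Some power users do type operators themselves, so treat the number as a prompt to go and look, not as proof.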

And one last argument we hear from some companies:

'We can just filter out all that extra stuff if we ever need to'. What!? If you ever need to!?!? Think of the valuable knowledge available to you that you are missing!

Search Analytics, practiced properly, is an ongoing endeavor which we will elaborate on in the coming months. Administrators who cling to this rationale would, in theory, have to keep doing this 'filtering' over and over and over again. And since this isn't fun or automated, it's less likely to get done, which puts their chances of doing ongoing search analytics in doubt.

And besides, even if you did write a script, what if your query transformation rules change over time? You'll need one set of rules to strip out what last September's transforms added, and another for the extra stuff that the March transform rules added. It's just so much easier to store the query as is when you get it.

Or some people tell us they can just skim the reports and ignore all that extra stuff. Frustrating: humans are notoriously bad at doing that sort of thing, and even if you can master that skill, there's much less chance your boss or the Marketing team will be so inclined. Making human brains work harder to ignore superfluous data makes them tired, and more likely to drift or avoid that activity altogether. And even brainiacs are more likely to miss some trends.

Action items:

  • Verify that your searches are being logged at all
  • Verify that you're capturing the original user query

And most importantly: Keep reading our new column! And do drop us a line if you have any questions about your own search analytics issues.

In our next issue: What else should you be logging?

Previous articles about 'Query Cooking' and filtered searching: