Enterprise Search Blog

Do You Need a Taxonomy?

The first question to settle is what type of taxonomy you are considering.

Subject Based Taxonomy

Created by domain experts, sort of like the Dewey Decimal System

Content Based Taxonomy

Organizing the specific data you already have, sometimes amenable to automation

Behavior Based Taxonomy

Driven by search analytics, user tagging or vocabulary analysis

Quick vocabulary refresher:

A Taxonomy is an organized set of concepts or definitions, usually labeled by keywords; for search engines, a taxonomy can also be a set of organized searches. Taxonomies are typically nested in a hierarchical manner, often called a tree.
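
As a rough illustration of the tree idea, here is a minimal Python sketch of a nested taxonomy. The class name `TaxonomyNode`, its fields, and the sample "Products" branches are all made up for this example; real taxonomy tools will have their own formats.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One concept in the tree, labeled by keywords (hypothetical structure)."""
    label: str
    keywords: list[str] = field(default_factory=list)
    children: list["TaxonomyNode"] = field(default_factory=list)

    def add(self, child: "TaxonomyNode") -> "TaxonomyNode":
        """Attach a child concept and return it so branches can be chained."""
        self.children.append(child)
        return child

    def paths(self, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
        """Return every root-to-node path, depth first."""
        here = prefix + (self.label,)
        result = [here]
        for child in self.children:
            result.extend(child.paths(here))
        return result


# Sample branches, invented for illustration only.
root = TaxonomyNode("Products")
hardware = root.add(TaxonomyNode("Hardware", keywords=["device", "equipment"]))
hardware.add(TaxonomyNode("Printers", keywords=["printer", "toner"]))
root.add(TaxonomyNode("Software", keywords=["application", "license"]))

for path in root.paths():
    print(" > ".join(path))
```

Each path, such as Products > Hardware > Printers, is the kind of trail a browse-style interface would typically show to users.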

We will cover the difference between Taxonomies and Ontologies in an upcoming article.

Taxonomies are a powerful tool that can form the cornerstone of advanced search applications. However, they are not a magic fix for all search problems, and for some applications they can actually be a false start.

Problems that haunt Taxonomy projects:

  • "Taxonomies as a Symptom" - a general desire to improve relevancy, in search of an appropriate technology
  • Misapplication of Taxonomies - perhaps some other method of improving search would have been more appropriate; similar to the above, but discovered later in the process
  • Source data misaligned with selected technology/vendor
  • Overly zealous vendor promises
  • Unrealistic expectations of performance or staffing requirements - often related to vendor claims
  • Misunderstanding the complete taxonomy workflow
  • Underdevelopment of Taxonomy UI integration - browse and refine modes
  • Unbounded project scope, often related to previous points
  • No pilot project or POC
  • Failure to meet user satisfaction or adoption targets

A general Taxonomy project roadmap might look like:

  • SCOE Staff briefings
  • Identify initial project scope
  • Select appropriate technology/technologies
  • Design of critical Taxonomy Workflow Stages
  • Establish initial QA criteria
  • POC
  • Phase 1 Implementation
  • QA and User Acceptance testing
  • Taxonomy Admin Staff training

The Taxonomy Workflow Stages should minimally include:

  1. Acquisition or Creation of initial Taxonomy Ruleset
    • These are the rules that tell the system which document goes under which taxonomy branch.  In automated systems this might be called training.
  2. Data and Vocabulary Alignment with Taxonomy Ruleset
    • If you acquire an industry specific taxonomy, it will likely have certain assumptions about the vocabulary used in the source content; this may need adjustment and augmentation.
    • If you're building a taxonomy, you may want to leverage specific terms, metadata and URL/directory structure in your ruleset.
    • Be somewhat cautious of vendors who claim you can skip this step. Be extremely cautious of vendors who do not allow you to touch this.
  3. Integration of Ruleset Engine
    • Once rules have been created or "trained", they must be applied to newly arriving content on an ongoing basis.
  4. UI integration of Taxonomies
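
To make the ruleset stages above more concrete, here is a minimal Python sketch of a hand-written ruleset being applied to an arriving document. It assumes a simple scheme in which a rule fires if any of its terms, its URL prefix, or one of its metadata values matches. The `Rule` class, the `classify` function, the branch names and the sample document are all hypothetical; commercial and trained systems work quite differently under the hood.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Rule:
    """Files a document under `branch` if any condition matches (hypothetical scheme)."""
    branch: str
    terms: set[str] = field(default_factory=set)         # vocabulary found in the text
    url_prefix: str = ""                                  # URL/directory structure
    meta: dict[str, str] = field(default_factory=dict)   # metadata key/value pairs


def rule_matches(rule: Rule, doc: dict) -> bool:
    # Lowercase the text and split it into simple alphanumeric words.
    words = set(re.findall(r"[a-z0-9]+", doc.get("text", "").lower()))
    if rule.terms & words:
        return True
    path = urlparse(doc.get("url", "")).path
    if rule.url_prefix and path.startswith(rule.url_prefix):
        return True
    doc_meta = doc.get("meta", {})
    return any(doc_meta.get(key) == value for key, value in rule.meta.items())


def classify(doc: dict, rules: list[Rule]) -> list[str]:
    """Return every taxonomy branch whose rule matches the document."""
    return [rule.branch for rule in rules if rule_matches(rule, doc)]


# Illustrative ruleset and document; none of these values come from a real system.
RULES = [
    Rule("Products/Hardware/Printers", terms={"printer", "toner"}),
    Rule("Support/Manuals", url_prefix="/docs/manuals/"),
    Rule("Company/Press", meta={"doctype": "press-release"}),
]

doc = {
    "url": "https://example.com/docs/manuals/lp-200.html",
    "text": "Replacing the toner cartridge",
    "meta": {"doctype": "manual"},
}

print(classify(doc, RULES))
```

In this sketch the vocabulary alignment step amounts to editing the `terms` sets until they reflect the words your own content actually uses, and the integration step amounts to calling `classify` on each newly arriving document.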

Of course, the specific steps and their order will vary. For example, if you create your own taxonomies, you'll need to do the data and vocabulary alignment first.

That's a lot to digest.  If you still have a question, drop us a line!