Enterprise Search Blog

Understanding the “Content Supply Chain” - Interview with Rennie Walker

Last Updated Apr 2009

By: Mark Bennett

Hello Rennie, thanks for chatting with us.  First off, when people hear “Content Supply Chain”, their first question is likely to be “What is it?”

The Content Supply Chain is a model. It’s a model that allows you to think about the series of tasks needed to take content from its initial conception as an editorial idea to its being found and used within a search-based interface.

Without a model to help us uncover what is really going on with our content – as the user experiences it – we are left with only abstract names for business problem symptoms, like “information overload”.  A symptom name is useful as a starting point, but it provides no way forward for analysis and task description.  Models trump symptom names every time in terms of moving forward.

The content supply chain begins with an originating idea - when people have ideas about what content needs to be written for users or customers.  In the next stage the content is written by authors.

But the process is not finished when the author is DONE WRITING!   Absolutely not!  The content then needs to be described and cataloged; it needs metadata, categorization, etc.  Many organizations combine highly de-centralized authoring with making the author responsible for the description as well, even though description is both a separate knowledge task and a separate skillset.  That is a vulnerability not shared by organizations that have a separate Description Work Flow.

Bottom line, what we are really wanting to do is add a layer of lightweight human intelligence, semantics and ontology into all the applications that handle the content - into the search application, the content management system, etc.

This small amount of extra effort has huge leverage.  Just 2 or 3 knowledge workers can optimize the content and the search engine for a company of 200,000 employees.

So what would a full Content Supply Chain model look like?

This is a process with intermediate steps that, depending on the project at hand, will usually include:

  • Inception / origination of content
  • Writing
  • Description / initial metadata
  • Publishing
  • Content process pipelines, metadata extraction, addition and normalization
  • Adoption / Utilization of the content
  • Feedback – from search logs, abandoned searches etc. and from qualitative user research.
  • Boosts / blocks, which happens way after the content is published – all the editorializing in support of the search front end user experience, such as upsell, cross-sell.
  • Maintenance, retention policies, obsolescence and compliance
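Purely as an illustration, the stages above can be sketched in code as an ordered pipeline. This is my own shorthand for the stage names, not part of any standard:

```python
from enum import Enum

class Stage(Enum):
    """Stages of the content supply chain, in order, as listed above."""
    INCEPTION = 1       # origination of the content idea
    WRITING = 2
    DESCRIPTION = 3     # initial metadata
    PUBLISHING = 4
    PROCESSING = 5      # pipeline metadata extraction / normalization
    ADOPTION = 6        # utilization of the content
    FEEDBACK = 7        # search logs, abandoned searches, user research
    EDITORIALIZING = 8  # boosts / blocks, upsell, cross-sell
    MAINTENANCE = 9     # retention, obsolescence, compliance

def next_stage(stage: Stage) -> Stage:
    """Advance a content item to the next stage of the chain."""
    return Stage(stage.value + 1)

doc_stage = Stage.WRITING
doc_stage = next_stage(doc_stage)
print(doc_stage)  # Stage.DESCRIPTION
```

The point of the sketch is simply that each stage is a distinct task, often owned by a different role and handled by a different application.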

Is this a standard now?

Not at this time. It’s a conceptual model, a way of thinking about process, not just some arbitrary XML standard.  Some people are writing about it, but it’s not a “standard” in any way, shape or form.  I do think, though, that there will be a more pervasive movement towards this kind of model.

What has this got to do with Search?  Isn’t this just about data overload and “findability” ?

Well yes, it overlaps.  Obviously search engines are one of the key technological subsystems.  BUT if content doesn’t exist, it can’t be found!  And how would you ever know that?

And even if content does exist, but it’s not properly categorized, and it’s buried in a giant data set, then it probably won’t be found.  It’s the classic “needle in the haystack.”

We live in an age of data “overload” in terms of the raw data, but if you don’t MANAGE this content, this PROCESS, it’ll be a problem.  This is more about understanding the data you have, AND what you want to show the user and customer.  That’s what you have to model together.

Can’t this be automated?  Isn’t this just automated profiling?

It’s better to have a human as part of the process.  It’s deeper than just automatic profiling.

You find out about the LACK of content by looking at the search logs and abandoned searches. You look at the string of attempts users made that SUCCEEDED.   These are all COGNITIVE tasks that computers can only ASSIST with.
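As a toy sketch of the quantitative half of that feedback loop (the log records and field layout here are invented for illustration), flagging abandoned versus successful queries can start very simply:

```python
# Hypothetical search-log records: (session_id, query, clicked_result_or_None)
log = [
    ("s1", "vpn setup", None),                    # first attempt, no click
    ("s1", "vpn configuration guide", "kb-001"),  # reformulation that worked
    ("s2", "expense policy", None),               # abandoned outright
]

abandoned, succeeded = [], []
for session, query, clicked in log:
    (succeeded if clicked else abandoned).append((session, query))

# Abandoned queries hint at MISSING or undiscoverable content; the
# reformulations that succeeded reveal the vocabulary users actually
# use -- raw material for metadata and synonym work.  Deciding what to
# do with either list is the cognitive part a computer can only assist with.
print(abandoned)  # [('s1', 'vpn setup'), ('s2', 'expense policy')]
print(succeeded)  # [('s1', 'vpn configuration guide')]
```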

Note that you move from application to application as you move through the content supply chain.

What are the top 3 to 5 challenges with the CLC (Content Life Cycle)?

The number one challenge is the pervasive belief that you can just throw a search engine at the data and it’ll magically fix it.  Search implementations have 2 parts:

  1. The basic technology installation
  2. The ontological / content model, and its management and maintenance

If you skip the second part, you’re missing out.

Algorithms thrown against unmanaged content are the perfect storm for data overload and user dissatisfaction.

Even when a company does understand the need, you can have a real challenge in a highly distributed company.  If content creation takes place throughout the organization then it’s harder to put newer policies in place, since there’s no centralization.

There’s also an issue around managing metadata.  Some metadata is formalized, stored in a CMS and associated with content items.  But then the search engine pipeline needs to extract and/or add metadata to the content it’s processing, usually combining that with metadata from other data sources.  It’s hard to get a global view of all your metadata.  That’s why companies like SchemaLogic exist – they see a strongly emergent business need, and I think they’re right.
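A normalization step of that kind might look roughly like the sketch below; the field names and the controlled vocabulary are invented here, not taken from any particular product:

```python
# Map free-form department values from different sources onto one
# controlled vocabulary (a tiny, invented example).
CANONICAL = {
    "hr": "Human Resources",
    "human resources": "Human Resources",
    "it": "Information Technology",
    "tech": "Information Technology",
}

def merge_and_normalize(cms_meta: dict, extracted_meta: dict) -> dict:
    """Combine CMS metadata with pipeline-extracted metadata,
    normalizing values into the controlled vocabulary."""
    merged = dict(cms_meta)
    merged.update(extracted_meta)  # extracted values win in this sketch
    dept = merged.get("department")
    if dept:
        merged["department"] = CANONICAL.get(dept.strip().lower(), dept)
    return merged

meta = merge_and_normalize(
    {"department": "hr", "author": "jsmith"},   # from the CMS
    {"language": "en"},                         # added by the pipeline
)
print(meta["department"])  # Human Resources
```

The hard part in practice is not the code but agreeing on the vocabulary itself and knowing where all the metadata lives – exactly the global view he describes as difficult to get.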

And there’s persistence, or lack thereof.  You’ve got to be keen about search logs!  You’ve got to analyze them with enthusiasm, and on a regular basis.  A robust, formalized approach.  You have to LOVE your search logs, and not just when doing a search engine migration.  Constant love … all the time! 

Editor’s Note about the term “Appliance”:

In the following section, “appliance” is not meant in a literal sense; it’s not necessarily talking about a piece of hardware that plugs into a server room rack.

Instead it refers to an administration model where the system requires very little ongoing maintenance or adjustments, what might be termed a “set it and forget it” model.  This is in contrast to companies that routinely and proactively monitor and adjust their corporate search engine.

These are two different usage models, with each possibly filling different business requirements.

What are the ramifications of just ignoring this problem, or being completely unaware of it?  Won't it just "go away"?  What would you say to the “appliance” oriented customers and vendors?

Good luck!

If they’ve done due diligence and it’s a smaller company, then it’s possible they don’t need much.  But once you leave the “entertainment web” of the public Internet, with companies of any real size, everything changes.

People don’t understand what appliances lack.  They don’t understand the tasks and procedures that bring about good search.  All you’re going to get is algorithm-based search.  You’re not going to get a good view of your metadata and the navigators it could give you.

My concern is that the “appliance” model throws us back into the algorithm-based paradigm that is getting really old.  You are stuck in a very “shallow” model.  It’s not editorially driven, it’s not user-experience driven.  I mean, we want to empower the business side.   Once the technologists are done successfully deploying the technology, we want the business side to actively run the ontological/knowledge experience of the customer and user, day by day.  That can be harder with an appliance.  You want to be careful that implementing an appliance doesn’t take you back to the days when IT units had to do it all.

You’ll notice that by and large you don’t see generic appliances used on high end ecommerce sites.  They don’t give you drill down, upsell, cross sell, etc.  So they just don’t use that model.  The same logic applies, to some extent, to enterprise data.
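To make the contrast concrete, the kind of editorial boost/block layer being described here can be sketched as a post-ranking step. The queries and item IDs are invented for illustration:

```python
# Hypothetical editorial rules layered on top of raw engine relevance.
BOOSTS = {"holiday gifts": ["sku-501"]}   # pin an upsell item for this query
BLOCKS = {"holiday gifts": {"sku-113"}}   # suppress a result for this query

def editorialize(query: str, ranked_ids: list) -> list:
    """Apply boost/block rules AFTER the engine has ranked results."""
    blocked = BLOCKS.get(query, set())
    pinned = BOOSTS.get(query, [])
    rest = [r for r in ranked_ids if r not in blocked and r not in pinned]
    return pinned + rest

results = editorialize("holiday gifts", ["sku-113", "sku-222", "sku-501"])
print(results)  # ['sku-501', 'sku-222']
```

The rules table is exactly the sort of thing the business side, not IT, should own and adjust day by day; a “set it and forget it” appliance model leaves no obvious home for it.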

There’s a MASSIVE level of user dissatisfaction with search in corporations, and an appliance isn’t going to give you anything to address it.  The appliance model kind of conveys that you are going to be OK with a large lack of analysis and process.  That’s very likely not true.  I feel quite passionate about this.

Application vendors could do a much better job of telling customers about this.  Vendors oftentimes appear to have given up.

You seem to have a different take on “User Experience” - can you tell us more about that?

Sure.  I LOVE research-based user experience design!  BUT … the pitfall above all to avoid is not recognizing that what you are really doing – when it comes to search – is designing an ONTOLOGICAL experience, a knowledge experience.  What the customer and the user want is to execute a search, any search, and have the results presentation map exactly to that precise sliver of their own ontological world that they happen to be in at that particular moment.  When I’m talking about the ontological world of the user in the moment, I’m talking about their intent, and what pathways and facets their world has.  You need to know what type of navigators you need and what descriptive front-end artifacts you’ll have available; this will drive the type of drill-down navigators you can offer.

And of course the UI is the realization of this.  Your UI design needs to go along with your model.  Designing the user experience, or what I’d prefer to call the “knowledge experience” or “Ontological Experience”, is only successfully executed when you know what your customers and users want AND which points in your content supply chain contribute to each part of that.

In the real world we don’t design like that, it’s often more reactionary.  It’s the wrong way round, but that’s how life is – we constantly inherit information problems and never have the luxury of a green field implementation.

You must audit your content supply chain, based on what you can know about your users quantitatively and qualitatively.  What’s in your search logs?  What are your abandoned searches like?  What are your data sources, and what’s the mapping of query frequency to your content?  Which resources do your users ACTUALLY go to and utilize?  If you find a bunch of content that you don’t need, there are potential cost savings.  There’s a cost benefit to be had here; content is expensive to create and maintain, and lower priority content can be trimmed back if not needed.
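A first pass at that query-to-content mapping can be as simple as counting which published items actually satisfy queries and which are never touched. All the data here is invented for illustration:

```python
from collections import Counter

# Hypothetical (query, clicked_content_id) pairs from the search log.
clicks = [
    ("vpn setup", "kb-001"), ("vpn setup", "kb-001"),
    ("expense policy", "kb-007"), ("vpn setup", "kb-002"),
]

usage = Counter(content_id for _, content_id in clicks)
inventory = {"kb-001", "kb-002", "kb-007", "kb-099"}  # all published items

never_used = inventory - set(usage)
print(usage.most_common(1))  # [('kb-001', 2)]
print(never_used)            # {'kb-099'} -- candidates for trimming or review
```

Items that never appear in the usage counts are where the cost-savings conversation starts; the heavily used items are where metadata and boost effort pays off most.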

And there’s an efficiency of knowledge discovery as well when it comes to fact finding. A good ontological understanding on your part will give your customers and users a better chance as they search way down in the long tail.  Time saved is money saved.

What are some action items our readers can undertake?

Look at the outline of your content and your current supply chain model.

Do the analysis to carry out either a complete content supply chain audit or a partial one.  Just the analysis at this point; you don’t have to commit dollars beyond that.  Do an inventory of data repositories, and include an analysis of your metadata – both what you actually have and what you really need – and of how you are going to manage the actual metadata in your repositories and the virtual metadata from your search engine content processing.

But more importantly, check your own ways of thinking.  Are you entrenched in a problem-centric, reactive model?  Or are you thinking in terms of solutions, a business mindset that allows you to think in terms of “models” vs. “problems”?  For example, complaints about data overload are certainly a problem, but data overload is NOT by itself a MODEL; it is not a guide or map for how to solve it.  Technology-centric staff may not think in terms of knowledge architecture unless prompted.

Think about your current content and findability, the nexus of organization.  Like I said at the beginning, solutions come from a model, which helps you figure out what to analyze and helps form that into a plan.

About Rennie Walker:

Rennie Walker is a Bay Area knowledge management consultant specializing in applying human ontologies and semantics to information access applications.  Rennie was recently Enterprise Search Product Manager at Wells Fargo Wholesale Bank, Senior Knowledge Engineer at SageWare Inc., which was an early thought-leader in predefined cross vendor industry taxonomies, and he has worked for Deutsche Bank and Boston Consulting Group, in Europe, as an information access expert.  Rennie received his Masters in Psychology from Edinburgh University in Scotland.