Enterprise Search Blog
« NIE Newsletter

Maps for Enterprise Digital Content

Last Updated Feb 2009

By John Lehman, HighClassify - Volume 3 Number 5 - Fall 2006

We think of digital content in the enterprise as having a physical-logical layout: databases, files, directories, volumes, etc. But that layout is for administering the content, not for enabling users to find it. What enables finding information is a subject map, also called a meaning map, access map, or access architecture. No investment of time or talent in your organization will have a larger payoff (more profit, lower cost, better team atmosphere) than making your content accessible.

So, what is Digital Content?

     

Our databases (customers, accounts, retail transactions, inventory, employees, etc.) are important, but we are not really concerned with structured data by itself, because the database dictionary is its access architecture. Of course, we want to be able to merge (federate) entries from our databases and our unstructured digital content when we go searching, and for that we need a much richer vocabulary than a data dictionary to cover the breadth of our access purposes.

Our email, corporate internal documents, and web content, in text, audio, graphic, and video forms, make up the vast majority of the body of knowledge we want to support us when the time comes for facts, opinions, and decision support.

What is a Content Access Architecture?

An access architecture for the digital content of the enterprise is the result of a multi-step, permanent process, although the amount of maintenance effort declines quickly to a steady state.

     

Step 1. Identify your architecture(s)

We see the following content architectures in our enterprises.

     

Since everyone uses at least one of these access architectures, they must have some benefits:

PROs                                            CONs
Architects understand them                      but users do not
They use familiar organizational entities       but content goes across all of these entities
Everything is somewhere                         but where?
They are compatible with all search software    but does that guarantee good results?
Decentralized responsibility                    but does decentralized responsibility = crisp communication?

What is the best enterprise digital content access architecture? All of the architectures have some benefit, and they should all be used. What ties them together is step 2.

Step 2. Develop multi-perspective organization taxonomies

These are a set of hierarchical subject maps describing the ways to think about finding and organizing content by subject, i.e., the reasons for searching.

  • Taxonomies fit everyone, regardless of why they are searching
  • Taxonomies are both centralized and decentralized
  • Taxonomies fit with every enterprise search program
  • Taxonomies enable perfect results - precise and complete

We have discussed various taxonomy perspectives in previous articles (April 2003, June 2003, and July 2003). The following is an example that covers every organization at the top level.


Industry Segments: A marketing / positioning / competitive intelligence perspective. Industry Segments may overlap with Products & Services.

Organization Functions: The divisions/units of a business or organization by function or responsibility. For example, within the Human Resources function is a Skills taxonomy for recruiting full-time or temporary employees. A legal taxonomy is attached to the corporate counsel function.

Business Relationships: The types of other companies or organizations a business deals with, including competitors, customers, vendors, regulators, etc.

Geography (filter; use only with other perspectives): Countries, international organizations, regions, states, cities, postal codes, etc.

Business Issues & Events: Economic, legal, labor, merger/acquisition, regulatory, environmental, safety, other government interfaces, etc.

Products & Services: Products sold; MRO materials; indirect services; direct materials and services purchased.

Technologies/Sciences: Those applicable to the industry or industries in which the firm or organization participates. Basic or applied sciences are also included as appropriate.

Document or Record Types (filter; use only with other perspectives): This perspective provides a valuable reduction of results based upon the document's purpose and its connection to the information need.


This is not a start-from-scratch activity. Every industry, business function, technology, etc. has taxonomies already developed and available. Your job will be to adapt their terminology to best fit your uses, or to add terms to an existing structure.

Terminology is going to appear repeatedly in various places in the taxonomy, because you need to enable entry points for users who don't have the same way of thinking.

Content about a named product:

   To an engineer is architecture, tools, standards. . .

   To a lawyer is warranties, teaming agreements, OEM'd parts. . .

   To a marketing product manager is release schedules, feature sets. . .

   To the helpdesk is bugs, fixes, releases, workarounds. . .
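
The idea of one term reachable from several entry points can be sketched in a few lines of Python. This is only an illustration: the product name "Widget X", the paths, and the helper `build_entry_points` are hypothetical and not taken from any real taxonomy.

```python
from collections import defaultdict

# Hypothetical taxonomy paths; each tuple runs from a perspective root down to a term.
PATHS = [
    ("Products & Services", "Widget X", "Architecture"),
    ("Organization Functions", "Engineering", "Widget X"),
    ("Organization Functions", "Legal", "Warranties", "Widget X"),
    ("Organization Functions", "Marketing", "Release Schedules", "Widget X"),
    ("Organization Functions", "Helpdesk", "Workarounds", "Widget X"),
]


def build_entry_points(paths):
    """Map each term (lowercased) to every distinct location where it appears."""
    index = defaultdict(list)
    for path in paths:
        for depth, term in enumerate(path):
            prefix = path[: depth + 1]
            if prefix not in index[term.lower()]:
                index[term.lower()].append(prefix)
    return index


entry_points = build_entry_points(PATHS)

# Show every place in the taxonomy where the product name can be reached.
for location in entry_points["widget x"]:
    print(" > ".join(location))
```

The engineer, lawyer, product manager, and helpdesk analyst each arrive at the same product through their own branch.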

Not every subject is suitable to be included in an enterprise taxonomy. If the subject can't be described and identified in content precisely, keep it out of the taxonomy. Once a taxonomy has helped a user dramatically reduce the amount of content to examine, then enterprise search can address less precise subjects: opinion, currency, and emotion.

Step 3. Classification of Content

Now that you have maps, implement your taxonomies with content interpretation processes or applications that connect your content to your taxonomies, and remember to limit taxonomic elements to those with precise and complete rules. Whether you use people to classify (when content volume is small) or software to classify, your classification needs to assign subjects exactly to items of content (including database records). This is not the place for less-than-exact-match relevance determination. Also consider the unit of content that makes sense from the user's perspective. If one paragraph from a web site is meaningful to a subject, then that paragraph should be the result, not the whole web site.
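
As a rough sketch of exact, rule-based classification at the paragraph level, the Python below assigns a taxonomy node only when every required phrase is present as a whole word. The rules, node names, and sample text are invented for illustration; a real rule set would be far richer.

```python
import re

# Hypothetical rules: a node is assigned only if ALL of its phrases occur as whole words.
RULES = {
    "Products & Services > Widget X > Warranties": ["widget x", "warranty"],
    "Business Issues & Events > Merger/Acquisition": ["acquisition"],
}


def contains_phrase(text, phrase):
    """Case-insensitive whole-word match; no stemming or fuzzy scoring."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def classify_paragraphs(document):
    """Return (paragraph, matching nodes) pairs, treating each paragraph as a unit of content."""
    results = []
    for paragraph in (p.strip() for p in document.split("\n\n")):
        if not paragraph:
            continue
        nodes = [
            node
            for node, required in RULES.items()
            if all(contains_phrase(paragraph, phrase) for phrase in required)
        ]
        results.append((paragraph, nodes))
    return results


SAMPLE = (
    "The Widget X warranty covers parts for two years.\n\n"
    "Our cafeteria menu changes on Monday.\n\n"
    "The acquisition closed last quarter."
)

results = classify_paragraphs(SAMPLE)
for paragraph, nodes in results:
    print(nodes, "<-", paragraph)
```

Because the paragraph is the unit, a search on a subject can return just the matching paragraph rather than the whole page.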

Step 4. Enable enterprise searching

The best way to exploit the classification step is either lookup via the taxonomy or smart search, defined as search that gives you more than string match (expansion, search memory, etc.).
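
One way to picture expansion is a search that adds the narrower terms of a taxonomy node to the user's query before matching. The sketch below assumes a toy `NARROWER` table and three made-up documents; it stands in for the idea only and does not describe any particular search product.

```python
# Hypothetical narrower-term relations taken from a taxonomy.
NARROWER = {
    "helpdesk": ["bugs", "fixes", "workarounds"],
}

# Hypothetical content items.
DOCS = {
    "doc1": "Known bugs in release 2.1",
    "doc2": "Workarounds for the login issue",
    "doc3": "Quarterly sales summary",
}


def expand(term):
    """Return the query term plus its narrower terms, all lowercased."""
    term = term.lower()
    return [term] + NARROWER.get(term, [])


def search(term):
    """Return ids of documents containing the term or any of its expansions as a word."""
    terms = expand(term)
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if any(t in text.lower().split() for t in terms)
    )


print(search("helpdesk"))
print(search("sales"))
```

A plain string match on "helpdesk" would find nothing in this sample; the expanded query reaches the documents about bugs and workarounds.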

Step 5. Provide search results feedback to steps 2-4

Search results will identify weaknesses in the taxonomy, the classification, and the search itself, and feedback about what worked and didn't work can improve all of those steps. There are many software tools available to assist in maintaining taxonomies and classification rules. The process is ongoing, because our enterprise objectives and functions are changing, and so is terminology.
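
A simple form of this feedback is to count queries that returned nothing and whose terms are missing from the taxonomy. The sketch below assumes a hypothetical search log of (query, result count) pairs and a hypothetical term list; real logs and tools will look different.

```python
from collections import Counter

# Hypothetical taxonomy vocabulary and search log of (query, number of results).
TAXONOMY_TERMS = {"warranties", "workarounds", "release schedules"}
SEARCH_LOG = [
    ("warranties", 12),
    ("recalls", 0),
    ("Recalls", 0),
    ("workarounds", 3),
    ("rebates", 0),
]


def taxonomy_gaps(log, known_terms):
    """Count zero-result queries that are not yet taxonomy terms, most frequent first."""
    misses = Counter(
        query.lower()
        for query, hits in log
        if hits == 0 and query.lower() not in known_terms
    )
    return misses.most_common()


gaps = taxonomy_gaps(SEARCH_LOG, TAXONOMY_TERMS)
for term, count in gaps:
    print(f"{term}: {count} failed searches")
```

Terms at the top of this list are candidates for new taxonomy entries or new classification rules.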

If it is so easy, why doesn't everyone do it? The major concerns about this multi-step process are summarized below:

a. Content Access Security?
Discretionary access control can be incorporated into the taxonomy, providing both power and flexibility.

b. Some advanced software program does it all?
This process cannot be successful without design and direction from smart, experienced, motivated people.

c. Updates? / changes? / the march of terminology?
This approach helps you know what to change, and a myriad of good tools exist to help.

d. Who does this stuff?
Information architects, enterprise librarians, and standards organizations such as RosettaNet, UCC, UNSPSC, etc.

e. The decades it will take?
Much of the heavy lifting is already done. You can have it implemented in weeks to a few months.

f. What is the UI to a complicated taxonomy?
First search the taxonomy, display the locations in the taxonomy where the search criteria exist, then pick the best fit(s), then search the content.
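
The flow just described (search the taxonomy, show the matching locations, let the user pick, then fetch the content) might look like the sketch below. The paths, content ids, and function names are hypothetical, and the user's pick is hard-coded where a real interface would prompt for it.

```python
# Hypothetical taxonomy locations, written as display paths.
TAXONOMY_PATHS = [
    "Products & Services > Widget X",
    "Organization Functions > Legal > Warranties",
    "Organization Functions > Helpdesk > Widget X > Workarounds",
]

# Hypothetical output of the classification step: location -> content ids.
CLASSIFIED_CONTENT = {
    "Products & Services > Widget X": ["spec-01"],
    "Organization Functions > Legal > Warranties": ["contract-07"],
    "Organization Functions > Helpdesk > Widget X > Workarounds": ["kb-42", "kb-17"],
}


def find_locations(query):
    """Step 1: search the taxonomy itself, not the content."""
    needle = query.lower()
    return [path for path in TAXONOMY_PATHS if needle in path.lower()]


def content_for(selected_paths):
    """Step 3: return the content classified under the locations the user picked."""
    found = set()
    for path in selected_paths:
        found.update(CLASSIFIED_CONTENT.get(path, []))
    return sorted(found)


# Step 2: display the locations so the user can pick the best fit(s).
locations = find_locations("widget x")
for number, path in enumerate(locations, start=1):
    print(number, path)

chosen = [locations[1]]  # stands in for the user's selection
print(content_for(chosen))
```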

In closing, this process has the biggest single impact in the content management/portal space. It is an investment, not a product, that helps you make money, save money, and improve team building and enterprise communication.