
2006: The Transatlantic Titans Square Off!

Last Updated Mar 2009

By Mark Bennett, New Idea Engineering, Inc. - Volume 3 Number 1 - January 2006

Autonomy, based in the UK, has completed its acquisition of search powerhouse Verity. Oslo, Norway, is the home base for FAST, which continues to expand its presence in the North American market. With the Europeans now in control of much of the Enterprise Search market space, how will the US "Big 4" respond? And what does this mean for Asia?

Disclosures and disclaimers:
Since this article deals with forward looking predictions - our best guesses - we wanted to remind our readers that our staff has worked at and/or had financial dealings with many of the vendors mentioned in this column. We have formal and informal relationships with most of the companies mentioned here. Nonetheless, we strive to remain vendor neutral for the benefit of our clients. Also, because we are based in the US, much of our first-hand knowledge comes from this side of the Atlantic, so we apologize in advance if the article seems a bit "slanted" – actually you might be surprised by some of our conclusions. Also, we'd love to hear from you European and Asian readers, so don't be shy!

Setting the Stage

2006 is on track to be the year of the Transatlantic Titanic Battle for Enterprise Search. As we mentioned in our 2005 top stories article, Europe's Autonomy, based in the UK, recently grabbed up the US-based leader, Verity Inc. This puts all 4 of the mainstay enterprise search product lines across the Atlantic.

Recapping the EU and US players:

The 4 established enterprise search product lines are now in European hands:

  • Autonomy:

    Autonomy's own flagship product line

    Verity's well established K2 product line

    The Verity Ultraseek product line - which has changed hands several times

  • FAST Search and Transfer:

    FAST's flagship FDS and InStream products

The US has four big players of its own:
  • Google

    Google Search Appliance

    Google Mini

  • Microsoft

    Bundled search inside of server products

    Search within SharePoint

    Search built in to their next major OS "Vista" (formerly Longhorn)

  • Oracle

    Oracle Text integrated with Oracle database products

  • IBM

    Omnifind

Many other Niche Players

The US also features privately-held Endeca, which is making inroads into Enterprise markets. Significantly, Endeca is the only US search manufacturer to make it into Gartner's top "magic quadrant" in 2005.

We mean no slight by not calling out the many other US and European search technology companies. If we were to list all of them, this article would double in length. The list above is either the currently established players, or the "800 pound gorillas" who hope to claim that title. Some of these other niche players see themselves as the next must-have total enterprise search solution; other vendors play in a more limited space, offering specific pieces of technology to more demanding projects.

Is this a Fair Fight?

If you look at the combined market cap of the 4 US companies, and compare it to the market cap of the 2 main European players, it comes to a ratio of about 200:1. That would seem to be a daunting number for the Europeans. However, Enterprise Search is not the core business (read: revenue stream) for any of the US companies; this isn't a market they "must" win, so we believe they will continue to struggle.

Just to be clear, while Google's public search is certainly great, it is not by itself an "enterprise" search implementation; it serves the entire Internet, not the content on private corporate networks. The "Google Box" is their entry in this enterprise search game, and it's discussed below.

Pure Search

Autonomy and FAST are extremely focused on search. We've always referred to FAST as the "bring it on guys" when it comes to any search application. And Autonomy now holds the other 3 of the top 4 engines. This is very important.

And make no mistake, when a business needs to search a repository with millions of documents, they will likely be talking to Autonomy or FAST; their products are well known for scalability.

These "David"s also have a few secret pebbles to cast. While the US "Big 4" are still struggling to get market penetration with their commodity search, FAST and Autonomy are continuously raising the bar on what basic "search" should encompass. We'll be detailing some of this in upcoming issues.

Sidebar: The Journey of Ultraseek

Ultraseek has a long history and loyal customer base that loves this engine, and despite changing hands several times, it always seems to land on its feet. Ultraseek has now been owned by four different companies:

  • Created at Infoseek in the 1990's:

    Essentially Ultraseek was a shrink-wrap packaging of the original Infoseek portal technology.

  • Sold to Inktomi in the late 1990's:

    Inktomi used Ultraseek to round out its search technology, as a natural fit to their high end caching technology.

  • Sold to Verity in early 2000's:

    Verity already had their own Search97 and K2 product lines. But Ultraseek had a great reputation for being easy to setup and administer, so Verity kept the product alive and even aggressively marketed it. Recent updates had even started merging some of the best features of the formerly disparate Ultraseek and K2 product lines.

  • Now sold to Autonomy in 2005:

    When Autonomy acquired Verity in 2005, Ultraseek was also part of the deal. Autonomy has made no secret of their respect for the Ultraseek product line and has committed to actively maintain and enhance the product.

We love Ultraseek too! We have worked with it since 1997, under all three of its previous brands. We still encounter Ultraseek on various web sites and at client sites, it still has an active user base, and we expect it to be around for a long time to come. We do suspect that, for really high end customers and very large data sets, Autonomy will tend to push its own homegrown technology.

A Detailed Look at the Players

Tactical Challenges for the Europeans

We do worry a bit about some of the ground level issues FAST and Autonomy face in the US market. Problems we hope they can avoid:

  • Visibility in the US: Both companies now have major offices in the US.

  • Less responsive: Do the US offices need to check in with the European Mothership before taking action? Will they be given the "autonomy" they need to function effectively in the US market?

  • Micromanagement of budget and staffing in the US

  • Different market: Do US prospects ask different questions than Europeans? For example, do US firms ask different questions about Quality of Service?

  • Product complexity: With the possible exception of Ultraseek (now part of Autonomy), all these products take a while to fully comprehend and implement. Many of the US "Big 4" offerings have similar challenges of course.

Below we list some specific thoughts about each of the players. First up, the Europeans.

FAST Growing Pains

Visibility and Access in the US

To their credit, FAST maintains a rather large presence in Massachusetts. There's plenty of corporate brain power resident there, so we believe they will be responsive. This location puts them in the strategic US Eastern time zone, and just a short hop from New York and New Jersey. And of course they have other offices in major US markets. FAST does need to expand their US training program locations. Trudging to Boston for training this time of year certainly doesn't excite anybody. San Diego, San Francisco or Miami might get students' attention.

Complexity of Technology

FAST is an extremely powerful technology, but with all of their recently added US customers, they are straining a bit in the US.

As much as we love FAST, it's not a simple "drop in" solution – this is quite different from the idea of the Google Box. However, trust us, if your application needs FAST, then the Google Box wouldn't have worked anyway.

Like any high end software, FAST systems need highly trained workers to develop and deploy solutions. Getting all these new customer projects implemented will take lots of hands. In the grand scheme of things, fast growth is probably one of the more pleasant business challenges to have.

FAST is aggressively pursuing implementation partners; we certainly laud this approach. FAST also wisely offers high end search hosting, as an option for customers who need to ramp up more quickly.

Action Items:

  • If you haven't seen FAST in action yet, you might sit in on a demo if you get the chance.

  • If your content is headed north of 10 million documents, this should be one of the considerations on a fairly short shopping list.

  • FAST also offers an OEM version called InStream; it may fit some of your projects better.

Autonomy Growing Pains / Verity Customer Retention

Visibility and Access

Autonomy's acquisition of Verity includes Verity's posh corporate headquarters in Sunnyvale, California, right in the heart of Silicon Valley. We would certainly expect them to heavily leverage that location, along with their other field operations.

Juggling THREE Separate Product Lines

Autonomy is now the proud owner of three completely separate code bases, written by completely separate groups of engineers, and three somewhat disjoint sets of users:

  1. Their own Autonomy product line
  2. K2 - developed by Verity
  3. Ultraseek - originally developed at Infoseek

This is not a trivial situation to be in. The good news is that Verity has been maintaining and even enhancing Ultraseek for several years, and successfully retained some of the original engineers. They had even started merging some parts of the Ultraseek code base with their own K2 product line. So from a technical standpoint, the folks in California should be able to continue handling K2 and Ultraseek. And of course the fine folks in the UK have the namesake Autonomy code.

Support and professional services is probably the biggest challenge. We'd suggest a rather aggressive cross training program, vs. trying to compartmentalize and route calls, etc. A senior technical person who knows one search engine can certainly learn to support a couple more – we are living proof of this! Cross training provides a lot more flexibility in staffing.

Having call center staff in widely different time zones could also be worked into an advantage. Verity has had remote frontline support for years, so this will be nothing new to them. And both groups are used to supporting customers on multiple continents, so there's a lot of experience there. Autonomy will also get to pick and choose the best of the offices from each of the major markets both companies had offices in. This could work out really well.

Verity Customer Retention

With Autonomy's acquisition of Verity, many customers have asked us what we think. We don't think they are "nervous" yet, but certainly "intensely curious".

Autonomy Ultraseek Customer Retention

Autonomy has gone out of their way to say how wonderful and strategic the Ultraseek product is, and how it will form a cornerstone of Autonomy's lower end offering going forward.

This sounds really good, if they stick to it. Ultraseek has a very long and proud customer following and has survived its previous two transfers (from Infoseek to Inktomi, then from Inktomi to Verity). We really do hope Autonomy is sincere.

Verity K2 Customer Angst / Customer Retention

But what about K2? What's in its future? Autonomy has sent mixed messages on the product's lifecycle. On the one hand, they've said they will continue to maintain K2, but on the other hand their CEO has said something to the effect of "well, the integration of K2 indices into Autonomy's product is already complete".

This sounds a bit more tenuous, though not immediately alarming.

For companies that have recently renewed their K2 license, we say "relax". Sure by 2010, in 4 or 5 years, they should have migrated off of K2, but their recent Verity license renewal was probably only for 2 years anyway. In fact, Verity put quite a bit of effort into their new K2 v6.x, which continues to evolve their administration interface and feature set. Suffice to say that, for LONG TERM planning, Verity's K2 product should be absent; short term you're fine.

Although this uncertainty may be a bit unsettling for devout K2 fans, longer term these customers may feel right at home with the Autonomy product line. Like K2, it has many high end features. In fact these features may wind up being a bit easier to implement in Autonomy; some of K2's high end features required quite a bit of scripting and administration to access.

To Autonomy we say "Offer a concrete roadmap for K2 end-of-life, or 'transition' if you must mince words." K2 customers are not stupid and they are NOT buying the current story(ies!). These are sophisticated accounts with money to spend, and they have no brand loyalty to you specifically. Meanwhile, FAST is coming on strong.

Rising Above K2?

Autonomy does have some other sexy, and possibly even useful, features to assist overloaded knowledge workers; sorry we can't give you more details at this time.

One forward looking area that we can talk about is handling non-textual content types such as audio and video. This isn't a must have item for many enterprise applications yet, but it could be in the next 5 years. Sure, audio and video editing has certainly become dramatically cheaper, for companies that need it. But more importantly, the mainstream portal sites like Google and Yahoo are adding video search to their public web offerings now. By itself, that doesn't justify a business case for implementing that in the Intranet; but we do believe that widely available public portal features do seed the minds of business executives, who may then see a fit within their own company.

Vertical search applications and media centric companies will see an immediate benefit.

In from the Cold

And then there are the stragglers… the folks still running Verity's venerable Search 97 and other legacy products. We understand, we see this all the time; we realize that upgrading technology is tough in some environments.

More importantly, these stragglers are certainly aware of K2, and probably felt a bit bad about not having upgraded yet. To all those folks we say "Rejoice! Now you don't have to!" But these groups do need to plot a new course forward, possibly with Autonomy. Properly positioned, Autonomy might even turn this situation to its benefit.

Action Items:

  • If you're currently using K2 don't panic. But do start thinking about your plans for 2008, 2010, etc.

  • If you're using older Verity products, you might want to sit through some demos of Autonomy and FAST – they have some new features that you might not have thought of before.

The US Big 4 Respond

The US "Big 4" certainly won't surrender this year! Google, Microsoft, Oracle and IBM all want to be in the Enterprise Search space. All of them can claim they are already there, to some degree, and some of their marketing departments may be convinced that they've already won!

They will all come out swinging hard. We expect their biggest push will be driven by marketing efforts, vs. technology; brand name and available cash can certainly buy mindshare. Google has a marketing head start in terms of search engine brand name awareness and perceived credibility.

We wonder if the marketing efforts of the Big 4 will attempt to play the "fear" card, warning US companies of entrusting their corporate resources to those "Europeans"; this would be a mistake if they try it, it's just not credible. Those in the know realize that half the products the Europeans are selling originated in America, and ironically at least some of the search technology the US Big 4 are touting was developed in Europe. And Europeans do not have a reputation for being careless or reckless, in fact quite the opposite is true.

Another marketing strategy they could try is to simply "dissolve" this category of software. If a search engine is already embedded in the software you use, they will argue, then why would you buy something else at extra cost? Microsoft and Oracle are in the best position to make that claim.

This might work for "generic" search applications, but every client we've worked with has at least one very non-generic business problem to solve, or they have a need to integrate with very specific existing applications. Data lives in many different systems from multiple vendors - we doubt customers will believe that Oracle, for example, can somehow magically search enable all those different repositories.

And embedded search has also failed at many companies. Embedded search has already been included in other enterprise software and hasn't always satisfied users; we sometimes find ourselves search-enabling repositories that already have a search engine built in, because the embedded search engine had intractable problems. Any Big 4 pitch touting embedded search to these folks will have to add the caveat "and this time it's really gonna work!" – sounds like a tough sell.

Google

"But we're GOOGLE for goodness sake! If we can search 100 Billion documents on the Internet, we can sure as heck handle your enterprise. And EVERYBODY LOVES GOOGLE!"

With a brand name now more recognized than Coca-Cola, it seems impossible that Google would have any problems selling enterprise search, and yet here we are. Although we can't name names, virtually every company we've talked to who has used the Google Search Appliance (AKA "The Google Box") has mothballed it. How can this be!?

Google's efforts will be helped by the fact that everyone's boss, and their boss's boss, and virtually every other VP and CEO in the land has heard of Google. At some point every one of them will have the "insight" to ask "Hey, why don't we just buy a Google Box?" Are you prepared to answer that inevitable question?

Google's Challenges

Here are some of the problems Google is facing:
  • The "secret sauce" of Google is their PageRank algorithm, looking at how many other web sites cross link to specific pages – the assumption being "the more links to a page, the more important it must be". Enterprise Intranets do not have this type of organic linkage – top level content is linked in an established hierarchy, based more on "org charts" than on the usefulness of the content.

  • Of course users still pine for Google. When clients tell us that "search sucks", the more detailed analysis inevitably winds up with users asking the IT staff "why can't our XYZ search engine be more like Google?" Ironically it seems that even Google can't meet the expectations of corporate users who've given them the chance. Inside private networks the lack of Google's "secret sauce" leaves users with just another cheeseburger with pickles.

  • Google also has a habit they will need to reevaluate: Google's public offerings are often in a state of perpetual Beta. This is fine for free services where individuals can take it or leave it, but doesn't instill any confidence in enterprise customers. Our clients ask more about stability, performance and cost, with less emphasis on "innovation", though customers do like to see some type of forward looking "vision" from each one of their vendors.

  • And unlike Verity and FAST, Google's enterprise offering is a "black box" of sorts (actually, yellow or blue, depending on which model you buy). Verity and FAST have always allowed, and often required, developers to tightly integrate their own code and customize their search engines. Google does not allow for this level of control.

  • And even from their own public product specs, their "search in a box" solutions don't scale up to some of the private search application projects we're seeing. On paper, the Google v2 offering certainly addresses some of its earlier shortcomings.

  • The Google Box also raises security concerns for IT managers and corporate security officers. Google's requirement that their box be hooked to a phone line is an all out deal breaker in some environments.
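
To make the first point above concrete, here is a minimal sketch of link-based ranking. It is a textbook power-iteration PageRank written in Python 3, not Google's actual implementation, and both link graphs (`web_like` and `intranet`), the page names and the `pagerank` function itself are hypothetical examples of ours. In the web-style graph, the page that many others link to rises to the top. In the org-chart-style graph, rank is decided purely by position in the hierarchy, so two leaf pages get identical scores no matter which one users actually find useful.

```python
# Illustrative sketch only: a textbook power-iteration PageRank, not Google's
# production algorithm. Both link graphs below are made up for this example.

def pagerank(links, damping=0.85, iterations=50):
    """Return {page: rank} for a {page: [outbound links]} dictionary."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# "Organic" web-style linkage: several independent pages point at one page.
web_like = {"a": ["popular"], "b": ["popular"], "c": ["popular"], "popular": ["a"]}

# Org-chart-style intranet: links only run down the hierarchy.
intranet = {"home": ["hr", "eng"], "hr": ["policy"], "eng": ["spec"]}

web_ranks = pagerank(web_like)
intranet_ranks = pagerank(intranet)

print("Top web page:", max(web_ranks, key=web_ranks.get))
print("Intranet leaf scores:", intranet_ranks["policy"], intranet_ranks["spec"])
```

The takeaway is about the input, not the algorithm: a ranking scheme that relies on who links to whom can only be as informative as the link structure it is given.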

Another factor in Google's expansion from public portal and advertising mega-star status to Enterprise Sales is their lack of enterprise sales expertise. Questions have been asked about whether their technical and marketing prowess can directly translate into enterprise product sales success.

Google took some steps forward in 2005

To their credit, Google has already responded to some initial technical limitations with their "v2" Google Box.

Google has also come up with a new marketing campaign to court the IT community, offering "white papers" about search technology. We think this was a good move, as they are selling to a very well educated community.

Action Items:

  • Your CEO will eventually ask about the Google Box, so you should spend some time understanding its capabilities.

  • If you're considering Google for your enterprise, ask about page ranking on private network content.
  • If you've got millions of documents this box may not be for you.
  • Ask about customization and integration
  • Ask about security

Microsoft

"But we're Microsoft for goodness sake! We make Back Office, which holds the data you want to index, and the OS that it runs on, so surely we can handle searching your data."

Microsoft has wanted to break into search for a long time, because they see the revenue that Yahoo, and now Google, are making on it. But once again, that revenue comes from ads in the public search space, not from enterprise search software sales; so this has more to do with the search on MSN, not with enterprise search.

Yes, Microsoft sells lots of software to enterprises, which accounts for a huge chunk of their revenue, and yes, having search integrated with that software certainly makes sense. Microsoft WILL continue to make some traction into those environments with their search. Microsoft SharePoint customers, for example, already use Microsoft's search engine. And Microsoft's web server has built in search capabilities. And look for them to continue issuing press releases about new search initiatives.

Their upcoming Vista operating system (AKA "Longhorn"), will even have search built into the core OS. This will certainly help their penetration into generic enterprise search. But most of the high end enterprise applications we work on are NOT generic – high end applications have very specific requirements and are HIGHLY customized – we don't expect Vista to address that segment any time soon.

Action Items:

  • Do consider Microsoft search if you're already a Microsoft shop

  • Think about how "generic" your search requirements are

Oracle

"But we're Oracle for goodness sake! Your data is already IN our database, so of course we can search it. Besides, our new 64 bit version is gonna rock!"

Sadly for Oracle, a substantial fraction of the data that enterprises need to search is not in their database. "Not YET," they would respond. And whether or not their 64 bit version addresses scalability concerns remains to be seen. Oracle's mistake was not buying Verity while they had the chance – what were they thinking!?

As with our previous 2 contestants, we say "follow the money". Oracle makes their primary dollars on database software and related products; fulltext search is certainly a related product category, but not their core technology. You don't see people running out to buy Oracle to solve a search problem. The companies who are using Oracle Text already had their data in Oracle; that's how it works, not the other way around. Search is not their primary revenue stream. Besides, their recent PeopleSoft acquisition promises to keep them adequately distracted.

Action Items:

  • If you're already an Oracle shop, maybe take Oracle's fulltext product for a spin.

  • Consider your scalability requirements.

IBM / Omnifind

"Hey, we're IBM. And would you like some Blade Servers with that?"

We've heard some good things about Omnifind, although we're not sure how much time it takes to actually deploy.

IBM certainly has some quality software out there. We salute them for WebSphere, and their contributions to Eclipse, Linux and other open source projects.

But since Omnifind is actually just part of their strategy to sell more hardware, it means this is not a "make or break" business for IBM either, so again we think it won't compete as effectively. Reiterating, we do hear good things about Omnifind, though in 2006 we'd actually like to play with it first hand.

Action Items:

  • If you're an IBM Blade Server shop, maybe take Omnifind for a spin

More Consolidation in Enterprise Search?

Consolidation is the sign of a maturing industry, and is often a net positive for customers. Like the database companies of yesteryear, we expect the enterprise search engine consolidation to continue. This market trend may be a bit murkier to follow, because overall the phrase "search engine" is much broader and includes the web portals and vertical search sites, and this is not the enterprise search market we're talking about.

The Big 4 US firms certainly have the capital to acquire anyone they desire. And Verity (now part of Autonomy) certainly has a track record of acquiring other software companies. As we mentioned earlier, there are plenty of other companies in this space; it's certainly a buyers' market.

US Market Listings

Our staff financial guru believes FAST and Autonomy would benefit, from a visibility standpoint, by being cross listed on one of the US exchanges. We've heard nothing about this actually happening of course.

Mondosoft Acquired?

We suspect Mondosoft, another European enterprise search vendor, will eventually be acquired, possibly under favorable terms. Rumor has it this would suit them fine. FAST or Autonomy would make a logical home, but if they take too long somebody else will scoop Mondo up. We're not so sure US companies would find Mondo a good fit.

What About Asia?

Vocabulary: "CJK" (Chinese, Japanese and Korean) refers to information systems that handle these and related Asian languages.

With all this focus on the US and Europe, what about the East? Sure, the events of 9/11 put a temporary focus on Arabic. But Mainland China is coming online, in a HUGE way, and it is becoming the driving factor in expanded CJK support; the main focus is to support Simplified Chinese. China's economy will eventually be bigger than the US's, at least in some sectors, and their citizens often do not speak English.

The expectation of search functionality and localization has been steadily rising. This is not 1995, and just supporting the common CJK encodings and "Unicode" is not the same as having a complete CJK Asian product offering. And for companies doing business in Mainland China, there are additional filtering requirements, and potential criminal penalties for getting it wrong.

India's economy is also coming online, with its own native Hindi speaking population. They also need some attention.

In a future issue we will detail what real support for Asian languages should mean.

Good News for Basic Chinese Search

Other Asian economies are already "online" and have been deploying search software for quite some time. This means that at least basic CJK support is widely available.

If you work in a multinational or medium to large company and you don't handle Chinese content yet, you probably will soon. "Chinese" in this context typically means "simplified Chinese" (vs. "traditional Chinese"). The good news is that all major search engine and database vendors now support Unicode, and can convert data from other encodings.

Encoding Issues

As long as your original content is correctly and consistently encoded, the software you'll use in 2006 will probably handle it. Windows, Unix, Java and Python all handle Unicode as well, although they vary in the specific details of how it is encoded. Contrary to what some casual observers might think, there are different "dialects" of Unicode, in terms of how the actual bytes are stored and transmitted.
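
As a small illustration of these "dialects", the sketch below encodes the same two-character string ("Chinese", written in Chinese) in several standard Unicode encoding forms and shows the resulting bytes. It assumes Python 3 and uses only the built-in `str.encode` method; the variable names and the `hex_bytes` helper are ours.

```python
# Same text, different bytes: one string in several Unicode encoding forms.
# Assumes Python 3, where str is a sequence of Unicode code points.

text = "\u4e2d\u6587"  # U+4E2D U+6587, the word for "Chinese (language)"

def hex_bytes(data):
    """Format a bytes object as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in data)

encoded = {
    name: hex_bytes(text.encode(name))
    for name in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-be")
}

for name, rendered in encoded.items():
    print(f"{name:10} {rendered}")

# utf-8      e4 b8 ad e6 96 87
# utf-16-le  2d 4e 87 65
# utf-16-be  4e 2d 65 87
# utf-32-be  00 00 4e 2d 00 00 65 87
```

All four byte sequences are "Unicode", yet none of them can be read correctly by a program that expects one of the others, which is why the declared encoding has to match the bytes.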

It often comes as a surprise to casual observers that Unicode has expanded beyond its original 16 bit, 65,536 character grid. Most Unicode applications still fit into the original 16 bit bounds (called the BMP, or "Basic Multilingual Plane"), but a few applications need the later, expanded character set.
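
Here is a quick sketch of what "beyond 16 bits" means in practice, assuming Python 3. It compares a common Chinese character inside the BMP with U+20000, a rarer CJK ideograph outside it. In UTF-16 the second character no longer fits in a single 16 bit unit and is stored as a pair of units (a "surrogate pair"). The helper name `utf16_code_units` is ours.

```python
# A character inside the 16 bit range versus one outside it.
# Assumes Python 3; utf16_code_units is a helper written for this example.

bmp_char = "\u4e2d"                 # U+4E2D, a common character inside the BMP
supplementary_char = "\U00020000"   # U+20000, a CJK ideograph outside the BMP

def utf16_code_units(character):
    """Count the 16 bit units needed to store one character in UTF-16."""
    return len(character.encode("utf-16-be")) // 2

print(hex(ord(bmp_char)), utf16_code_units(bmp_char))
# 0x4e2d 1
print(hex(ord(supplementary_char)), utf16_code_units(supplementary_char))
# 0x20000 2
print(supplementary_char.encode("utf-16-be").hex())
# d840dc00
```

Software that assumes "one character equals one 16 bit unit" will miscount, split or truncate such characters, which is one reason to ask vendors specifically about them.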

We've also found that companies are often surprised by how inconsistently encoded (and often incorrectly) their multilingual content is. The litmus test of "well, Internet Explorer can display it" is not even close to sufficient; IE is very forgiving, other programs are often not.
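
A stricter litmus test is easy to sketch. The example below, which assumes Python 3 and its bundled `gb2312` and `latin-1` codecs, checks whether a document's bytes really decode under the encoding it claims. It also shows why "it displays" proves little: a forgiving decoder such as Latin-1 accepts any byte sequence and simply produces the wrong characters. The `decodes_as` helper and the sample bytes are ours, for illustration only.

```python
# Check whether bytes are valid in a claimed encoding.
# Assumes Python 3; decodes_as is a helper written for this example.

def decodes_as(data, encoding):
    """Return True if the bytes decode cleanly under the given encoding."""
    try:
        data.decode(encoding)  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False

original = "\u4e2d\u6587"              # "Chinese", written in Chinese
gb_bytes = original.encode("gb2312")   # a legacy Simplified Chinese encoding

print(decodes_as(gb_bytes, "gb2312"))   # True: the label matches the bytes
print(decodes_as(gb_bytes, "utf-8"))    # False: mislabeled content is caught
print(decodes_as(gb_bytes, "latin-1"))  # True: a forgiving decoder never complains

garbled = gb_bytes.decode("latin-1")
print(garbled == original)              # False: it "decoded", but into the wrong text
```

Running a strict check like this over a sample of your content before indexing is one inexpensive way to find out how consistent your encodings really are.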

Action Items:

  • If your application's requirements go beyond the usual 16 bit Unicode BMP (Basic Multilingual Plane) then you'll need to ask vendors much more specific questions.

  • If you've never even heard of BMP's or UTF's and are planning a multilingual application, you have a bit of homework to do.

  • When you ask vendors about Asian language support and they reply with a quick "no problem", or "sure, we support Unicode", ask some follow up questions! There's more to multilingual support than just supporting Unicode.

  • Hire or partner with some native language speakers. Do NOT trust your Chinese, Hindi and Arabic development and QA solely to English speaking staff (or speakers of other European languages). The problem is that badly mangled and completely mis-encoded Asian text still looks like "Chinese" to your average American engineer, let alone distinguishing Simplified vs. Traditional Chinese, or sorting out Korean from Japanese content. Westerners can rarely tell the difference!

    I ask clients to think about how Westerners make fun of badly translated instruction manuals that come with imported products – do you want your foreign customers to have the same impression of your company!?

Conclusion

2006 will certainly be an interesting year. The traditional ENTERPRISE search powerhouses are now in Europe, but the US boasts the software industry's largest behemoths. Let the transatlantic battle for mind and market share begin!

We say "follow the money". Aside from Google, the other US players are not making their primary revenue from search. And Google's "search" revenue is actually advertisement dollars from their public portal, not software licenses. US firms have the marketing edge, and will push hard, but if the Europeans can continue to execute, they will remain very competitive.

And finally, amid all this US / EU attention, Asian markets should not be taken for granted by either side. They have money to spend, lots of brainpower, and rising expectations about what is possible. Hindi and Arabic content also needs quality search solutions.

If you disagree, please write us! We'd love to include your viewpoints and letters to the editor. For example, if you're a happy Google Box user, please let us know. Or if your company has had Autonomy deployed in large projects, we'd really like to hear about it.