
HP, Autonomy and Meaning-Based M&A

At Search Technologies, a good number of our staff have been working in the enterprise search business since before the foundation of Autonomy in 1996. 

Some of us occasionally show our age by looking back at the second half of the 1990s as something of a golden age for enterprise search. In truth, that period is probably no more remarkable than any other in the industry's history. 

Pertinent to the subject of HP and Autonomy, the late 1990s can be remembered as the time when enterprise search came of age. Large organizations were, for the first time, indexing multiple, heterogeneous data sets into a single search experience, with full document-level security, using search engines such as Verity, Fulcrum, RetrievalWare and an early version of Autonomy IDOL.

THE RELEVANCY WARS BEGIN
This greatly increased the number of documents being indexed, and it brought search to new audiences. In earlier times, search was something that librarians and information scientists did using Boolean syntax. But all of those "knowledge workers" newly exposed to search just wanted to type a few words into the box.

The combination of large document collections and simplistic search queries made relevancy ranking a key battleground. In other words, it is great to be able to search, but finding thousands of results is not helpful unless they are presented in the right order.

SO WHOSE SIDE ARE YOU ON?
At a very high level, vendor approaches to the issue of relevancy ranking could be split into two camps:

  • The semantic folks included Verity, with their "topic trees," and RetrievalWare, which used "semantic networks" to enhance the search experience
  • The statistical folks were led by Autonomy, who are not the only firm to have promoted the statistical approach over the years but are, by a country mile, the most successful

So what's the difference?

Fresh as we are from witnessing the US presidential election, in which negative messaging played a leading role, let's illustrate a couple of differences from such a perspective.

 

The semantic approach automatically enhanced the user's query with synonyms and other related terms (increasing recall), and then sought to control this by also involving semantic relationships in the relevancy calculations - a measure designed to balance the increased recall with better precision.

A former Excalibur Technologies (owners of RetrievalWare at the time) sales person among us remembers learning that the search logs on Excalibur's Web site recorded occasional searches for the word toilet. Despite there being no information on the site about bathroom furniture, this search nevertheless retrieved results. These search requests were rumoured to originate from UK-based Autonomy sales people who were demonstrating the result to their potential customers. In American slang, the word "john" can be considered a synonym of "toilet." At the time, there were one or more Excalibur executives answering to that name who were documented on the Web site. An overly detailed semantic network did the rest.
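
The mechanism behind this story can be sketched in a few lines of Python. This is a toy illustration only, not RetrievalWare's semantic network: the SYNONYMS table, the two documents, and the executive name "John Smith" are all invented for the example.

```python
# Hypothetical synonym table standing in for a "semantic network".
SYNONYMS = {
    "toilet": {"lavatory", "john"},
    "car": {"automobile"},
}

# Hypothetical site content: nothing here is about bathroom furniture.
DOCUMENTS = {
    "exec-bio": "John Smith is the chief executive of the company",
    "product": "The search engine indexes documents with full security",
}


def expand_query(term):
    """Return the query term plus every synonym listed for it."""
    term = term.lower()
    return {term} | SYNONYMS.get(term, set())


def search(term, expand=True):
    """Return the sorted ids of documents containing any query term."""
    terms = expand_query(term) if expand else {term.lower()}
    hits = []
    for doc_id, text in DOCUMENTS.items():
        words = set(text.lower().split())
        if terms & words:  # at least one query term appears in the document
            hits.append(doc_id)
    return sorted(hits)


print(search("toilet", expand=False))  # literal match only
print(search("toilet"))                # with synonym expansion
```

Without expansion, the query finds nothing. With expansion, the executive's biography is returned: recall has gone up, but the precision control that was meant to balance it is missing from this sketch, which is exactly the gap the anecdote exposes.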

Autonomy 1 : Semantic Guys 0

 

The statistical approach used by Autonomy made no attempt to understand word meaning per se. Instead, it simply modeled statistical relationships between terms. To this day, the frequent use of "Meaning-based..." sits uncomfortably with many search industry insiders, given the fundamentally statistical basis of Autonomy's approach. Statistical methods were promoted as a means of improving relevancy without the need to maintain supporting semantic resources, which can be hard work. In other words, this was a more automatic, out-of-the-box solution.

A veteran of the time who is currently in our employment remembers winning a victory over Autonomy back then, while working for one of the semantic folks. The test data set used by a potential customer in a relevancy bake-off was a collection of recent news items. Unfortunately for Autonomy, the news data set covered the period of the 1996 Dunblane school massacre, which was a huge story in the UK at the time, and this skewed the statistics. The customer is reported to have found that searches for school returned documents that were largely about murders, because of the strong statistical relationship between those words within the test data set.
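
The skew is easy to reproduce with a simple co-occurrence count. The Python sketch below is our own simplification, not Autonomy's algorithm: it just counts how often two words appear in the same document, and the five-line NEWS corpus is invented to mimic a data set dominated by one story.

```python
from collections import Counter
from itertools import combinations

# Invented toy corpus in which one news story dominates.
NEWS = [
    "school shooting murder victims",
    "murder inquiry after school shooting",
    "school murder shocks town",
    "school exam results improve",
    "budget vote in parliament",
]


def cooccurrence(docs):
    """Count how many documents each unordered pair of words shares."""
    counts = Counter()
    for text in docs:
        words = sorted(set(text.split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


def related_terms(term, docs, top=1):
    """Return the words that most often share a document with `term`."""
    scores = Counter()
    for (a, b), n in cooccurrence(docs).items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return [word for word, _ in scores.most_common(top)]


print(related_terms("school", NEWS))
```

In this corpus the word most strongly associated with "school" is "murder", so any ranking that leans on that association will favour crime stories for a plain school query. The statistics are correct for the data; the data is simply unrepresentative.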

 

Autonomy 1 : Semantic Guys 1

A score draw is a fair result. Neither approach is perfect for all applications, but both have merit.

HORSES FOR COURSES
Both approaches still have merit today. Indeed, the two overlap considerably in many contemporary search applications. Some systems work best when they are vocabulary-centric but supported statistically. Others excel with a statistical core supported by semantic enhancements, such as stemming rules and synonyms that normalize concepts before the stats engine springs into action; for example, HP = "Hewlett Packard" and SFO = "Serious Fraud Office".
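
As a rough sketch of that normalization step, the Python below rewrites known acronyms to a canonical phrase before any statistics are computed. The two mappings come from the examples above; the CONCEPTS table format, the tokenizer, and the normalize function are our own assumptions, and stemming is left out to keep the example short.

```python
import re

# Concept mappings taken from the two examples in the text.
# The table format itself is an assumption made for this sketch.
CONCEPTS = {
    "hp": "hewlett packard",
    "sfo": "serious fraud office",
}


def normalize(text):
    """Lowercase, tokenize, and replace known acronyms with their phrase."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    normalized = []
    for token in tokens:
        # Unknown tokens pass through unchanged.
        normalized.extend(CONCEPTS.get(token, token).split())
    return normalized


print(normalize("SFO investigates HP"))
```

After this step, a document that says "HP" and one that says "Hewlett Packard" contribute to the same term statistics. Like any synonym table, a mapping such as this is only safe where the acronym is unambiguous in the collection being searched.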


PHILOSOPHICAL DIFFERENCES
In the bigger picture, the philosophical approach of Autonomy has been differentiated from that of the semantic guys in an important way. Autonomy has pushed hardest at the "out-of-the-box" and "let the black-box technology take care of it" approaches to the sales and marketing of enterprise search. Autonomy are by no means alone in this respect. Overselling of off-the-shelf capabilities has been a constant theme of the enterprise search industry since its inception. Search Technologies, a services company focused on helping customers to make better use of search engines, was largely founded to address the expectation gap between the promises and the realities of enterprise search software implementation. 

Still today, out-of-the-box is a message that potential customers find extremely attractive. They want to believe in it.

The key point is this: Autonomy were so much better at it than any other company. To some extent, their success must be attributed to superb marketing around the "meaning-based" theme, backed by an aggressive and highly professional sales operation.

SO WHERE DOES THAT LEAVE US?
Autonomy IDOL remains an extremely capable and widely used search engine. It is our collective experience that some customers bought too heavily into the meaning-based vision, and probably paid too much for it. We work with such customers to help them make the most of their investment - and much can be made of it. IDOL is a very detailed and capable search engine. It has all of the bases covered, which is not something that can easily be said about other current market leaders.

We are also seeing occasional requests for migration strategy planning, especially from companies who have CXO-sponsored SharePoint initiatives underway. But this is by no means a flood.

We doubt that there are many enterprise search industry veterans who are surprised about HP's current troubles. It seemed clear at the time that HP were paying a little too much for Autonomy. Was the reason for this something to do with bookkeeping? We have no idea.

MEANING-BASED M&A
Perhaps, like many an Autonomy customer before them, HP bought into the Meaning-based vision big-time and, in so doing, acquired a solid and highly detailed search platform, capable of addressing pretty much anything that involves searching.

But at an inflated price.


Footnote
This article is written from an enterprise search perspective, because that's what we do. The company bought by HP in 2011 was much more than that, having built a range of vertical solutions businesses, many by acquisition, during the 2005-2010 period.
