Enterprise Content Browsing
Browsing for content is innate, so leverage it
How do you store files on your hard drive?
Humans were storing information in hierarchical structures long before the advent of the personal computer. If you are young, you may never have used a filing cabinet, a physical manifestation of a two- or three-level storage hierarchy. Yet you probably still arrange your computer files within some kind of hierarchy.
Although most business folks do use search over their personal content store, the default behavior is to browse.
VALUE IN HIERARCHIES
Most people build their own hierarchy for personal documents, while departmental or functional teams cooperate to build shared hierarchies for their projects.
Folder names are generally meaningful, and the complete file path to a document can tell you much about its content, background and purpose. For example, even if balancesheet.xls only contains numbers, its physical location could be informative.
The browsing journey also provides insight into related documents and other information available within the same folder, or perhaps in related folders.
THE IMPORTANCE OF METADATA
Almost all advanced search features rely on metadata. For example:
- Search navigators, a cornerstone for most sophisticated search apps, are totally dependent on metadata
- Results sorting by date, price, location or rating is impossible without fielded metadata
- Infographic-style results displays similarly rely on metadata to drive informative analysis and insight. Think hotels on a city map, or a trend graph of sentiment about a brand
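To make the dependency concrete, here is a minimal sketch of a navigator and a date sort over a handful of search results. The result records, field names (`department`, `modified`) and helper functions are hypothetical and chosen only for illustration; a real search engine would compute these server-side.

```python
from collections import Counter
from datetime import date

# Hypothetical search results, each carrying fielded metadata.
results = [
    {"title": "Budget review", "department": "finance", "modified": date(2012, 3, 1)},
    {"title": "Hiring plan", "department": "hr", "modified": date(2012, 5, 20)},
    {"title": "Balance sheet", "department": "finance", "modified": date(2012, 4, 11)},
]

def navigator(results, field):
    """Count results per value of a metadata field, most common first."""
    return Counter(r[field] for r in results if field in r).most_common()

def sort_by(results, field, descending=True):
    """Order results by a metadata field; impossible if the field is absent."""
    return sorted(results, key=lambda r: r[field], reverse=descending)

department_counts = navigator(results, "department")
newest_first = [r["title"] for r in sort_by(results, "modified")]
```

Both helpers read nothing but the metadata fields, so a document with no `department` or `modified` value cannot appear in the navigator or be placed in the sorted list.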
Yet interest in metadata appears to be declining, at least as measured by Google Trends: http://www.google.com/trends/explore#q=metadata
The sad truth is that most humans are somewhat self-interested. They are generally disinclined to volunteer metadata unless it serves their own purposes in some way. After all, metadata means extra work, and we are all busy.
THE METADATA DILEMMA
So, on one side, humans are disinclined to create metadata. On the other, we need it to drive findability.
Most office documents have little or no internal metadata. The solution lies in getting two things right:
- Generating new metadata by automated methods, using entity recognition, categorization and related techniques chosen to suit the data set
- Capturing whatever useful metadata does exist in the source documents
In many repositories, the most plentiful existing metadata is encoded in file names and file paths.
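As a rough illustration of capturing that metadata, the sketch below splits a path into folder names, a file extension and searchable word tokens. The function name, the output field names and the example path are assumptions made for this example; only the file name balancesheet.xls comes from the text above.

```python
import re
from pathlib import PurePosixPath

def path_metadata(path: str) -> dict:
    """Derive simple metadata fields from a file path (illustrative only)."""
    p = PurePosixPath(path)
    # Every path component except the root marker and the file name itself.
    folders = [part for part in p.parts[:-1] if part != "/"]
    # Split folder names and the file stem into lowercase word tokens.
    tokens = []
    for name in folders + [p.stem]:
        tokens.extend(t.lower() for t in re.split(r"[^A-Za-z0-9]+", name) if t)
    return {
        "folders": folders,
        "filename": p.name,
        "extension": p.suffix.lstrip(".").lower(),
        "path_tokens": tokens,
    }

# Hypothetical location for the spreadsheet mentioned earlier.
meta = path_metadata("/finance/2012/Q3/board-pack/balancesheet.xls")
```

Fields like `folders` can then feed a navigator, while `path_tokens` lets a query such as "finance Q3" match a spreadsheet that contains nothing but numbers.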
ENTERPRISE CONTENT BROWSING
Individuals already browse their personal hierarchies. With the right approach to metadata capture, we can leverage all of the existing storage hierarchies in the same way, by making them an integral part of the search experience.
In other words, piece together an enterprise browsing hierarchy, but with full document-level security, and with the ability to switch easily between search and browse modes of information discovery:
- Start with a search, and having found an interesting document, look for related information by viewing other documents within the same folder, or browse to nearby directories
- Or start by browsing to a specific folder, and then deploy a search over all of the information in that folder or below
- Mix and match. In our experience, being able to switch strategies interactively makes for happier search users and higher business productivity
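The switching described above can be sketched with a toy in-memory index. Everything here is a simplifying assumption: the `INDEX` contents, the group-based security model and the substring matching stand in for whatever the chosen search engine actually provides.

```python
from pathlib import PurePosixPath

# Hypothetical index: path -> (document text, groups allowed to read it).
INDEX = {
    "/finance/2012/balancesheet.xls": ("assets liabilities equity", {"finance"}),
    "/finance/2012/notes.doc": ("commentary on the balance sheet", {"finance"}),
    "/finance/2011/balancesheet.xls": ("assets liabilities equity", {"finance", "audit"}),
    "/hr/policies/leave.doc": ("annual leave policy", {"finance", "hr", "audit"}),
}

def visible(path, user_groups):
    """Document-level security: the user must share a group with the document."""
    return bool(INDEX[path][1] & user_groups)

def search(query, user_groups, under="/"):
    """Readable documents below `under` whose text contains every query term."""
    root = PurePosixPath(under)
    terms = query.lower().split()
    return sorted(
        path for path, (text, _) in INDEX.items()
        if visible(path, user_groups)
        and root in PurePosixPath(path).parents
        and all(term in text.lower() for term in terms)
    )

def siblings(path, user_groups):
    """Other readable documents in the same folder as `path`."""
    folder = PurePosixPath(path).parent
    return sorted(
        other for other in INDEX
        if other != path
        and PurePosixPath(other).parent == folder
        and visible(other, user_groups)
    )

# Search first, then browse around a hit; or browse first, then search in place.
hits = search("assets", {"audit"})
related = siblings("/finance/2012/balancesheet.xls", {"finance"})
scoped = search("balance", {"finance"}, under="/finance/2012")
```

The same security check sits in front of both functions, so browsing never reveals a document that search would have hidden.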
Users can then relate to enterprise search in the same way as they do to their local hard drive. In other words, innate information retrieval behavior can be extended to the enterprise.
As data volumes continue to explode, we believe that this hybrid approach is a highly effective and pragmatic way to keep search effective. It can be implemented using any of the leading search engines.