
Virtual Documents: “Search the Impossible Search”

The perspective of the searcher is often not well served by existing content structures. 

Gaining an understanding of user requirements is an important part of any non-trivial search engine implementation project. At Search Technologies, we usually do this as a part of our Search Assessment process, which also builds a clear picture of the data landscape. Once the environment on both sides of the search engine has been understood, we can set about building a search application that creates business value. 

From time to time, user requirements imply a need for what might be called the impossible search. We recently came across an excellent example of this, described in anonymized form below. 

The application was part of a generic intranet search facility, and the customer concerned was in the professional services business. This was a large company with numerous offices, serving multinational clients. When a new contract was won, it was necessary for a Program Manager to assemble a team to deliver the project. There was a tendency for Program Managers to strongly favor local staff who were known to them. This was not ideal. Situations occurred where offices at 100% capacity were taking on temporary local contractors, while other offices had spare capacity. The management was frustrated with the low staff utilization rates. 

Although there were cultural reasons for this (numerous mergers & acquisitions in recent years), there was another barrier too. Faced with a new (and usually unique) combination of requirements for a project, the Program Manager had no easy way to search for staff, because pertinent information was distributed across a number of business systems and document repositories. 

The company had done a reasonable job of unifying business systems following acquisitions, but this was not enough. Important information about staff capabilities also resided in unstructured documents. 

The solution was to build an indexing pipeline specifically to address this user requirement, by creating “virtual documents” about each member of staff. In this case, we used the Aspire content processing framework as it provided a lot more flexibility than the indexing pipeline of the incumbent search engine, and many of the components that were needed already existed in Aspire's component library.

Information from these sources was merged into each virtual document selectively. For example, documents authored by the staff member concerned were identified, and certain entities were extracted from them, including customer names, dates, and specific industry jargon. The information captured was kept in separate fields, so each field could be searched in isolation if necessary. 

The result was a new class of documents, which existed only in the search engine index, containing extended information about each member of staff; from basic data such as their billing rate, location, current availability and professional qualifications, through to a range of important concepts and keywords which described their previous work, and customer and industry sector knowledge.
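The merging step described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Aspire pipeline that was actually used: the record layouts, field names, and the function build_virtual_documents are all hypothetical, and the entity extraction is assumed to have already happened upstream.

```python
from collections import defaultdict

# Hypothetical records from a structured business system (field names are assumptions).
staff_records = [
    {"staff_id": "s1", "name": "A. Example", "location": "London",
     "billing_rate": 150, "available": True, "qualifications": ["PMP"]},
    {"staff_id": "s2", "name": "B. Sample", "location": "Chicago",
     "billing_rate": 120, "available": False, "qualifications": []},
]

# Hypothetical output of an upstream entity-extraction step over authored documents.
authored_documents = [
    {"author_id": "s1", "customers": ["Acme Corp"], "terms": ["basel iii"], "year": 2011},
    {"author_id": "s1", "customers": ["Acme Corp", "Globex"], "terms": ["stress testing"], "year": 2012},
    {"author_id": "s3", "customers": ["Initech"], "terms": ["audit"], "year": 2012},
]


def build_virtual_documents(staff_records, authored_documents):
    """Merge structured staff data with extracted entities, one fielded document per person."""
    by_author = defaultdict(list)
    for doc in authored_documents:
        by_author[doc["author_id"]].append(doc)

    virtual_docs = {}
    for record in staff_records:
        vdoc = dict(record)  # start from the structured fields
        docs = by_author.get(record["staff_id"], [])
        # Selective merge: keep only the extracted entities, each in its own field.
        vdoc["customers"] = sorted({c for d in docs for c in d["customers"]})
        vdoc["terms"] = sorted({t for d in docs for t in d["terms"]})
        vdoc["project_years"] = sorted({d["year"] for d in docs})
        virtual_docs[record["staff_id"]] = vdoc
    return virtual_docs


virtual_docs = build_virtual_documents(staff_records, authored_documents)
```

Documents whose author has no staff record (s3 in this made-up data) are simply ignored, and each resulting dictionary would be sent to the search engine as a single fielded document.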

Prior to the implementation of this function, Program Managers faced an arduous task if they wished to find suitable staff from other offices. The process started with some basic searching, but then involved a lot of telephone time to cross-reference information and check on missing details. Following the implementation, Program Managers were able to quickly draw up a much shorter short list before contacting individuals. Program Managers saved time, and the company's utilization rates improved.
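As a rough illustration of how such a short list might be produced, the sketch below filters a handful of per-person documents on individual fields. It stands in for a fielded query against the search engine; the sample data, the field names, and the shortlist function are hypothetical.

```python
# Hypothetical virtual documents, one per member of staff (all values are made up).
sample_virtual_docs = [
    {"name": "A. Example", "location": "London", "billing_rate": 150,
     "available": True, "customers": ["Acme Corp", "Globex"], "terms": ["stress testing"]},
    {"name": "B. Sample", "location": "Chicago", "billing_rate": 120,
     "available": False, "customers": ["Acme Corp"], "terms": ["stress testing"]},
    {"name": "C. Person", "location": "Sydney", "billing_rate": 200,
     "available": True, "customers": ["Acme Corp"], "terms": ["audit"]},
]


def shortlist(virtual_docs, customer=None, term=None, max_rate=None, available_only=True):
    """Return the names of staff whose virtual document matches every given criterion."""
    names = []
    for vdoc in virtual_docs:
        if available_only and not vdoc["available"]:
            continue
        if max_rate is not None and vdoc["billing_rate"] > max_rate:
            continue
        if customer is not None and customer not in vdoc["customers"]:
            continue
        if term is not None and term not in vdoc["terms"]:
            continue
        names.append(vdoc["name"])
    return sorted(names)


# Staff from any office who are available and have worked for a given customer.
candidates = shortlist(sample_virtual_docs, customer="Acme Corp", max_rate=180)
```

Because each criterion maps to one field, criteria can be combined freely or left out, which is what makes a new and unique combination of project requirements searchable.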

Thanks to the concept of “virtual documents”, they could search the impossible search.

PS: Apologies for the clichéd title. The day before I wrote this, a family conversation reminded me of my favorite TV commercial, Dream the Impossible Dream. Yet the only Honda product I currently own is a lawn mower...