Improved relevancy directly enhances user productivity and core business objectives
Search Technologies provides a services engagement for improving the relevancy of search results within an existing Solr/Lucene implementation.
The engagement delivers relevancy ranking improvements to an existing Solr installation. It includes setting up a basic relevancy evaluation system, based on a set of sample queries, so that improvements can be measured quantitatively.
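As an illustration of what a basic, query-based relevancy evaluation might look like, the following minimal sketch scores a search function against a set of sample queries with hand-judged relevant documents. It is not the evaluation system used in the engagement: the metric choice (precision at k and mean reciprocal rank), the function names, and the `search` callable are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Set


def precision_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the top-k result slots filled by a judged-relevant document."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def reciprocal_rank(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is returned."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def evaluate(
    search: Callable[[str], List[str]],   # hypothetical: query -> ranked document ids
    judgments: Dict[str, Set[str]],       # sample query -> ids judged relevant
    k: int = 10,
) -> Dict[str, float]:
    """Average both metrics over every sample query."""
    if not judgments:
        raise ValueError("at least one sample query is required")
    p_total = 0.0
    rr_total = 0.0
    for query, relevant in judgments.items():
        ranked = search(query)
        p_total += precision_at_k(ranked, relevant, k)
        rr_total += reciprocal_rank(ranked, relevant)
    n = len(judgments)
    return {"precision_at_k": p_total / n, "mrr": rr_total / n}
```

Running `evaluate` before and after a relevancy change, with the same sample queries and judgments, gives two sets of numbers that can be compared directly.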
Additions to the default relevancy formula in Solr/Lucene can dramatically improve search results and solve many of the thorniest relevancy problems.
This service helps ensure that open source, Solr-based search applications provide highly relevant results to users. Improvements in relevancy can transform the contribution that a search application makes to a business process.
EXAMPLE FEATURE DESCRIPTIONS
Every services engagement is treated differently, taking full account of the objectives of the application. The sections below illustrate two important methods of Solr relevancy improvement that are often appropriate to a customer's needs.
Parameterized Document Similarity Function
Default Solr/Lucene systems are based on a fixed document similarity function that depends heavily on term frequency / inverse document frequency (tf-idf) statistics. These default implementations put too much weight on document size (boosting small documents) and on rare terms in relevancy calculations. Search Technologies provides a parameterized version of tf-idf that gives substantially more control over the relevancy formula. This new operator has configurable parameters that determine the exact amount of boost contributed by each tf-idf ranking factor. It also provides upper and lower thresholds that reduce the effect of unreliable statistics at very low granularities (when a term occurs in only a few documents).
Note: Versions 1.4.1 and 1.4.0 of Solr will require a source code patch to implement the Parameterized Document Similarity Function. Releases currently in development (expected to be numbered as version 3.1 or later) can be implemented via a configuration change and a drop-in library.
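To make the idea of a parameterized similarity concrete, here is a minimal sketch of a per-term score with tunable tf, idf, and length-normalization factors plus document-frequency thresholds. It is modeled on the classic Lucene factors (square root of term frequency, `1 + ln(numDocs / (docFreq + 1))`, and `1 / sqrt(field length)`), which the default parameter values reproduce. The parameter names, the way the factors are combined, and the clamping scheme are assumptions for illustration only and do not describe the actual Search Technologies operator.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimilarityParams:
    """Hypothetical tuning knobs; the defaults mimic the classic Lucene factors."""
    tf_exponent: float = 0.5             # 0.5 gives sqrt(term frequency)
    idf_weight: float = 1.0              # scales the logarithmic part of idf
    length_exponent: float = 0.5         # 0.5 gives 1 / sqrt(field length); 0 disables it
    min_doc_freq: int = 1                # lower threshold for document frequency
    max_doc_freq: Optional[int] = None   # upper threshold (None means no cap)


def term_score(term_freq: int, doc_freq: int, num_docs: int,
               field_length: int, p: SimilarityParams) -> float:
    """Simplified single-term score: tf factor * idf factor * length norm."""
    if term_freq <= 0:
        return 0.0
    # Clamp document frequency so that very rare terms, whose statistics
    # are unreliable, cannot dominate the score.
    df = max(doc_freq, p.min_doc_freq)
    if p.max_doc_freq is not None:
        df = min(df, p.max_doc_freq)
    tf = term_freq ** p.tf_exponent
    idf = 1.0 + p.idf_weight * math.log(num_docs / (df + 1))
    norm = 1.0 / (max(field_length, 1) ** p.length_exponent)
    return tf * idf * norm
```

For example, setting `length_exponent=0.0` removes the advantage of short documents entirely, while raising `min_doc_freq` makes all terms below that document frequency score as if they were equally rare.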
Gradient Proximity Boost
Default Solr Lucene systems have a very limited “hard window” proximity boost. If all terms are “within window” the document will receive a fixed boost multiplier. If any term is “out of window” no boost is applied.
The Search Technologies Gradient Proximity Boost operator instead measures the density and completeness of terms across the document. Documents in which terms are clustered close together will be boosted more than documents in which terms are widely distributed, but in a gradual way. This operator eliminates the need to tweak fixed window sizes.
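The contrast between a hard-window boost and a gradual one can be sketched as follows. This is only an illustration under stated assumptions: term positions are supplied as a plain dictionary, "density" is taken to be the number of matched terms divided by the length of the smallest span containing them, and "completeness" is the fraction of query terms found in the document. The function names and the formula are hypothetical and are not the Search Technologies operator.

```python
from typing import Dict, List, Optional, Sequence


def smallest_span(positions: Dict[str, List[int]]) -> Optional[int]:
    """Length of the smallest token window containing every term at least once.

    Every term in `positions` must have a non-empty position list.
    """
    events = sorted((pos, term) for term, plist in positions.items() for pos in plist)
    needed = len(positions)
    counts: Dict[str, int] = {}
    best: Optional[int] = None
    left = 0
    for right_pos, term in events:
        counts[term] = counts.get(term, 0) + 1
        # Shrink the window from the left while it still covers all terms.
        while len(counts) == needed:
            left_pos, left_term = events[left]
            span = right_pos - left_pos + 1
            if best is None or span < best:
                best = span
            counts[left_term] -= 1
            if counts[left_term] == 0:
                del counts[left_term]
            left += 1
    return best


def _matched(query_terms: Sequence[str], positions: Dict[str, List[int]]) -> Dict[str, List[int]]:
    return {t: positions[t] for t in set(query_terms) if positions.get(t)}


def hard_window_boost(query_terms: Sequence[str], positions: Dict[str, List[int]],
                      window: int, boost: float) -> float:
    """Fixed multiplier if all terms fit inside the window, otherwise no boost."""
    matched = _matched(query_terms, positions)
    if not matched or len(matched) < len(set(query_terms)):
        return 1.0
    return boost if smallest_span(matched) <= window else 1.0


def gradient_boost(query_terms: Sequence[str], positions: Dict[str, List[int]],
                   max_boost: float) -> float:
    """Multiplier that grows smoothly with term density and completeness."""
    matched = _matched(query_terms, positions)
    if len(matched) < 2:
        return 1.0  # proximity is undefined for fewer than two matched terms
    completeness = len(matched) / len(set(query_terms))
    density = len(matched) / smallest_span(matched)
    return 1.0 + (max_boost - 1.0) * completeness * density
```

With this sketch, a document in which the terms are adjacent receives the full `max_boost`, and the boost decays gradually as the terms spread apart, so there is no window size to tune.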
Prerequisite: a working Solr/Lucene system with documents already indexed.
TYPICAL ENGAGEMENT TASKS
Search Technologies can provide software maintenance and support services, including 24/7 options, both for the newly installed operators and for Solr/Lucene as a whole.