
Data Analysis Services for Search Applications


  • Most search applications can be significantly improved through data analysis
  • For business-critical search applications, data analysis is a key part of the implementation and tuning process

As the Internet matures, electronic data has become the entire product inventory and sole value proposition of many companies. These include digital publishers, patent and academic research resources, industry-specific Web portals, online classifieds, directories, e-commerce sites, and many more. Data quality is critical to the revenues of these companies, and they rely heavily on search applications to deliver value to their customers.

Search can be equally critical to large enterprises with years' worth of corporate intellectual property stored in a myriad of legacy intranet repositories. Behind-the-firewall enterprise search applications have a reputation for underachievement. They have to deal with a full range of user types, a disparate (often huge) range of document and data sources, and constant change in data repositories, security schemas and application requirements.

The challenge in creating powerful (and cost-effective) search applications for all of these types of organizations lies in understanding the data and how it is used by all of the different people who need to search it. Extracting the critical metadata, understanding and preserving the relationships among data elements, normalizing highly disparate data, and enriching the data with additional information from external sources are the types of data preparation and processing activities that will lead to a superior search experience for all users. 
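As a rough illustration of the preparation activities listed above, the following minimal Python sketch chains three stages (metadata extraction, normalization, and enrichment) over a single document before it would be handed to an indexer. Everything in it is hypothetical: the field names, the `PUBLISHER_REGIONS` lookup table standing in for an external source, and the stage functions are assumptions made for the example, not part of any Search Technologies product.

```python
import re
from typing import Callable, Dict, List

Document = Dict[str, str]
Stage = Callable[[Document], Document]

# Hypothetical lookup table standing in for an external enrichment source.
PUBLISHER_REGIONS = {"acme press": "North America"}


def extract_metadata(doc: Document) -> Document:
    """Pull a four-digit year out of the body text, if one is present."""
    out = dict(doc)
    match = re.search(r"\b(19|20)\d{2}\b", doc.get("body", ""))
    if match:
        out["year"] = match.group(0)
    return out


def normalize(doc: Document) -> Document:
    """Collapse whitespace in the body and lower-case the publisher field."""
    out = dict(doc)
    out["body"] = " ".join(doc.get("body", "").split())
    out["publisher"] = doc.get("publisher", "").strip().lower()
    return out


def enrich(doc: Document) -> Document:
    """Add a region field when the normalized publisher is in the lookup table."""
    out = dict(doc)
    region = PUBLISHER_REGIONS.get(doc.get("publisher", ""))
    if region:
        out["region"] = region
    return out


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Apply each stage in order; every stage returns a new document."""
    for stage in stages:
        doc = stage(doc)
    return doc


# Enrichment runs after normalization so the lookup sees the cleaned publisher name.
STAGES: List[Stage] = [extract_metadata, normalize, enrich]

raw = {"publisher": "  ACME Press ", "body": "Annual   report\n published in 2009."}
processed = run_pipeline(raw, STAGES)
```

Keeping each stage as a small, independent function is one way to make the processing steps transparent and easy to reorder or replace as data sources change.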

Document processing can transform search applications by:

  • Greatly improving relevancy ranking
  • Providing users with extended search navigation options
  • Customizing the search application to the specific needs of the audience
  • Improving search speed by removing unnecessary data clutter before indexing

Unfortunately, the processing and preparation of documents prior to indexing by a search engine is one of the most neglected aspects of search project implementations, and yet it is one of the most important factors affecting the ultimate success of search-based applications. 

The Document Processing Methodology for Search (DPMS) 
User expectations of modern search-based applications require new implementation techniques, architectures and development processes to achieve success within reasonable costs and timeframes. This is the goal of Search Technologies’ Document Processing Methodology for Search, or DPMS. 

DPMS is a collection of tools, techniques, and processes that prepare data for use in search-based applications. The majority of business-critical search applications can be significantly enhanced, or even transformed, by better document processing prior to indexing.

DPMS grew directly out of the many occasions on which Search Technologies was called into search projects, only to discover that "standard" software techniques, development processes and architectures were insufficiently flexible or transparent to handle the demands of a large-scale search engine implementation.

Using DPMS in those implementations reduces data complexity and makes the search application both more maintainable and more flexible. DPMS also helps search deliver a significantly better user experience in a controllable, cost-effective way.
