Data Preparation for the Google Search Appliance

Even the Most Complex Data Sets Can Leverage the GSA's Key Strengths

Some data sets are more complicated than others. The Google Search Appliance plugs directly into many data sources. However, where the data is complex, pre-processing holds the key to getting the best from the GSA.

Our Aspire Content Processing framework is ideally suited to this task and, along with our implementation and training services, provides a proven, cost-effective solution to complex data issues.



Here are four recent customer examples:

  • Selective Indexing: This e-commerce customer uses the GSA with a large third-party product catalog. However, not all of the records in the third-party dataset are relevant, and within each record, some of the fields are not applicable to the application. Aspire is used to filter the data and ensure that only appropriate product information is included in the search experience.
  • Generating Metadata for Dynamic Navigation: Dynamic Navigation is a widely used GSA feature, popular among search users and highly beneficial to the search experience, especially where navigators are contextual to the users' needs. Contextual metadata seldom exists in the original content and needs to be auto-generated during the indexing process. Aspire provides this capability for the GSA.
  • Splitting Content: This customer's data set includes hundreds of very large PDF files containing technical information. The search experience is enhanced by automatically splitting these large files into "sensibly indexable chunks" before feeding them to the GSA.
  • Joining or Aggregating Content: Search usefulness is enhanced for this customer by joining data from three different sources. "Virtual Documents" are created for the purpose of search and fed to the GSA. Virtual documents comprise:
    • Office files, providing the main body content
    • Two separate databases (controlled by different business applications) providing valuable metadata for each document to drive Dynamic Navigation capabilities and ensure relevancy
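
To make the filtering, splitting, and joining steps above more concrete, here is a minimal sketch in plain Python. It is not Aspire's API or the GSA's feed format; every function name, field name, and sample value is hypothetical and chosen only for illustration.

```python
# Illustrative only: plain Python, not Aspire's or the GSA's API.
# All function names, field names, and sample values are hypothetical.

def select_records(records, is_relevant, keep_fields):
    """Selective indexing: drop irrelevant records and unwanted fields."""
    return [
        {key: value for key, value in record.items() if key in keep_fields}
        for record in records
        if is_relevant(record)
    ]


def split_text(text, max_chars):
    """Splitting: group paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def join_sources(body_by_id, *metadata_sources):
    """Joining: build one virtual document per ID from body plus metadata."""
    documents = []
    for doc_id, body in body_by_id.items():
        metadata = {}
        for source in metadata_sources:
            # Later sources overwrite earlier ones on key collisions.
            metadata.update(source.get(doc_id, {}))
        documents.append({"id": doc_id, "body": body, "metadata": metadata})
    return documents
```

In this sketch, `join_sources` takes the office-file text keyed by a shared document ID plus any number of metadata lookups (for example, one per database). In practice the hard part is usually establishing that shared ID across systems, which this example simply assumes.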

For all of these use cases and more, Aspire provides a cost-effective solution for indexing complex data sets using the Google Search Appliance.

Contact us for more information.