
Aspire Content Processing

Powerful, Flexible Content Processing for Unstructured Data


Poor-quality content, especially metadata, is a leading cause of user dissatisfaction and underperformance in search applications. Diligent preprocessing of unstructured content before indexing is a critical yet often neglected part of building a search system.

Aspire Content Processing is a powerful framework designed specifically for unstructured data. It is part of our collection of search-engine-independent technology assets that help organizations optimize their search and big data architectures. Aspire Content Processing:

  • Enables content from across the enterprise to be securely accessed via our 40+ connectors, then cleaned, normalized, and enriched to a consistently high standard, so that search systems and analysis applications perform optimally.
  • Is the foundation for a flexible, reliable, and maintainable approach to the development of content processing solutions, ensuring that search systems are accurate and effective while the total cost of ownership is kept under control.
  • Is search engine independent, which helps future-proof your enterprise search architecture.
  • Can be used with Hadoop to tackle computationally intensive text analytics tasks.
  • Is standards-based, using Java, OSGi, Apache Tika, ZooKeeper, Maven, Groovy, and other proven open source technologies.

Aspire Content Processing can be deployed efficiently within our optimized reference architecture. See our Technology Overview page for more details.


Aspire is used in many government and corporate search implementations, addressing a wide range of applications, including enterprise search, e-commerce search, government portals, publisher websites, and compliance applications.

  • Read how Aspire for Big Data helped ingest over 1 petabyte of unstructured content in a customer project.
  • Watch how combining Aspire with Azure Cognitive Services helps accelerate unstructured content acquisition, processing, and enrichment.


  • Aspire Basic: a paid version that contains our File System, RDBMS, Aspire Web Crawler, FTP, and RSS connectors. It also contains all of our publishers. The Aspire Basic license allows for systems containing up to one million documents.
  • Aspire Enterprise: a paid version that has all the connectors included in the Basic version, but can scale to extremely large amounts of content (unlimited number of documents).

Contact us for pricing and technical specifications.

