Aspire Content Processing

Powerful, Flexible Content Processing for Unstructured Data


Poor-quality content, especially poor metadata, is a leading cause of user dissatisfaction and underperformance in search applications. Diligent preprocessing of unstructured content before indexing is a critical yet often neglected part of building a search system.

Aspire Content Processing is an innovative and powerful framework specifically designed for unstructured data. It is part of our collection of search engine independent technology assets that help organisations to optimise their search and big data architectures. Aspire Content Processing:

  • Enables content from across the enterprise to be securely accessed, cleaned, normalised and enriched to a consistently high standard, so that search systems and analysis applications can perform optimally.
  • Is the foundation for a flexible, reliable and maintainable approach to the development of content processing solutions, ensuring that search systems are accurate and effective while the total cost of ownership is kept under control.
  • Is search engine independent, allowing your enterprise search architecture to be future-proof.
  • Can be used with Hadoop to tackle computationally large text analytics tasks.
  • Is standards-based, using Java, OSGi, Apache Tika, ZooKeeper, Maven, Groovy, and other proven open source technologies.

Aspire Content Processing can be deployed efficiently within Search Technologies' optimised reference architecture. See our Technology Overview page for more details.


Aspire is used by many government departments and organisations, addressing a wide range of applications including enterprise search, e-commerce search, government portals, publishers' websites, and compliance applications.

  • Watch how combining Aspire with Azure Cognitive Services helps to accelerate unstructured content acquisition, processing, and enrichment. 
  • Read about how Aspire for Big Data helped to ingest over 1 petabyte of unstructured content in a customer project.


Aspire is available in two paid editions:

  • Aspire Basic: contains our File System, RDBMS, Aspire Web Crawler, FTP, and RSS connectors, as well as all of our publishers. The Aspire Basic license allows for systems containing up to one million documents.
  • Aspire Enterprise: has all the connectors included in the Basic version, but scales to extremely large amounts of content, with no limit on the number of documents.

Contact us for details and pricing.