Aspire Content Processing for Corporate-Wide Search
Aspire Content Processing can be used for company-wide search applications involving multiple, heterogeneous data sets. It:
- Improves search results, regardless of which search engine you use
- Eliminates corporate data “silos” but preserves document-level security
- Handles “Big Data” and scattered repositories in large corporations
- Easily manages content updates, regardless of schedule frequency
- Allows for flexibility, customization, and future growth
Aspire Content Processing works with leading search engines, including Microsoft SharePoint and FAST, the Google Search Appliance (GSA), Amazon CloudSearch, Apache Solr/Lucene, and Elasticsearch.
Aspire Content Processing Improves Search Results
Most commercial search engines are capable of producing good search results. The primary reason that end users don’t see those good results is the poor quality of the data being fed into the search engine. Some engines don’t address content processing issues at all. Others try to compensate with complex pipelines that require custom coding and significant updating every time there’s a change or a new data source. That’s an expensive and time-consuming solution. Aspire eliminates those problems through automated metadata extraction and manipulation that’s done outside of the engine, and via configuration instead of coding. You may recognize some of the following issues:
- Can your users search by author, date, category, publication number, or chapter?
- Do your query results include dynamic navigators that guide your users to where the most relevant documents are located?
- Does a fielded search find “Smith, John” and “John Smith” and recognize that it’s the same person?
- Do your query results favor one repository over another, even though they both contain equally relevant documents?
- Are your search results cluttered by meaningless but frequently appearing terms that skew results?
Aspire solves these and other common problems.
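As an illustration of the name-matching question in the list above, here is a minimal Python sketch of normalizing author names before they are indexed. It is not Aspire's implementation: the function name `normalize_person_name` and the assumption that a comma always means "Last, First" order are hypothetical and chosen only for illustration.

```python
def normalize_person_name(raw: str) -> str:
    """Return a canonical 'first last' key for a personal name.

    Assumption (illustrative only): a comma means the name was
    written as 'Last, First'; anything else is already 'First Last'.
    """
    # Collapse runs of whitespace so spacing differences do not matter.
    name = " ".join(raw.split())
    if "," in name:
        last, _, first = name.partition(",")
        name = f"{first.strip()} {last.strip()}"
    # casefold() gives a case-insensitive comparison key.
    return name.casefold()


if __name__ == "__main__":
    print(normalize_person_name("Smith, John"))
    print(normalize_person_name("John  Smith"))
```

Storing the normalized key in its own field, next to the original display value, lets a fielded search treat both spellings as the same person without changing what users see.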
Aspire Content Processing Eliminates Corporate Data “Silos”
In most corporations, there are many different types of content repositories, often “owned” by individual departments, each with its own search application designed for that department's use. But much of the information has relevance across departments and could be more efficiently managed centrally, rather than by duplicating effort in every department. Consolidation has been avoided because there is no easy way to ingest the content from dozens of different repositories and feed it into a single search engine, and no easy way to replicate existing security schemes in order to preserve document-level permissions across the organization. Aspire:
- Contains built-in connectors to ingest content from corporate file systems, SharePoint repositories, Documentum repositories, and others
- Contains an LDAP proxy server (ApacheDS) for user authentication
- Preserves document-level security using existing security models, including group expansion (hierarchical permissions) for LDAP, Active Directory, SharePoint, and Documentum
- Can normally pre-bind document ACLs, which speeds up queries because permissions are indexed along with each document and do not have to be looked up at query time for every returned document
- Can be used to build additional connectors of many types (including, for example, those we have recently built for Alfresco, Liferay, RightNow, IBM Connections, and Confluence)
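The pre-binding of document ACLs and the group expansion described in the list above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Aspire's code: `IndexedDoc`, `expand_groups`, `search`, and the sample membership data are all hypothetical names, and a real deployment would read memberships from LDAP, Active Directory, SharePoint, or Documentum.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    text: str
    allowed: frozenset[str]  # ACL principals stored with the document at index time


def expand_groups(user: str, memberships: dict[str, list[str]]) -> set[str]:
    """Return the user plus every group reached through nested membership."""
    seen = {user}
    stack = [user]
    while stack:
        principal = stack.pop()
        for group in memberships.get(principal, ()):
            if group not in seen:
                seen.add(group)
                stack.append(group)
    return seen


def search(index: list[IndexedDoc], term: str, principals: set[str]) -> list[str]:
    """Return IDs of matching documents the caller is allowed to see.

    The ACL check is a set intersection against data already in the
    index, so no per-document permission lookup happens at query time.
    """
    needle = term.lower()
    return [
        doc.doc_id
        for doc in index
        if needle in doc.text.lower() and doc.allowed & principals
    ]


if __name__ == "__main__":
    memberships = {"alice": ["engineering"], "engineering": ["staff"]}
    index = [
        IndexedDoc("d1", "Quarterly report", frozenset({"staff"})),
        IndexedDoc("d2", "Quarterly payroll report", frozenset({"hr"})),
    ]
    print(search(index, "report", expand_groups("alice", memberships)))
```

The design choice shown here is that the user's groups are expanded once per query, while each document's permissions are read directly from the index instead of being fetched from the source repository for every result.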