Back to top

Semantic Search - Extracting Insights from Unstructured Text


Semantic Search is enabled by a range of content processing techniques that identify and extract entities, facts, attributes, concepts and events to populate meta-data fields. The purpose of this is to enable the analysis of unstructured content in the enterprise. 

Bottom line, the semantic analysis of unstructured data is an important technique for "structuring the unstructured," without which, big data applications cannot deliver actionable intelligence.

Further, the accuracy of semantic search is critical. Without appropriate accuracy and provenance, you run the risk of feeding decision makers with non-actionable or even misleading insight.


Semantic extraction techniques are used to extract unstructured content and usually based on one of two approaches (or a combination of the two):

  • Rule-based matching. Similar to entity extraction, this approach requires the support of one or more vocabularies
  • Machine-learning. A statistical analysis of the content, a potentially compute-intensive application that can benefit from using Hadoop, if the data set is substantial. This approach derives relationships from statistical co-occurrence within the document corpus
  • Hybrid solutions. Statistically driven, but enhanced by a vocabulary. This is typically the best approach if the content set is focused on a specific subject area

Technologies for supporting and improving semantic search:

  • Aspire, Search Technologies' award-winning content processing platform, supports all of these approaches. Its role is to fully prepare unstructured data, including parsing, cleansing, normalization, filtering and semantic analysis, for use in search or analytics projects at any scale, including big data applications.
  • Saga Natural Language Understanding (NLU): enables non-data scientists to create and maintain powerful, flexible, tested, and scalable enterprise language models for user interaction and document understanding. It incorporates many language modeling techniques and machine learning into a single, user-friendly semantic framework to handle a wide variety of natural language use cases.

For further information, or for a no-commitments discussion of semantic search with one of our experts, contact us.