Semantic Search - Extracting Insights from Unstructured Text
Semantic search is enabled by a range of content processing techniques that identify and extract entities, facts, attributes, concepts, and events to populate metadata fields. The goal is to make unstructured enterprise content available for analysis.
The bottom line: semantic analysis of unstructured data is an important technique for "structuring the unstructured." Without it, big data applications cannot deliver actionable intelligence.
Further, the accuracy of semantic search is critical. Without appropriate accuracy and provenance, you risk giving decision makers non-actionable or even misleading insight.
UNSTRUCTURED CONTENT EXTRACTION
Semantic extraction techniques pull structured information out of unstructured content. They are usually based on one of two approaches, or a combination of the two:
- Rule-based matching. Similar to entity extraction, this approach requires the support of one or more vocabularies.
- Machine learning. A statistical analysis of the content that derives relationships from co-occurrence within the document corpus. It can be compute-intensive and, if the data set is substantial, can benefit from using Hadoop.
- Hybrid solutions. Statistically driven, but enhanced by a vocabulary. This is typically the best approach if the content set is focused on a specific subject area.
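To make the approaches above concrete, here is a minimal Python sketch of a hybrid: a vocabulary lookup (the rule-based step) followed by a simple co-occurrence count (a stand-in for the statistical step). The vocabulary, the sample documents, and the function names extract_entities and cooccurrence are all hypothetical, invented for illustration. This is not how Aspire implements extraction, and a production system would use far richer vocabularies and statistics.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical vocabulary: surface form -> (canonical name, entity type).
VOCABULARY = {
    "acme corp": ("Acme Corp", "ORG"),
    "acme": ("Acme Corp", "ORG"),
    "jane doe": ("Jane Doe", "PERSON"),
    "london": ("London", "LOCATION"),
}

# Longest terms first, so "acme corp" is preferred over "acme".
_terms = sorted(VOCABULARY, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in _terms) + r")\b",
    re.IGNORECASE,
)


def extract_entities(text):
    """Rule-based step: return the distinct vocabulary entities found in text."""
    found = {VOCABULARY[m.group(1).lower()] for m in PATTERN.finditer(text)}
    return sorted(found)


def cooccurrence(docs):
    """Statistical step: count how often two entities share a document."""
    counts = Counter()
    for doc in docs:
        names = [name for name, _type in extract_entities(doc)]
        for pair in combinations(names, 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    docs = [
        "Jane Doe joined Acme Corp in London.",
        "Acme opened a new office in London.",
        "Jane Doe spoke at a conference.",
    ]
    for doc in docs:
        print(extract_entities(doc))
    print(cooccurrence(docs).most_common(1))
```

The extracted (name, type) pairs are the kind of values that would populate metadata fields, while the pair counts hint at relationships between entities. On a large corpus the counting step is the part that would be distributed across a cluster.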
Aspire, Search Technologies' award-winning content processing platform, supports all of these approaches. Its role is to fully prepare unstructured data, including parsing, cleansing, normalization, filtering and semantic analysis, for use in search or analytics projects at any scale, including big data applications.
For further information, or for a no-commitments discussion of semantic search with one of our experts, contact us.