Search for SharePoint 2013
A very capable search engine is now fully embedded into SharePoint 2013
This article provides a high-level summary of SharePoint 2013’s search capabilities.
- FAST-like search technology (FAST was acquired by Microsoft in 2008) is at the heart of SharePoint 2013, although the code is in fact a rewrite
- Technologies and ideas from Bing have been added to the mix, to provide a set of enterprise search capabilities, with some room for customization
- It includes a rules-based query parsing framework
- For searching content held within the SharePoint 2013 environment, it has everything you will need
- Add-on capabilities and expertise are generally necessary for enterprise search projects using SharePoint 2013 search.
A BRIEF FUNCTIONAL WALK-THROUGH
We’ll follow the flow of documents for this walk-through, starting with crawling and finishing with search functionality provided to users.
The crawling stage holds no surprises. It is encouraging to see Microsoft literature refer explicitly to capturing metadata associated with documents, and not just ingesting the document itself.
No processing is done at this stage; the content is simply acquired. Out-of-the-box connectors will be available for SharePoint (including People Profiles), HTTP, and file shares, as well as, based on the BDC (Business Data Connectivity) framework, Exchange public folders, Documentum, and Lotus Notes. A wide variety of other connectors will become available from third parties in due course. For example, Search Technologies' Aspire Data Connectors are fully compatible with SharePoint 2013.
Content Processing and Enrichment
This stage is also known as the indexing pipeline. It is important to note that SharePoint 2013 does not offer equivalent functionality to FAST ESP or FS4SP, although it provides sufficient functionality for most scenarios where all of the content is held in SharePoint.
Entity extractors, commonly used with SharePoint 2010 implementations to create custom refiners (search navigators), are retained. However, SharePoint 2013 search has no native categorization capability. This means that for many organizations, the "Web Services Call Out", a pipeline module designed to enable third-party technologies to provide additional content enrichment, will be very important. FAST Search for SharePoint 2010 has a similar capability.
For example, Search Technologies provides bolt-on technologies for SharePoint Content Enrichment generally, and for SharePoint 2013 Categorization specifically, using the Web Services Call Out.
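To make the idea of call-out enrichment more concrete, here is a minimal sketch of the kind of categorization logic an external service might apply to an item's properties before they are indexed. It does not implement the actual call-out contract; the property names ("Title", "Body", "Category"), the keyword lists, and the function name are all hypothetical and chosen purely for illustration.

```python
import re
from typing import Dict, List

# Hypothetical keyword lists per category. A real deployment would more
# likely use a trained classifier or a managed taxonomy.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "Finance": ["invoice", "budget", "forecast"],
    "Legal": ["contract", "clause", "liability"],
}


def categorize(properties: Dict[str, str]) -> Dict[str, object]:
    """Return a copy of the item's properties with a 'Category' list added."""
    # Combine the (assumed) text-bearing properties and tokenize crudely.
    text = " ".join(properties.get(key, "") for key in ("Title", "Body")).lower()
    words = set(re.findall(r"[a-z]+", text))

    # A category matches if any of its keywords appears in the text.
    categories = [
        name for name, keywords in CATEGORY_KEYWORDS.items()
        if words & set(keywords)
    ]

    enriched: Dict[str, object] = dict(properties)
    enriched["Category"] = sorted(categories)
    return enriched


item = {"Title": "Q3 budget", "Body": "Signed contract attached"}
print(categorize(item)["Category"])
```

The resulting category values could then be mapped to a managed property and exposed as a refiner, in the same way as entity extractor output.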
The content processing component also writes information to a “link database”. This information can be subsequently used by the analysis processing component to calculate link popularity statistics and provide relevancy weighting possibilities. Anchor text within links can also contribute to page content for ranking purposes. These are core techniques used by Google and Bing out on the Web. Applying them to private data sets (where most documents are not interlinked) will need to be done with some thought, but nevertheless, it is good to see these capabilities in SharePoint 2013.
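The two link-based signals described above can be sketched as follows. This is an illustrative simplification, not SharePoint's actual analysis component: it assumes the link database can be read as (source, target, anchor text) tuples, and it uses a plain count of distinct linking pages as the popularity statistic.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str, str]  # (source URL, target URL, anchor text)


def analyze_links(links: Iterable[Link]) -> Tuple[Dict[str, int], Dict[str, List[str]]]:
    """Compute in-link counts and collected anchor text per target page."""
    inlinks: Counter = Counter()
    anchors: Dict[str, List[str]] = defaultdict(list)
    seen = set()

    for source, target, anchor in links:
        if source == target:
            continue  # self-links say nothing about popularity
        if (source, target) not in seen:
            # Count each linking page once, however many links it contains.
            seen.add((source, target))
            inlinks[target] += 1
        if anchor:
            # Anchor text describes the target, so it is stored with the target.
            anchors[target].append(anchor)

    return dict(inlinks), dict(anchors)


sample = [
    ("/hr/home", "/hr/leave-policy", "leave policy"),
    ("/hr/home", "/hr/leave-policy", "holiday rules"),
    ("/intranet", "/hr/leave-policy", "annual leave"),
    ("/hr/leave-policy", "/hr/leave-policy", "top of page"),
]
popularity, anchor_text = analyze_links(sample)
print(popularity, anchor_text)
```

On a sparsely linked private data set, most pages would have a count of zero, which is why the weighting given to such a signal needs some thought.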
The Index Component and its architectural possibilities look FAST-like, with the ability to partition indexes into rows and columns to scale both data volumes and query loads. Low latency is also said to be possible – so content changes can be reflected in the indexes within seconds.
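As a rough illustration of a row-and-column layout, the sketch below assumes the FAST-style convention in which columns split the content (scaling data volume) and rows are replicas of the whole index (scaling query load). The class, its method names, and the hash-based routing are hypothetical and are not taken from SharePoint 2013.

```python
import zlib
from typing import List, Tuple

Node = Tuple[int, int]  # (row, column)


class IndexGrid:
    """Toy model of an index laid out as rows (replicas) x columns (partitions)."""

    def __init__(self, columns: int, rows: int) -> None:
        self.columns = columns
        self.rows = rows
        self._next_row = 0

    def column_for(self, doc_id: str) -> int:
        # A stable hash keeps a given document in the same partition.
        return zlib.crc32(doc_id.encode("utf-8")) % self.columns

    def nodes_for_document(self, doc_id: str) -> List[Node]:
        # A document is written to its column in every row (replica).
        column = self.column_for(doc_id)
        return [(row, column) for row in range(self.rows)]

    def nodes_for_query(self) -> List[Node]:
        # A query fans out to every column of one row; rows are used round-robin.
        row = self._next_row
        self._next_row = (self._next_row + 1) % self.rows
        return [(row, column) for column in range(self.columns)]


grid = IndexGrid(columns=3, rows=2)
print(grid.nodes_for_document("doc-42"))
print(grid.nodes_for_query())
```

Under this assumption, adding columns reduces the amount of content each node holds, while adding rows increases the number of queries that can be served in parallel.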
Query processing is an important (and sometimes neglected) part of enterprise search. In SharePoint 2013, a new component provides options to enhance the user’s query to improve precision, recall and relevancy. Basic functions such as query spelling correction suggestions are included. “Query rules” can be applied, for example, to have a search request automatically trigger particular settings, to spawn multiple queries which are then aggregated, or to trigger a “best bet” (now called a “promoted result” in SharePoint 2013).
“Query Conditions” that cause rules to be triggered include the presence of specified words in the query. The Query Processing component also enables result sets returned by the search engine to be processed, according to rules, before being displayed to the user. Applications for this include additional search-time security trimming.
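The rule mechanism described above can be illustrated with a small sketch: a rule fires when any of its trigger words appears in the query, and a separate step post-processes the result set before display. The data structures, field names, and group-based trimming logic are hypothetical simplifications, not SharePoint's query rule model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class QueryRule:
    trigger_words: FrozenSet[str]  # query condition: any of these words present
    promoted_result: str           # URL to show above the ranked results


def promoted_results(query: str, rules: List[QueryRule]) -> List[str]:
    """Return the promoted results of every rule whose condition matches."""
    terms = set(query.lower().split())
    return [rule.promoted_result for rule in rules if terms & rule.trigger_words]


def trim_results(results: List[Dict], user_groups: Set[str]) -> List[Dict]:
    """Post-process a result set: keep only items the user's groups may see."""
    return [r for r in results if set(r["allowed_groups"]) & user_groups]


rules = [
    QueryRule(frozenset({"holiday", "leave"}), "/hr/leave-policy"),
    QueryRule(frozenset({"expenses"}), "/finance/expenses-form"),
]
results = [
    {"url": "/hr/leave-policy", "allowed_groups": ["staff"]},
    {"url": "/hr/salary-bands", "allowed_groups": ["hr-admins"]},
]
print(promoted_results("Book annual leave", rules))
print(trim_results(results, {"staff"}))
```

Keeping the condition check and the result post-processing as separate steps mirrors the two roles the text assigns to the Query Processing component.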
Different ranking models can be set up to suit different audiences, and these can be chosen at query time.
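As a sketch of choosing a ranking model at query time, the example below builds a search request URL that carries a per-audience ranking model identifier. It assumes the SharePoint 2013 search REST endpoint (`/_api/search/query`) and its `querytext` and `rankingmodelid` parameters; these should be checked against Microsoft's documentation. The site URL, audience names, and GUIDs are placeholders.

```python
from urllib.parse import quote

# Placeholder ranking model IDs per audience (hypothetical values).
RANKING_MODELS = {
    "engineering": "00000000-0000-0000-0000-000000000001",
    "sales": "00000000-0000-0000-0000-000000000002",
}


def build_search_url(site_url: str, query: str, audience: str) -> str:
    """Build a search URL that selects a ranking model for the given audience."""
    model_id = RANKING_MODELS[audience]
    # Values are wrapped in single quotes, so embedded quotes are doubled
    # before percent-encoding.
    escaped = quote(query.replace("'", "''"))
    return (
        f"{site_url.rstrip('/')}/_api/search/query"
        f"?querytext='{escaped}'&rankingmodelid='{model_id}'"
    )


url = build_search_url("https://intranet.example.com", "annual report", "sales")
print(url)
```

The code only constructs the URL; issuing the request would additionally require authentication appropriate to the environment.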