Search for SharePoint 2013
The FAST technology is now fully embedded into SharePoint 2013
This article provides a high-level summary of SharePoint 2013’s search capabilities.
- FAST-like search technology, acquired by Microsoft in 2008, is at the heart of SharePoint 2013 (although the code is in fact a new rewrite)
- Technologies and ideas from Bing and elsewhere have been added to the mix to provide a comprehensive set of enterprise search capabilities, with room for customization
- It includes a rules-based query parsing framework
A BRIEF FUNCTIONAL WALK-THROUGH
We’ll follow the flow of documents for this walk-through, starting with crawling and finishing with search functionality provided to users.
Crawling holds no surprises. It is encouraging to see Microsoft literature refer explicitly to capturing metadata associated with documents, and not just ingesting the document itself.
No processing is done at this stage; the content is simply acquired. Out-of-the-box connectors will be available for SharePoint (including People Profiles), HTTP and file shares, as well as connectors based on the BDC (Business Data Connectivity) framework for Exchange public folders, Documentum and Lotus Notes. A wide variety of other connectors will become available from third parties in due course. For example, Search Technologies' Aspire Data Connectors are fully compatible with SharePoint 2013.
Content processing, also known as the indexing pipeline, resembles the old FAST pipeline in SharePoint 2013 and looks to have a few important features. However, it is important to note that SharePoint 2013 does not offer equivalent functionality to FAST ESP or FS4SP. The content processing component also writes information to a “link database”. This information can subsequently be used by the analysis processing component to calculate link popularity statistics and provide relevancy weighting possibilities. Anchor text within links can also contribute to page content for ranking purposes. These are core techniques used by Google and Bing out on the Web. Applying them to private data sets (where most documents are not interlinked) will need to be done with some thought, but nevertheless, it is good to see that these capabilities have been carried over into SharePoint 2013.
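To make the link database idea concrete, here is a minimal sketch of how link popularity and anchor text could be derived from stored link records. It does not reflect SharePoint's internal schema or APIs; the record layout, the `LINKS` sample data and both function names are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical link records: (source_url, target_url, anchor_text).
# This layout is an assumption, not SharePoint's link database schema.
LINKS = [
    ("/home", "/policies/travel", "travel policy"),
    ("/hr/faq", "/policies/travel", "expenses and travel"),
    ("/home", "/hr/faq", "HR questions"),
]


def inbound_link_counts(links):
    """Count the distinct pages that link to each target page."""
    sources = defaultdict(set)
    for source, target, _anchor in links:
        if source != target:  # a page linking to itself adds no popularity
            sources[target].add(source)
    return {target: len(srcs) for target, srcs in sources.items()}


def anchor_text_by_target(links):
    """Group anchor text by target so it can be indexed with that page."""
    anchors = defaultdict(list)
    for _source, target, anchor in links:
        anchors[target].append(anchor)
    return dict(anchors)


counts = inbound_link_counts(LINKS)
anchors = anchor_text_by_target(LINKS)
```

On a sparsely linked private data set, most counts will be zero or one, which is why a statistic like this probably deserves a modest weight in ranking.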
Entity extractors, commonly used with SharePoint 2010 implementations to create custom refiners (search navigators), are retained. However, SharePoint 2013 search has no native categorization capability. This means that for many organizations, the "Web Services Call Out", a pipeline module designed to enable third-party technologies to provide additional content enrichment, will be very important. FAST Search for SharePoint 2010 has a similar capability.
For example, Search Technologies provides bolt-on technologies for SharePoint Content Enrichment generally, and for SharePoint 2013 Categorization specifically, using the Web Services Call Out.
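The shape of such an enrichment step can be sketched as a function that receives an item's properties and returns them with a category added. This is not the actual Web Services Call Out contract and not Search Technologies' product; the `enrich` function, the `Category` property name and the keyword lists are hypothetical, and a real categorizer would more likely use a trained classifier or a managed taxonomy.

```python
# Hypothetical keyword lists per category, for illustration only.
CATEGORY_KEYWORDS = {
    "Finance": {"invoice", "budget", "expenses"},
    "Legal": {"contract", "liability", "clause"},
}


def enrich(properties):
    """Return a copy of the item's properties with a 'Category' list added."""
    words = set(properties.get("body", "").lower().split())
    categories = sorted(
        name
        for name, keywords in CATEGORY_KEYWORDS.items()
        if words & keywords  # at least one keyword appears in the body
    )
    enriched = dict(properties)  # leave the caller's dictionary untouched
    enriched["Category"] = categories
    return enriched


item = {"title": "Q3 budget", "body": "Budget and expenses for the contract renewal"}
result = enrich(item)
```

Returning a new set of properties rather than modifying the input keeps the step easy to test in isolation from the pipeline that calls it.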
The analysis processing component enables additional context to be introduced during the indexing process, which can later be used to customize relevancy ranking and for other purposes. Its input comes from the link database, which describes how documents are linked, and from search and user behavior analytics. The latter enables, for example, "popular documents" to be promoted up the results list.
The Index Component and its architectural possibilities look FAST-like, with the ability to partition indexes into rows and columns to scale both data volumes and query loads. Low latency is also said to be possible – so content changes can be reflected in the indexes within seconds.
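A toy model helps to show why a grid of rows and columns scales in two directions. The sketch below assumes that columns split the documents (data volume) and rows hold replicas of each column (query load); that mapping, the `IndexGrid` class and its methods are illustrative assumptions, not a description of SharePoint's index component.

```python
import zlib


class IndexGrid:
    """Toy index grid: columns split the documents, rows hold replicas."""

    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows
        # cells[row][column] maps doc_id -> text
        self.cells = [[{} for _ in range(columns)] for _ in range(rows)]
        self._next_row = 0

    def column_for(self, doc_id):
        # Stable hash so a document always lands in the same column.
        return zlib.crc32(doc_id.encode("utf-8")) % self.columns

    def add(self, doc_id, text):
        column = self.column_for(doc_id)
        for row in range(self.rows):  # write to every replica of the column
            self.cells[row][column][doc_id] = text

    def search(self, term):
        # Pick one row round-robin, then query every column in that row.
        row = self._next_row
        self._next_row = (self._next_row + 1) % self.rows
        hits = []
        for column in range(self.columns):
            for doc_id, text in self.cells[row][column].items():
                if term.lower() in text.lower().split():
                    hits.append(doc_id)
        return sorted(hits)


grid = IndexGrid(columns=3, rows=2)
grid.add("doc-1", "enterprise search overview")
grid.add("doc-2", "travel policy")
grid.add("doc-3", "search relevancy tuning")
matches = grid.search("search")
```

In this model, adding columns reduces the number of documents each cell holds, while adding rows lets more queries run in parallel against identical copies.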
Query processing is an important (and sometimes neglected) part of enterprise search. In SharePoint 2013, a new component provides options to enhance the user’s query to improve precision, recall and relevancy. Basic functions such as query spelling correction suggestions are included. “Query rules” can be applied, for example, to have a search request automatically trigger particular settings, to spawn multiple queries which are then aggregated, or to trigger a “best bet” (now called a “promoted result” in SharePoint 2013).
“Query Conditions” that cause rules to be triggered include the presence of specified words in the query. The Query Processing component also enables result sets returned by the search engine to be processed, according to rules, before being displayed to the user. Applications for this include additional search-time security trimming.
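The condition-and-action pattern behind query rules can be sketched in a few lines. This is a simplified model rather than SharePoint's query rule engine: the `QueryRule` and `QueryPlan` classes, the `apply_rules` function and the sample rules are all hypothetical, and the only condition modeled is the presence of specified words in the query.

```python
from dataclasses import dataclass, field


@dataclass
class QueryRule:
    trigger_words: frozenset  # condition: any of these words in the query
    promoted_results: list = field(default_factory=list)
    extra_queries: list = field(default_factory=list)


@dataclass
class QueryPlan:
    queries: list  # the original query plus any spawned by rules
    promoted_results: list


def apply_rules(query, rules):
    """Build a plan by applying every rule whose condition matches."""
    words = set(query.lower().split())
    plan = QueryPlan(queries=[query], promoted_results=[])
    for rule in rules:
        if words & rule.trigger_words:
            plan.promoted_results.extend(rule.promoted_results)
            plan.queries.extend(rule.extra_queries)
    return plan


# Hypothetical rules for illustration.
RULES = [
    QueryRule(frozenset({"holiday", "vacation"}),
              promoted_results=["/hr/leave-policy"]),
    QueryRule(frozenset({"training"}),
              extra_queries=["training videos"]),
]

plan = apply_rules("holiday booking", RULES)
```

The results of every query in the plan would then be aggregated, with the promoted results shown at the top.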
Different ranking models can be set up to suit different audiences, and these can be chosen at query time.
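As a rough illustration of choosing a ranking model at query time, the sketch below scores the same candidates with two different sets of weights. SharePoint's real ranking models are considerably more sophisticated; the model names, feature names, weights and candidate data here are invented for the example.

```python
# Hypothetical ranking models: each maps a feature name to a weight.
RANKING_MODELS = {
    "default": {"text_match": 1.0, "link_popularity": 0.5, "clicks": 0.5},
    "engineering": {"text_match": 1.0, "link_popularity": 0.2, "clicks": 1.0},
}

# Hypothetical candidates with precomputed feature values between 0 and 1.
CANDIDATES = [
    {"id": "A", "features": {"text_match": 0.8, "link_popularity": 0.9, "clicks": 0.1}},
    {"id": "B", "features": {"text_match": 0.7, "link_popularity": 0.1, "clicks": 0.8}},
]


def score(features, model_name="default"):
    """Weighted sum of the features named in the chosen model."""
    weights = RANKING_MODELS[model_name]
    return sum(weight * features.get(name, 0.0) for name, weight in weights.items())


def rank(candidates, model_name="default"):
    """Order candidates from highest to lowest score under the chosen model."""
    return sorted(
        candidates,
        key=lambda candidate: score(candidate["features"], model_name),
        reverse=True,
    )


default_order = [c["id"] for c in rank(CANDIDATES)]
engineering_order = [c["id"] for c in rank(CANDIDATES, "engineering")]
```

The two audiences see the same documents in a different order, because the engineering model in this example trusts click behavior more than link popularity.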
At Search Technologies, we are now fully engaged with a number of customers looking to upgrade early to SharePoint 2013. Watch this space for further updates.