SharePoint 2013 introduced a new type of crawl named "Continuous Crawl".

For old-school folks like me: in SharePoint 2010 we had two types of crawl available, both configurable on the Search Service Application.

  • Full: crawls all content,
  • Incremental: as the name says, crawls only content that has been modified since the last crawl.

The disadvantage of these crawls is that once one is running, you cannot launch a second crawl in parallel on the same content source. Content changed in the meantime therefore has to wait until the current crawl finishes and another one runs before it is integrated into the index and can be found via search.

An example:

  • An incremental crawl named ALFA is started and will take 50 minutes,
  • After 10 minutes of crawling a new document is added, so we need a second incremental crawl named BETA to get the document into the index,
  • This item will have to wait at least 40 minutes to be integrated into the index.

So we can’t keep the index up to date with the latest changes, because every crawl introduces latency.

In most cases this behavior may be perfectly acceptable for your clients, but for those who want to search their content immediately after it is added to SharePoint there is now a new solution in SharePoint: "Continuous Crawl".

The Continuous Crawl

To sum up: "Continuous Crawl" is a type of crawl that aims to keep the index as current as possible.

Its operation is simple: once activated, it launches a crawl at regular intervals. The major difference from an incremental crawl is that crawls can run in parallel: a new crawl does not wait for the previous one to complete before it launches.

Important Points:

  • "Continuous Crawl" is only available for content sources of type "SharePoint Sites",
  • By default, a new crawl is launched every 15 minutes, but the SharePoint administrator can change this interval using PowerShell (by setting the ContinuousCrawlInterval property of the Search Service Application),
  • Once started, a "Continuous Crawl" can’t be paused or stopped; you can only disable it.
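
As a minimal sketch of changing the interval, assuming a farm with a single Search Service Application: Microsoft's guidance for managing continuous crawls sets the interval through the ContinuousCrawlInterval property on the Search Service Application object. The value 5 below is only an illustration, not a recommendation.

```powershell
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search Service Application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Show the current interval (in minutes) before changing it.
$ssa.GetProperty("ContinuousCrawlInterval")

# Illustrative value only: launch a continuous crawl every 5 minutes.
$ssa.SetProperty("ContinuousCrawlInterval", 5)
```

A shorter interval means more crawls running in parallel, so it is worth watching the load on the crawl servers after changing it.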

If we take our example above with "Continuous Crawl":

  • Our ALFA crawl starts and will take at least 50 minutes,
  • After 10 minutes of crawling, an item that was already crawled is modified and requires a new crawl,
  • Crawl "BETA" does not have to wait for ALFA to finish,
  • Crawl "BETA" starts (15 - 10) = 5 minutes later,
  • Therefore this item will only need to wait 5 minutes (instead of at least 40 minutes) to be integrated into the index.
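
The arithmetic behind the two examples can be sketched in a few lines of PowerShell. This is a simplification of mine: it assumes ALFA starts at minute 0 on an interval boundary, and it ignores the time BETA itself needs to process the item.

```powershell
# Timings taken from the ALFA/BETA examples, in minutes.
$crawlDuration = 50   # how long ALFA runs
$changeAt      = 10   # when the item is added or modified
$interval      = 15   # default continuous crawl interval

# Incremental crawl: BETA cannot start until ALFA has finished.
$incrementalWait = $crawlDuration - $changeAt

# Continuous crawl: BETA starts at the next interval boundary, even if ALFA is still running.
$continuousWait = $interval - ($changeAt % $interval)

"Incremental: BETA starts after $incrementalWait minutes"
"Continuous:  BETA starts after $continuousWait minutes"
```

With the values from the examples, the incremental wait is 40 minutes and the continuous wait is 5 minutes.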

1 - How to enable it?

In Central Administration, click on your Search Service Application, and then in the menu click on "Content Sources".

Click on "New Content Source" in the menu.

Choose "SharePoint Sites".

Select "Enable Continuous Crawls".

  • Once the content source has been created, its status is shown as "Crawling Continuous".
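
The same setting can also be switched on from PowerShell. The sketch below enables it on an existing content source rather than creating a new one. It assumes a single Search Service Application and a content source named "Local SharePoint sites" (the default name; yours may differ).

```powershell
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search Service Application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Assumption: the content source name below; replace it with your own.
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# Continuous crawl is only available for content sources of type "SharePoint Sites".
if ($cs.Type -eq "SharePoint") {
    $cs.EnableContinuousCrawls = $true
    $cs.Update()
}
```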

 2 - How to disable it?

  • From the content source page, choose the "Enable Incremental Crawls" option. This will disable the continuous crawl.
  • Save changes.
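
For farms with several content sources, a loop in PowerShell may be quicker than the UI. This is a sketch under the assumption that there is a single Search Service Application and that you want to turn continuous crawl off everywhere it is currently on.

```powershell
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search Service Application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Find every "SharePoint Sites" content source that has continuous crawl enabled.
$sources = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa |
    Where-Object { $_.Type -eq "SharePoint" -and $_.EnableContinuousCrawls }

foreach ($cs in $sources) {
    # Turning the flag off disables the continuous crawl for this content source.
    $cs.EnableContinuousCrawls = $false
    $cs.Update()
    "Continuous crawl disabled on: $($cs.Name)"
}
```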

3 - How to see if it works?

  • Click on your Search Service Application, then on "Crawl Log" in the "Diagnostics" section.
  • Select your Content Source and click on "View crawl history".
  • Or, via PowerShell, execute the following cmdlets (replace "Search Service" with the name of your Search Service Application):
  • $SearchSA = "Search Service"
    • Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $SearchSA | select *
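
Since "select *" returns every property, a narrower view can be easier to read. The sketch below picks the properties I find most useful for this check. It assumes a single Search Service Application.

```powershell
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search Service Application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# One row per content source: is continuous crawl on, and what is the crawler doing right now?
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa |
    Select-Object Name, Type, EnableContinuousCrawls, CrawlStatus, CrawlStarted, CrawlCompleted |
    Format-Table -AutoSize
```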

Impact on our Servers

The impact of a "Continuous Crawl" is the same as that of an "Incremental Crawl".

Even when crawls run in parallel, the "Continuous Crawl" stays within the parameters defined in the "Crawler Impact Rule", which controls the maximum number of simultaneous requests that can be made to the server (default 8).

Note: this setting does not restrict the Content Processing component, only the rate at which links are added to the Crawl Queue.
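
To check which crawler impact rules are currently defined, something like the following should work. It is a read-only sketch and passes the farm's search service explicitly. If it returns nothing, no custom rule exists and the default applies.

```powershell
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Crawler impact rules (site hit rules) are defined at the search service level.
$searchService = Get-SPEnterpriseSearchService

# List every rule with all of its properties; this does not change anything.
Get-SPEnterpriseSearchSiteHitRule -SearchService $searchService | Format-List *
```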


Content Processing uses 3 threads per core by default (called Processing Flows). To restrict Content Processing impact, use PowerShell to set the NumberOfCssFeedersPerCPUForRegularCrawl property on the Search Service Application object.


See http://blogs.technet.com/b/searchguys/archive/2013/02/19/content-processing-performance-scaling.aspx


4 - SharePoint Online

This feature is available in SharePoint Online 2013 (Office 365). You can read about it here: http://technet.microsoft.com/en-us/library/jj819291.aspx