Create a Content Source for only one HNSC Site Collection

  • Question

  • Hi there

    I have a curious issue with Search in an HNSC (host-named site collection) implementation.
    I have an HNSC Site Collection that I need Search to crawl more frequently than the rest.
    I created a new Content Source and added just the URL for that HNSC Site Collection.
    Now, on viewing the Crawl Logs, I see:
    "This URL is part of a host header SharePoint deployment and the search application is not configured to crawl individual host header sites. This will be crawled as a part of the host header Web application if configured as a start address."

    So far, all the guidance I can find online says to add the HNSC Web App's URL to the content source, which then crawls everything in that Web App. That makes it an all-or-nothing approach! I can't see any way to configure different Content Sources, with different schedules, for individual Site Collections in an HNSC implementation. If that's the case, it is a serious shortfall of going the HNSC route. Has anyone dealt with this before and found a workaround?
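
    The behaviour described above can be sketched as a toy model. This is only an illustration of what the crawl log message implies, not SharePoint's actual implementation; `WebApp`, `crawl`, and the example URLs are hypothetical names made up for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass

# Message text as quoted from the crawl log in this thread.
HNSC_WARNING = (
    "This URL is part of a host header SharePoint deployment and the search "
    "application is not configured to crawl individual host header sites. "
    "This will be crawled as a part of the host header Web application if "
    "configured as a start address."
)


@dataclass
class WebApp:
    """Hypothetical stand-in for a web application and its site collections."""
    url: str
    site_collections: list[str]
    host_named: bool = False


def _norm(url: str) -> str:
    # Compare URLs case-insensitively and without a trailing slash.
    return url.strip().rstrip("/").lower()


def crawl(start_address: str, web_apps: list[WebApp]) -> dict:
    """Return what one start address would cover in this toy model."""
    start = _norm(start_address)
    for app in web_apps:
        sites = [_norm(s) for s in app.site_collections]
        if start == _norm(app.url):
            # The web application URL as start address: everything is crawled.
            return {"crawled": sites, "warning": None}
        if start in sites:
            if app.host_named:
                # An individual host-named site collection: nothing is crawled.
                return {"crawled": [], "warning": HNSC_WARNING}
            # A path-based site collection can be crawled on its own.
            return {"crawled": [start], "warning": None}
    return {"crawled": [], "warning": "unknown start address"}
```

    In this model, pointing a content source at `http://intranet.example.com` (a host-named site collection) yields only the warning, while pointing it at the hosting web application's URL returns every site collection in it, which is the all-or-nothing behaviour in question.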

    Thursday, June 4, 2015 3:35 PM

Answers

  • Hi,

    I ran a test in my SharePoint 2013 environment and reproduced your issue.

    Unfortunately, I am afraid there is no way to set up different HNSC Site Collections to run under different crawl schedules.

    Best Regards,

    Lisa Chen


    TechNet Community Support
    Please remember to mark the replies as answers if they help, and unmark the answers if they provide no help. If you have feedback for TechNet Support, contact tnmff@microsoft.com.


    Thursday, June 11, 2015 2:25 AM

All replies

  • Hi,

    We are currently looking into this issue and will give you an update as soon as possible.
     
    Thank you for your understanding and support.

    Best Regards,

    Lisa Chen



    Saturday, June 6, 2015 6:37 AM
  • Hi,

    From your description, my understanding is that you want to crawl only one HNSC Site Collection which is used more regularly.

    When you create a new content source and add just the URL for the HNSC Site Collection, you get an error.

    I suggest you create a new content source with "Web Sites" content source type, then just add in the URL for the HNSC Site Collection.

    Also, take a look at these similar threads:
    https://social.technet.microsoft.com/Forums/sharepoint/en-US/d6c35a5d-d8c8-4e6b-a375-7a70366c9994/only-crawl-one-site-collection?forum=sharepointsearchlegacy

    https://social.technet.microsoft.com/Forums/office/en-US/086507eb-e371-4d16-9eb5-43dc8be3ad6f/search-only-one-site-collection?forum=sharepointsearchlegacy

    Hope this helps.

    Best Regards,

    Lisa Chen 



    Tuesday, June 9, 2015 8:12 AM
  • Hi Lisa

    Thanks for your response.

    I tried creating a new Content Source of type Web Sites and adding the required addresses. However, this gave the same result: SharePoint seems able to tell that this is a local SharePoint address and treats it as before, logging the same message:

    "This URL is part of a host header SharePoint deployment and the search application is not configured to crawl individual host header sites. This will be crawled as a part of the host header Web application if configured as a start address."

    I reviewed your two links. Neither appears to relate to HNSC Site Collections, which is where the issue lies (I have no problem setting up new Content Sources for non-HNSC Site Collections).

    Any thoughts would be greatly appreciated.

    Thanks

    David

    Tuesday, June 9, 2015 8:44 AM
  • As far as I know, you're limited to one crawl schedule per web application. That may be an issue if you've got a site collection (SC) that needs to be crawled at a very high frequency and other SCs that change a lot but don't need to be as fresh. Normally, however, using something like continuous crawls keeps most farms fresh enough.
    Tuesday, June 9, 2015 8:57 AM
  • Hi Alex

    Thanks for the response; Continuous Crawl is not always suitable due to the weight it brings.

    The issue only exists for HNSC implementations. In non-HNSC implementations there is no problem creating two Content Sources and adding one Site Collection from a Web App to the first and another Site Collection from the same Web App to the second.

    Regards

    David

    Tuesday, June 9, 2015 1:10 PM
  • Hi,

    Based on your description, I ran a test in my SharePoint environment and got the same issue as you.

    The only solution I found was to put the prioritized site collection in one content source and the other site collections in another one. The downside of this is that each time we create a new site collection we have to add it to the content source.
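
    The bookkeeping downside mentioned above can be eased with a small check. This is a minimal sketch that only compares lists of URLs; it does not talk to SharePoint, and `uncovered_site_collections`, the content source names, and the example URLs are hypothetical. How you obtain the two input lists from your farm is left open.

```python
def normalize(url: str) -> str:
    """Compare URLs case-insensitively and without a trailing slash."""
    return url.strip().rstrip("/").lower()


def uncovered_site_collections(site_collections, content_sources):
    """List site collections that no content source has as a start address.

    site_collections: iterable of site collection URLs in the farm.
    content_sources:  dict mapping a content source name to its start addresses.
    """
    covered = {
        normalize(address)
        for addresses in content_sources.values()
        for address in addresses
    }
    # Preserve the caller's ordering so the report is easy to read.
    return [url for url in site_collections if normalize(url) not in covered]


if __name__ == "__main__":
    sources = {
        "Priority": ["http://sp/sites/finance/"],
        "Everything else": ["http://sp/sites/hr"],
    }
    farm = ["http://sp/sites/finance", "http://sp/sites/HR", "http://sp/sites/new"]
    for url in uncovered_site_collections(farm, sources):
        print("Not in any content source:", url)
```

    Running a check like this after each new site collection is created would flag anything still missing from both content sources.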

    Best Regards,

    Lisa Chen



    Wednesday, June 10, 2015 7:13 AM
  • Hi Lisa

    I would be fine with that approach if it worked. If you are testing with Site Collections in an HNSC Web App, try removing ALL Content Sources and then adding just two new ones: one for Site Collection1, which is in the HNSC Web App, and a second for Site Collection2, which is also in the HNSC Web App.

    Now, when you run a Full Crawl, you will see there are no Success or Error items, just a single entry for each of those Scopes in the Crawl Log, each stating:

    "This URL is part of a host header SharePoint deployment and the search application is not configured to crawl individual host header sites. This will be crawled as a part of the host header Web application if configured as a start address"

    So far, I still know of no way, manual or otherwise, to set up different HNSC Site Collections to run under different crawl schedules.

    Thanks again for your consideration

    David

    Wednesday, June 10, 2015 8:50 AM