Search Crawl exclusions

  • Question

  • Does MS or anyone here recommend that the home or root page of a crawl source be excluded from the crawl?

    I recall that at a company where I worked, the main (home) page was excluded from the crawl because, if it failed to crawl or the crawler didn't find it, none of the pages below the page the gatherer couldn't find would get indexed for that run.

    Is this accurate?

     

    Friday, November 4, 2011 9:54 PM

All replies

  • Hi,

    Please explain "root page of crawl source".

    In our environment we crawl all sites and pages except Central Admin and the SSP.


    Thanks, Rahul Rashu
    Saturday, November 5, 2011 6:38 AM
  • Hi,

     

    You can check whether you have added the URL of this root site to AAM (Alternate Access Mappings).

     

    Please go to central administration>options>alternate access mappings to check this.

     

    Hope this helps.

     

    Thanks,

    Pengyu Zhao

    Monday, November 7, 2011 5:41 AM
  • So you should exclude the root site?

     

    I mean the default.aspx of a large site collection, the logic being that if that page can't be crawled (IIS reset, etc.), then all pages below it would fail?

     

    Monday, November 7, 2011 3:08 PM