Search only crawling pages - not crawling all list/library content

  • Question

  • In some of our SP2010 environments, the SharePoint crawler crawls only a small set of content and does not completely crawl all of the lists/libraries that we need it to.  With what appear to be identical settings, this works on other servers; we are unable to determine why it fails on some servers.

    We've tried many things, from checking crawl rules to using alternate access mappings, but the issue remains unresolved. No related errors are shown in Event Viewer or the SharePoint logs.

    We noticed a few things in fiddler that are interesting:
    On servers where everything works, there are a large number of requests to the web service (sitedata.asmx), which is how SharePoint gets the list/library data and URLs.
    On the servers that are not working correctly, no requests are made to sitedata.asmx, but we do see a single request to spdisco.aspx, and, even stranger, we see random requests to "<site path>/pages/none" for each page library in the site hierarchy (not blocked by rules).

    Anyone ever see something like this?
    It seems that SharePoint can't or doesn't recognize our content source URL as a SharePoint site collection?

    Friday, February 17, 2012 5:25 PM

Answers

  • Update:

    Root cause found - and wanted to provide an update in case anyone encounters an issue like this.

    It turns out our site was failing to crawl because HTTP headers were being overwritten.  Somewhere in the history of our site, a developer turned on IE8 Compatibility mode, and in doing so caused the "inherited" HTTP headers that identify the site as a SharePoint site to be removed.  When search crawls your site, it looks for this header before crawling your site content via the web service as described above.  If you see an issue like this, make sure your web pages return a response that contains the correct header:

    MicrosoftSharePointTeamServices: <version>
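
One way to check for the header without opening Fiddler is a small script along these lines. This is only a sketch, not something from the original troubleshooting: the helper names (sharepoint_version, fetch_headers) are made up for illustration, it assumes the page allows anonymous access (a site that requires Windows authentication would need a client that can authenticate), and the URLs are whatever you pass on the command line.

```python
from typing import Mapping, Optional
from urllib.request import Request, urlopen

# Header name taken from the post above; HTTP header names are case-insensitive.
SHAREPOINT_HEADER = "MicrosoftSharePointTeamServices"


def sharepoint_version(headers: Mapping[str, str]) -> Optional[str]:
    """Return the value of the SharePoint header if present, else None."""
    for name, value in headers.items():
        if name.lower() == SHAREPOINT_HEADER.lower():
            return value.strip()
    return None


def fetch_headers(url: str, timeout: float = 10.0) -> dict:
    """Fetch the response headers for url with a plain, unauthenticated GET."""
    request = Request(url, method="GET")
    with urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_header.py <url> [<url> ...]
    for url in sys.argv[1:]:
        version = sharepoint_version(fetch_headers(url))
        print(f"{url}: {version or 'header missing'}")
```

Running it against a page on a working server and a page on a failing server should show whether the header is present on one and missing on the other.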


    • Marked as answer by Tfrey Friday, April 20, 2012 7:01 PM
    Friday, April 20, 2012 7:01 PM

All replies

  • We experienced similar behavior once, but that was when we were trying to bring in content in bulk from an external source. The issue was resolved by fixing the import script.

    Also, it's a no-brainer, but do check your AAMs for the web application hosting the site collection.


    I want to believe

    • Marked as answer by Rock Wang– MSFT Saturday, February 25, 2012 6:47 AM
    • Unmarked as answer by Tfrey Friday, April 20, 2012 7:01 PM
    Friday, February 17, 2012 7:41 PM
  • This was the exact issue for us. We added the header back in the HTTP Response Headers section in IIS 7.5 for the web application, and our problem was resolved.

    MicrosoftSharePointTeamServices: <version>

    Thank you. 

    Friday, November 15, 2013 9:21 PM
  • A HUGE Thanks a couple of years later - this solved the same issue we were having in SharePoint 2013 on an external anonymous site.  Adding the header back in allowed the crawl to work correctly!

    Thanks for documenting this!

    Bill

    Thursday, July 31, 2014 2:50 AM