Unable to Crawl a content source (MOSS2007)

  • Question

  • I have recently configured search in my SP environment, which has 2 instances. One instance is running fine.

    I am facing issues on the second instance, which is a single SP server with 1 SQL server.

    Issue - I get the below-mentioned error whenever I try to crawl the content source for my web application (we have only 1 application in this instance), although the SSP and another test application crawl fine. The difference between the test application and the one in question is that the test application does not use a host header, while my application does.

    --------------------------------------------------------------------

    Event Type: Warning
    Event Source: Office Server Search
    Event Category: Gatherer
    Event ID: 2436
    Date:  31/10/2011
    Time:  14:28:00
    User:  N/A
    Computer: XYZKN1BW
    Description:
    The start address <xxxxxxxxx> cannot be crawled.

    Context: Application 'SSP_Stage', Catalog 'Portal_Content'

    Details:
     The specified address was excluded from the index. The crawl rules may have to be modified to include this address.

    -----------------------------------------------------------------------

    I have already tried some resolutions (found via Google):

    1 - Disable loop back (registry)

    2 - checked Default content account

    3 - No changes to permissions for this account

    4 - checked that txt file type is not blocked 

    Any help/suggestion is appreciated...!!


    Shiv Yadav @ MSCA
    • Edited by Shiv Yadav Monday, October 31, 2011 8:27 PM
    Monday, October 31, 2011 8:25 PM
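
For reference, the "Disable loop back (registry)" step in the question usually means setting the DisableLoopbackCheck DWORD to 1 under the Lsa key. The following is a minimal Python sketch that only renders the text of a .reg file for that value; it does not touch the registry. The helper name render_reg_file is hypothetical, and the sketch assumes the resulting file would be reviewed and imported manually on the crawl server.

```python
# Sketch: build the text of a .reg file that sets DisableLoopbackCheck = 1.
# Nothing here modifies the registry; the output is plain text to review.

LSA_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa"


def render_reg_file(key, dword_values):
    """Return .reg file text that sets the given DWORD values under key.

    render_reg_file is a hypothetical helper written for this example.
    """
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in dword_values.items():
        # .reg files store DWORDs as 8 hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


reg_text = render_reg_file(LSA_KEY, {"DisableLoopbackCheck": 1})
print(reg_text)
```

A restart of the server (or at least of IIS) is typically needed before the change takes effect.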

Answers

  • Which authentication does the site use? Is it Basic Authentication?

    Extend the web application on another port, add the extended site's URL to AAM, and then perform a full crawl.


    Regards, Vishwas
    • Edited by Vishwas B Monday, November 7, 2011 1:30 AM modified
    • Marked as answer by Qiao Wei (Moderator) Friday, November 11, 2011 11:37 AM
    Monday, November 7, 2011 1:30 AM

All replies

  • Hi Shiv,

    What is the URL of the site? Does it use SSL, i.e. is it https or http?

    If it uses https, make sure to add the URL to the content source with http, and also check the AAM settings. Check the default zone URL; it should use http.

    Please let us know if this helps.

    thanks.


    Warm Regards, Pratik Vyas | SharePoint Consultant | http://sharepointpratik.blogspot.com/ | Posting is provided "AS IS" with no warranties, and confers no rights
    Tuesday, November 1, 2011 3:14 AM
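
The check suggested above (the content source start address should agree with the default zone URL in AAM) can be illustrated with a small Python sketch. It only compares two URL strings; it does not read anything from SharePoint. The function names and the portal.example.com host are hypothetical placeholders, and the default zone URL is assumed to be copied by hand from the AAM page in Central Administration.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(url):
    """Reduce a URL to (scheme, host, port) for comparison."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def matches_default_zone(start_address, default_zone_url):
    """True if the crawl start address and the default zone URL agree."""
    return normalize(start_address) == normalize(default_zone_url)


# Hypothetical values: the start address uses http, the default zone https.
print(matches_default_zone("http://portal.example.com",
                           "https://portal.example.com"))
```

If the two disagree in scheme, host, or port, either the start address or the default zone URL needs to change before the next full crawl.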
  • Hi,

    This particular application is secured via SSL and ADFS (Https://xxx.xxxxx.com). I have added it to the content source as http://xxx.xxxxx.com, but I am getting two different errors, as mentioned below:

    Crawl logs:

    Earlier, I was getting: "The specified address was excluded from the index. The crawl rules may have to be modified to include this address. (The item was deleted because it was either not found or the crawler was denied access to it.)"

    Now, I am getting: "Access is denied. Check that the Default Content Access Account has access to this content, or add a crawl rule to crawl this content."

    Event viewer:

    The start address <http://xxx.xxxxx.com> cannot be crawled.

    Context: Application 'SSP_application', Catalog 'Portal_Content'

    Details:

    Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository. If the repository being crawled is a SharePoint repository, verify that the account you are using has "Full Read" permissions on the SharePoint Web Application being crawled. (0x80041205)


    Shiv Yadav @ MSCA
    Tuesday, November 1, 2011 9:06 AM
  • I am having the exact same problem. Search was working fine until I created a new https web application, and now I am getting this error in the SSP search log:

    "Access is denied. Check that the Default Content Access Account has access to this content, or add a crawl rule to this content. (The item was deleted because it was either not found or the crawler was denied access to it.)"

    And corresponding event id: 2436 in the Application Log.

    Source: Windows SharePoint Server, Category: Gatherer, Type: Warning

    -----------------------------------------------------------------------------------------------------------------

    Like Shiv I have tried:

    1 - Disable loop back (registry)

    2 - checked Default content account (logged in to site as default content account to verify)

    3 - No changes to permissions for this account (account has limited permissions per best practices)

    4- Created new SSP and tried using it for search.

    5- Changed the default AAM zone to http for my https site and added an intranet zone for https. (all other sites only have default zones and are http only).

    -----------------------------------------------------------------------------------------------------------------

    From what I have read online, this error can be caused by an issue with AAM, but I have not been able to resolve it. I am hoping to figure it out without calling MS Support.

     

    Thanks!

    Kevin

     

    Tuesday, November 1, 2011 2:05 PM
  • One more thing to add: I created one more application and tried to crawl it, and it works fine. All other applications are working fine apart from the one that I need.

    The difference between them and this one is that their URLs are the server name followed by a port number, whereas this one has a proper site address, say https://abc.domain.com/ (a host header, I think).

    Now I am getting close to the deadline...ahhh


    Shiv Yadav @ MSCA
    Tuesday, November 1, 2011 3:01 PM