Office Search Service is working on all web applications, except one.

  • Question

  • Hi,

    We have a WSS 3.0 Farm with Office Search Service.

    One WFE, one Search Server and one database server.

    We have 10 web apps, and Search is not working in only one of them.

    I ran a full crawl on the nine that work and they all finished.

    The one that doesn't work is by far the biggest. Its last full crawl took 300 hours and apparently failed, although 98% of the log entries are successful (see some of the logged error messages below).

    The content database is split into 6 MDFs, and the largest one is 240 GB.

    The strange thing is that Search used to work until last week, except for a few site collections.

    We were trying to fix the problem on those site collections, and now Search doesn't work on the whole web app, although it still works on the other nine web apps.

    Our Default Content Access Account is called SPUSER. Even though it is set as site collection admin on every site collection, it doesn't get full control access to the content. On the web applications that work, it has FULL CONTROL.

    Is there such a thing as a web app admin?

    We checked the User Mapping privileges in the database, and the account has db_owner on all of them.

    How can the service work in 9 web applications and not work in one in the same Farm?

    The current crawl has been running for 3 hours, and I estimate it will take 16 hours.

    So far I have:

    Successfully Crawled 527241
    Warnings 6757
    Errors 57 * see below
    Total 534055 (The full crawl has 2.5 million items; it is still running)

    If I navigate to the site collections marked as successful, I still can't find anything through their search.

    On the other web apps, search starts working while the crawl is running, and I start finding things within minutes.

    Please help

    Thank you very much

    Crawling this document could not be completed because the remote server did not respond within the specified timeout. Try crawling the server later, or increase the timeout value. You may also try scheduling crawls during off-peak usage times.
    Error in the Site Data Web Service.
    Error in the Site Data Web Service. (*** Client found response content type of 'text/html; charset=utf-8', but expected 'text/xml'. The request failed with the error message: -- <HTML dir="ltr"> <HEAD><meta name="GENERATOR" content="Microsoft SharePoint" /><meta name="progid" content="SharePoint.WebPartPage.Document" /><meta HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8" /><meta HTTP-EQUIV="Expires" content="0" /><meta name="ROBOTS" content="NOHTMLINDEX" /><title> Error </title><link rel="stylesheet" type="text/css" )
    Exception from HRESULT: 0x81020071 (Access to this site has been blocked. Contact the administrator to resolve the problem.)
    Exception from HRESULT: 0x81020071 (Access to this site was blocked
    Exception from HRESULT: 0x81070504 (Web doesnt exist  "/academiadlo".)
    Exception from HRESULT: 0x81070504 (Web doesnt exist "/celula535".)
    One of the fields on the SharePoint List does not exist or has become corrupt.
    The docfile has been corrupted.
    The filename or extension is too long.
    The filtering process could not be initialized. Verify that the file extension is a known type and is correct.
    The item could not be accessed on the remote server because its address has an invalid syntax.
    The item may be too large or corrupt. You may also verify that you have the latest version of this IFilter.
    The parameter is incorrect.
    Error HRESULT E_FAIL has been returned from a call to a COM component. (Cannot complete this action. Please try again.)
    Error in the application. (The field type SharePointVideoPlayer is not installed properly. Go to the list settings page to delete this field.)
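
    To separate site-level failures from per-item ones, the messages above can be tallied by kind. This is only a minimal Python sketch: the `bucket` helper, its category names, and its matching rules are assumptions made for illustration, not part of any SharePoint tooling, and the sample list is a subset of the messages copied from this post.

```python
import re
from collections import Counter

# Matches the hexadecimal code in "Exception from HRESULT: 0x81020071 (...)".
HRESULT = re.compile(r"HRESULT: (0x[0-9A-Fa-f]{8})")


def bucket(message: str) -> str:
    """Assign a crawl log message to a coarse, hypothetical category."""
    match = HRESULT.search(message)
    if match:
        return match.group(1)          # group by HRESULT code
    if "Site Data Web Service" in message:
        return "site-data-web-service"
    if "timeout" in message:
        return "timeout"
    return "item-level"                # corrupt files, long names, and so on


# Subset of the error messages quoted in this post.
messages = [
    "Crawling this document could not be completed because the remote "
    "server did not respond within the specified timeout.",
    "Error in the Site Data Web Service.",
    "Exception from HRESULT: 0x81020071 (Access to this site was blocked",
    'Exception from HRESULT: 0x81070504 (Web doesnt exist "/academiadlo".)',
    'Exception from HRESULT: 0x81070504 (Web doesnt exist "/celula535".)',
    "The docfile has been corrupted.",
    "The filename or extension is too long.",
]

counts = Counter(bucket(m) for m in messages)
for kind, n in counts.most_common():
    print(f"{n:3d}  {kind}")
```

    Grouping this way keeps the blocked-site and missing-web errors, which affect whole sites, apart from errors that only affect single documents.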

    L. Skynyrd

    • Edited by J.Clua Thursday, September 26, 2013 5:12 PM
    Thursday, September 26, 2013 5:11 PM

All replies

  • Hmm, at one of my organizations we had about 2.5 million items adding up to about 1.5 TB, and a full crawl took about 24 hours. Hardware is a consideration, of course, but if yours is not strained, these numbers should roughly apply. 300 hours seems like way too much time unless you have something like 10 TB of data.
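
    As a rough check on the figures quoted in this thread, the throughput arithmetic can be sketched in Python. The helper name `items_per_hour` is hypothetical, and the projection assumes a constant crawl rate, which real crawls rarely hold:

```python
def items_per_hour(items: int, hours: float) -> float:
    """Average crawl throughput in items per hour."""
    return items / hours


TOTAL_ITEMS = 2_500_000  # size of a full crawl, as quoted in the thread

reference_rate = items_per_hour(TOTAL_ITEMS, 24)   # 24-hour full crawl
failed_rate = items_per_hour(TOTAL_ITEMS, 300)     # 300-hour full crawl
current_rate = items_per_hour(534_055, 3)          # crawl in progress: 3 hours so far

# Hours the crawl in progress would need at its current average rate.
projected_hours = TOTAL_ITEMS / current_rate

print(f"24-hour crawl:     {reference_rate:,.0f} items/hour")
print(f"300-hour crawl:    {failed_rate:,.0f} items/hour")
print(f"crawl in progress: {current_rate:,.0f} items/hour")
print(f"projected total:   {projected_hours:.1f} hours")
```

    Under these assumptions, the 300-hour crawl ran at roughly one twelfth of the 24-hour rate, while the crawl in progress projects to about 14 hours in total.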

    The existence of <meta name="ROBOTS" content="NOHTMLINDEX" /> implies that you may have excluded this site from being crawled in Site Search Settings.

    The user SPUSER should not be given any permissions at the site collection or site level. By default, SharePoint adds the default content access account using a Full Read web application policy. Can you look at the web application policies for all the web applications and make sure that SPUSER is added with Full Read?

    Please mark my response as an answer if appropriate.

    Thursday, September 26, 2013 5:22 PM
  • Hi,

    Where should I look for this tag:

     <meta name="ROBOTS" content="NOHTMLINDEX" /> ?

    Some site collections have SPUSER as site collection admin, so I am removing it as I find it.

    SPUSER is also a Farm Admin. Is that OK?

    Thank you

    L. Skynyrd

    • Edited by J.Clua Thursday, September 26, 2013 5:42 PM
    Thursday, September 26, 2013 5:41 PM
  • look here


    SPUSER should NOT be a farm admin. The default content access account should only be used for crawling content. As the admin, you should never actually give it any permissions; SharePoint will automatically give the account the minimum permissions it needs.

    Thursday, September 26, 2013 5:44 PM
  • Thanks,

    I have another account called SQLUSERSQL, which might have been created for this purpose; someone must have messed things up.

    The problem is that when I try to change the logon account for the Office SharePoint Server Search service to another account, I get this error:

    Error 6: The handle is invalid.


    I also have SPUSER set as the SSP user. Should this user be the same as the Default Content Access Account?

    L. Skynyrd

    Thursday, September 26, 2013 6:44 PM