Errors When Crawling Large Document Libraries

  • Question

  • I have a document library that is on the larger side: approximately 280,000 documents and images, broken up into subfolders. Most of the documents are stored in two main folders, each with about 100,000 documents. I'd like to break them into smaller subfolders, but because of an interface with another system this cannot be done.

    When I run a crawl of the document library, it sometimes fails on the two folders with 100k documents, with the following error:

    The item may be too large or corrupt. You may also verify that you have the latest version of this IFilter.

    Sometimes it will index the folders and other times it won't, which makes it challenging to determine where the problem lies. The document library is nowhere near the 10 million file limit. Has anyone else run into this issue?
    Thursday, July 3, 2008 9:12 PM

All replies

  • Did you ever figure this issue out?  I am having similar issues but I don't seem to find many answers.
    Monday, April 27, 2009 8:31 PM
  • Me too.

    So have you found a proper solution for this?

    Thanks,
    Andy

    Thursday, August 18, 2011 11:18 AM
  • Hi,

    First of all, I would suggest not storing this much data in a single document library, but spreading it across several libraries instead.

    Next, check the disk drive where your index files are stored. Make sure that at least 30-35% of its space is always free.
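
    If you want to script that free-space check, here is a minimal Python sketch using the standard shutil module. The index path shown is a placeholder assumption; substitute the drive or folder your Search service actually uses for its index files.

        import shutil

        # Placeholder path -- replace with the drive/folder that holds
        # your search index files.
        INDEX_PATH = r"D:\Office Server\Data"

        usage = shutil.disk_usage(INDEX_PATH)
        free_pct = usage.free / usage.total * 100
        print(f"Free space at {INDEX_PATH}: {free_pct:.1f}%")
        if free_pct < 30:
            print("Warning: below the 30-35% free-space guideline; "
                  "crawls may start failing.")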

    Thanks, Rahul Rashu
    Friday, August 26, 2011 7:28 AM
  • Try setting the values below under the registry key "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager":

    FilterProcessMemoryQuota: 6400000
    DedicatedFilterProcessMemoryQuota: 6400000

    - Restart the OSearch service.
    - Perform a full crawl.
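
    If it helps to script those first two steps, below is a minimal Python sketch using the standard winreg and subprocess modules. It is a sketch under assumptions: the quota values are the ones quoted above, "OSearch" is assumed to be the MOSS 2007 search service name, and it must run elevated on the index server.

        import subprocess
        import winreg

        KEY_PATH = (r"SOFTWARE\Microsoft\Office Server\12.0"
                    r"\Search\Global\Gathering Manager")

        # Write both quota values (REG_DWORD) suggested in this thread.
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                            winreg.KEY_SET_VALUE) as key:
            winreg.SetValueEx(key, "FilterProcessMemoryQuota", 0,
                              winreg.REG_DWORD, 6400000)
            winreg.SetValueEx(key, "DedicatedFilterProcessMemoryQuota", 0,
                              winreg.REG_DWORD, 6400000)

        # Assumption: "OSearch" is the Office SharePoint Server Search
        # service name; restart it so the new quotas take effect.
        subprocess.run(["net", "stop", "osearch"], check=True)
        subprocess.run(["net", "start", "osearch"], check=True)

    After the service restarts, start the full crawl from the Search administration pages as usual.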


    Regards, Vishwas
    Saturday, October 22, 2011 9:44 PM