capacity question

  • Question

  • When one has a single collection (the default "sp"), FAST creates a series of directories within the "data" directory of the FAST installation tree.

    My question is: is the number of directories equal to the number returned by Get-FastCollectio...

    I was crawling happily along for over 200 hours and had a single content collection of 2.1 million items, and now I am starting to see all sorts of disk-related errors.

    Should I have broken my crawls into separate content collections?


    Friday, February 24, 2012 2:43 PM

All replies

  • Hi,

    The default is to use one collection, and it does not map to the directory structure. A content collection is merely a logical grouping that adds an extra field, "meta.collection", to the items you are indexing.
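
    To illustrate the point about collections being purely logical, here is a toy Python model. It is not FAST's API or storage format; the item dictionaries, the "archive" collection, and the `items_in_collection` helper are made up for illustration. The collection is just a value carried on each item, so selecting a collection is a filter on that field rather than a lookup in a separate directory:

```python
# Toy model only (assumption): each indexed item carries its collection
# name in a "meta.collection" field, as described in the reply above.
items = [
    {"id": "doc1", "meta.collection": "sp"},
    {"id": "doc2", "meta.collection": "sp"},
    {"id": "doc3", "meta.collection": "archive"},  # hypothetical second collection
]


def items_in_collection(all_items, name):
    """Return the items whose meta.collection field equals `name`."""
    return [item for item in all_items if item.get("meta.collection") == name]


sp_items = items_in_collection(items, "sp")
print([item["id"] for item in sp_items])
```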

    You should be able to handle up to 10-15 million items on one server without having to worry about folder layout. My question is, what kind of disk-related errors are you seeing?

    Mikael Svenson

    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4

    Friday, February 24, 2012 7:45 PM
  • Just to add to this, below are capacity planning guidelines that you may find helpful:

    >>As a general guideline you should deploy one index column per 15 million items to index.

    >>The content basis for this general guideline is a mix of 70% SharePoint items and 30% items from file shares where the source item size is between 10 and 500 kilobytes (100-250 on average).

    Your numbers are way below that, so it would help to know what errors you are seeing. Technically, there is no reason why you couldn't have more than 2.1 million docs on a single index column with the "sp" collection (the default collection).
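
    As a rough sketch of the arithmetic behind that rule of thumb (the helper name `index_columns_needed` is hypothetical, and the 15 million figure is only the general guideline quoted above, not a hard limit):

```python
# Guideline figure quoted above: one index column per 15 million items.
ITEMS_PER_COLUMN = 15_000_000


def index_columns_needed(item_count, items_per_column=ITEMS_PER_COLUMN):
    """Hypothetical helper: estimate index columns from an item count."""
    if item_count < 0:
        raise ValueError("item_count must be non-negative")
    # Integer ceiling division; always plan for at least one column.
    return max(1, -(-item_count // items_per_column))


current_items = 2_100_000  # the collection size mentioned in the question
print(index_columns_needed(current_items))
print(f"{current_items / ITEMS_PER_COLUMN:.0%} of one column's guideline capacity")
```

    The guideline assumes the content mix and item sizes described in the quote, so treat the result as a starting estimate only.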

    Igor Veytskin

    Monday, February 27, 2012 2:13 PM
  • Thank you for your replies. It appears that I have a bum RAID controller. I am going to replace the card and then redo the crawl.

    Also, thanks for the guidelines/rule of thumb.


    Monday, February 27, 2012 4:20 PM
  • I suppose one thing I should mention is that my ratio of SharePoint content to file share content is more like 1% from SharePoint and 99% from file shares. Does this have an impact on the directory tree within the FAST data directory?
    Monday, February 27, 2012 4:52 PM
  • I think this will really depend on your average document size, so whether you have mainly file share content or SharePoint content should not make as much of a difference as the average size.

    If this is your first full crawl, it will be a good benchmark test as to how many documents you can get into the index based on your specific content, once you sort out your disk issues.

    Igor Veytskin

    Tuesday, February 28, 2012 3:31 PM