Search Size in SharePoint 2013

  • Question

  • My question is about a FILE SHARE content source. We have a big project coming up in which we will have tons of data to crawl from a file share, but that is in the long run, not now. Currently we have a file share of 25 GB; I have crawled it and can see 13,000 items crawled. How do I find the size of these items in SharePoint? As the content in the file share grows, the search data will grow too, so how do I calculate search size, and which search database should I plan for in terms of storage? Do I need to be concerned about index size alone? If so, how do I determine the index size based on the number of items, and which database is responsible for storing search data: the crawl database or some other search component database?


    Thanks, Ram Ch

    Thursday, June 25, 2015 6:27 AM

Answers

  • Hi,

    The Search service application in SharePoint 2013 uses four databases:

    • Search Administration: hosts the Search service application configuration and the access control list (ACL) for the crawl component.
    • Analytics Reporting: stores the results for usage analysis reports and extracts information from the Link database when needed.
    • Crawl: stores the state of the crawled data and the crawl history.
    • Link: stores the information extracted by the content processing component, along with click-through information.

    The following limits should help you plan and size the Search service application.

    For more information: https://technet.microsoft.com/en-us/library/jj219738.aspx

    • Search service applications: 20 per farm
    • Crawl databases: 5 per search service application
    • Crawl components: 2 per search service application
    • Index components: 60 per search service application
    • Index partitions: 20 per search service application
    • Index replicas: 3 per index partition
    • Indexed items: 100 million per search service application; 10 million per index partition
    • Crawl log entries: 100 million per search service application
    • Property databases: 10 per search service application; 128 total
    • Link databases: 2 per search service application
    • Query processing components: 1 per server computer
    • Content processing components: 1 per server computer
    • Scope rules: 100 per scope; 600 total per search service application
    • Scopes: 200 site scopes and 200 shared scopes per search service application
    • Display groups: 25 per site
    • Alerts: 100,000 per search service application
    • Content sources: 50 per search service application
    • Start addresses: 100 per content source
    • Concurrent crawls: 20 per search service application
    • Crawled properties: 500,000 per search service application
    • Crawl impact rules: no limit
    • Crawl rules: no limit
    • Managed properties: 50,000 per search service application
    • Values per managed property: 100
    • Indexed managed property size: 512 KB per searchable/queryable managed property
    • Managed property mappings: 100 per managed property
    • Retrievable managed property size: 16 KB per managed property
    • Sortable and refinable managed property size: 16 KB per managed property
    • URL removals: 100 removals per operation
    • Authoritative pages: 1 top-level page and minimal second- and third-level pages per search service application
    • Keywords: 200 per site collection
    • Metadata properties recognized: 10,000 per item crawled
    • Analytics processing components: 6 per search service application
    • Analytics reporting databases: 4 per search service application
    • Maximum eDiscovery KeywordQuery text length: 16 KB
    • Maximum KeywordQuery text length: 4 KB
    • Maximum length of eDiscovery KeywordQuery text at Search service application level: 20 KB
    • Maximum length of KeywordQuery text at Search service application level: 20 KB
    • Maximum size of documents pulled down by crawler: 64 MB (3 MB for Excel documents)
    • Navigable results from search: 100,000 per query request per search service application
    • Number of entries in a custom entity extraction dictionary: 1 million
    • Number of entries in a custom search dictionary: 5,000 terms per tenant
    • Number of entries in a thesaurus: 1 million
    • Ranking models: 1,000 per tenant
    • Results removal: no limit
    • Term size: 300 characters
    • Unique terms in the index: 2^31 (more than 2 billion terms)
    • Unique contexts used for ranking: 15 per rank model
    • User-defined full-text indexes: 10

    Hope this helps :)

    • Proposed as answer by Sara Fan Monday, July 6, 2015 12:52 AM
    • Marked as answer by Sara Fan Tuesday, July 7, 2015 12:40 AM
    Thursday, June 25, 2015 6:44 AM
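    As a quick sanity check against the limits above, you can work out how many index partitions a planned item count would require (10 million indexed items per partition, 20 partitions and 100 million items per Search service application). A minimal sketch, where the item counts passed in are hypothetical planning inputs:

    ```python
    # Back-of-the-envelope topology check against documented
    # SharePoint 2013 search boundaries (TechNet jj219738).
    import math

    MAX_ITEMS_PER_PARTITION = 10_000_000   # indexed items per index partition
    MAX_PARTITIONS_PER_SSA = 20            # index partitions per Search service application
    MAX_ITEMS_PER_SSA = 100_000_000        # indexed items per Search service application

    def index_partitions_needed(planned_items: int) -> int:
        """Return the minimum number of index partitions for the planned item count."""
        if planned_items > MAX_ITEMS_PER_SSA:
            raise ValueError("Exceeds 100 million items per Search service application")
        partitions = math.ceil(planned_items / MAX_ITEMS_PER_PARTITION)
        if partitions > MAX_PARTITIONS_PER_SSA:
            raise ValueError("Exceeds 20 index partitions per Search service application")
        return partitions

    print(index_partitions_needed(13_000))      # current 25 GB file share crawl -> 1
    print(index_partitions_needed(25_000_000))  # hypothetical future load -> 3
    ```

    At 13,000 items the file share in the question is nowhere near a partition boundary; partitions only become a planning concern in the tens of millions of items.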
  • Yes, your crawl database can be expected to occupy around 0.046 × (size of crawled data). So in your case you should be looking at about 1-1.5 GB.

    • Proposed as answer by Sara Fan Monday, July 6, 2015 12:54 AM
    • Marked as answer by Sara Fan Tuesday, July 7, 2015 12:40 AM
    Thursday, June 25, 2015 8:32 AM
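    The rule of thumb above is simple arithmetic; a minimal sketch, using the 25 GB file share from the question:

    ```python
    # Crawl database estimate from the ~0.046 x (size of crawled data)
    # rule of thumb given in the answer above.
    def crawl_db_estimate_gb(content_gb: float, factor: float = 0.046) -> float:
        """Estimate crawl database size in GB from the crawled content size."""
        return content_gb * factor

    print(round(crawl_db_estimate_gb(25), 2))  # 1.15, i.e. roughly 1-1.5 GB
    ```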

All replies

  • I am working with terabytes of data in SharePoint that need to be crawled. Make sure your search architecture has multiple crawl and query servers for high availability (HA).

    Just 25-100 GB of data will not produce a very large index, so don't worry about that.


    If this helped you resolve your issue, please mark it Answered. You can reach me through http://itfreesupport.com/

    Thursday, June 25, 2015 6:52 AM
  • Thanks for the responses. Is there a documented percentage anywhere for how much search data is stored in the search databases relative to the actual content size?

    Thanks, Ram Ch

    Thursday, June 25, 2015 7:42 AM
  • It depends on the type of data. Image files are not crawled at all, while document pages are.

    Data in lists takes the most space, so assume roughly 1 to 5% of the content size, depending on the type of data.



    Friday, June 26, 2015 7:04 AM
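  • The 1-5% rule of thumb above can be turned into a simple range estimate; a minimal sketch, with the 25 GB file share from the question as input:

    ```python
    # Index size band from the 1-5% rule of thumb: the low end suits
    # mostly binary files, the high end mostly list/document content.
    def index_size_range_gb(content_gb: float,
                            low_pct: float = 1.0,
                            high_pct: float = 5.0) -> tuple:
        """Return (low, high) index size estimates in GB for the given content size."""
        return (content_gb * low_pct / 100, content_gb * high_pct / 100)

    print(index_size_range_gb(25))  # (0.25, 1.25) GB for a 25 GB file share
    ```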