There are several reasons to monitor FAST Search Server 2010. This wiki page documents each of those reasons and provides guidance on what to monitor.

What are some use cases that you should monitor?
  • Are the crawls successfully gathering content?
  • Are we approaching the recommended document limit?
  • Is the search latency within an expected range?
  • Is there an excessive number of failed queries?
  • Are the document processors active?
  • Is the system indexing content?
  • Feeding appears to be slow. How do I diagnose the performance?
How should you monitor?
The recommended solution is to use System Center Operations Manager (SCOM) 2007 SP1 or R2. Find more details here.

A full list of the performance counters and WMI classes can be found on the TechNet site.

The recommended performance counters are listed below:
  • Disk counters
    • "\LogicalDisk(*)\Avg. Disk sec/Read"
    • "\LogicalDisk(*)\Avg. Disk sec/Write"
    • "\LogicalDisk(*)\Disk Read Bytes/sec"
    • "\LogicalDisk(*)\Disk Reads/sec"
    • "\LogicalDisk(*)\Disk Write Bytes/sec"
    • "\LogicalDisk(*)\Disk Writes/sec"
    • "\LogicalDisk(*)\Current Disk Queue Length"
    • "\LogicalDisk(*)\% Idle Time"
    • "\LogicalDisk(*)\% Free space"
    • "\LogicalDisk(*)\Avg. Disk Read Queue Length"
    • "\LogicalDisk(*)\Avg. Disk Write Queue Length"
  • Memory counters
    • "\Memory\Available MBytes"
    • "\Memory\Pages Input/sec"
    • "\Memory\Pages Output/sec"
    • "\Memory\Free System Page Table Entries"
    • "\Memory\Pages/sec"
    • "\Memory\Pool Paged Bytes"
    • "\Memory\Pool Nonpaged Bytes"
  • Network counters
    • "\Network Interface(*)\Bytes Received/sec"
    • "\Network Interface(*)\Bytes Sent/sec"
    • "\Network Interface(*)\Output Queue Length"
    • "\Network Interface(*)\Packets Received/sec"
    • "\Network Interface(*)\Packets Sent/sec"
  • CPU counter
    • "\Processor(*)\% Processor Time"
  • FAST Search counters
    • "\FAST Search Content Distributor(*)\Average dispatch time - ms"
    • "\FAST Search Content Distributor(*)\Average processing time - ms"
    • "\FAST Search Content Distributor(*)\Document processors" - This is the count of currently registered document processors. Start with an alert at 0, then raise the threshold toward the number you expect to be registered. For example, if you know there are 20 document processors, set the alert at 15.
    • "\FAST Search Content Distributor(*)\Document processors busy"
    • "\FAST Search Indexer(*)\API queue size"
    • "\FAST Search Indexer(*)\Documents in indexer" - This reports how many documents are in the index. Alert if this falls below the minimum number of documents expected in the index, or rises above the maximum number of documents for the index. Warn when you are nearing the maximum.
    • "\FAST Search Indexer(*)\Disk low state" - Tracks the percentage of available disk space. Alert if this is below 10 and warn if this is below 20.
    • "\FAST Search Indexer(*)\Fixml fill rate" - Measures the level of fragmentation of documents. 100 is ideal. It is recommended to warn at 30 and alert at 15.
    • "\FAST Search Indexer(*)\Time since last Index" - This value is reset to 0 every time a partition completes indexing. The right threshold varies between systems. If your system indexes continuously, set a warning at 5 minutes and an alert at 15 minutes.
    • "\FAST Search Indexer Partition(*)\Active documents"
    • "\OSS Search FAST Content Plugin(*)\Batches Open"
    • "\OSS Search FAST Content Plugin(*)\Batches Ready"
    • "\OSS Search FAST Content Plugin(*)\Batches Submitted"
    • "\OSS Search FAST Content Plugin(*)\Items Total"
    • "\FAST Search QRServer(*)\# Queries/sec" - Track and store this number for graphing purposes. It helps you understand query behavior and identify spikes.
    • "\FAST Search QRServer(*)\# of Fail user queries/sec" - Some failed queries are expected, so the alert should not be set to 0. Set the alert to 2, as this would indicate a likely issue with the system.
    • "\FAST Search QRServer(*)\# Requests/sec"
    • "\FAST Search QRServer(*)\Average queries per minute"
    • "\FAST Search QRServer(*)\Average latency last - ms" - This is the average time that search requests take to write back to the requesting client. Alert if the value goes over 500 ms. Track this value over time.
    • "\FAST Search QRServer(*)\Peak queries per sec"
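
The fixed warn and alert thresholds given in the counter notes above can be written down as a small rule table. The following Python sketch is only an illustration of that idea, not part of FAST Search Server or SCOM. The names Rule, RULES, and classify are hypothetical, and the sketch does not read counters from Windows; it only classifies a value you have already collected. Where the text says "warn at" or "set the alert to", the sketch assumes a value equal to the threshold counts as a breach. Where the text says "below" or "over", equality does not count.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical container for one counter's thresholds."""
    warn: Optional[float]   # None means the text gives no warning level
    alert: Optional[float]
    low_is_bad: bool        # True when smaller values are worse
    inclusive: bool         # True when a value equal to the threshold breaches (assumption)


# Thresholds copied from the counter notes; the table layout is illustrative.
RULES = {
    r"\FAST Search Indexer(*)\Disk low state": Rule(20, 10, True, False),
    r"\FAST Search Indexer(*)\Fixml fill rate": Rule(30, 15, True, True),
    r"\FAST Search QRServer(*)\# of Fail user queries/sec": Rule(None, 2, False, True),
    r"\FAST Search QRServer(*)\Average latency last - ms": Rule(None, 500, False, False),
}


def classify(counter: str, value: float) -> str:
    """Return "alert", "warn", or "ok" for a sampled counter value."""
    rule = RULES[counter]

    def breached(limit: Optional[float]) -> bool:
        if limit is None:
            return False
        if rule.low_is_bad:
            return value <= limit if rule.inclusive else value < limit
        return value >= limit if rule.inclusive else value > limit

    # Check the alert level first so the more severe state wins.
    if breached(rule.alert):
        return "alert"
    if breached(rule.warn):
        return "warn"
    return "ok"
```

For example, classify(r"\FAST Search Indexer(*)\Disk low state", 12) falls between the two thresholds and is reported as a warning. Counters whose thresholds depend on your deployment, such as "Document processors" and "Documents in indexer", are left out of the table because their limits must be chosen per system.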