Failover of the query and index roles for FS4SP

  • Question

  • I have been evaluating FAST Search for SharePoint 2010 in the lab. Right now I am using a modified deployment.sample.multi1.xml sample file for a two-server search farm (I basically swapped the host names, and tried with and without the crawler element in node1). The notes in this sample file say that it supports failover, but do not say which parts fail over.

     

    Will this sample deployment file support failover of both the query and index roles?

     

    From my testing in the lab, I'm seeing that the query role will fail over, but the index role will not. Is this to be expected? (More detail on what I'm seeing, and some error messages from the event log, are below.)

     

    In my testing, the query/search role works if either node1 (admin role) or node2 is offline, but the SharePoint SSA for the FAST Connector crawls content only when node1 is online. Looking at the TechNet deployment.xml reference, I would expect the secondary node2 to pick up the content crawl. (Note that when both are online, it appears that both node1 and node2 are processing content.)

     

    When node1 is offline and I try to run a content crawl, I see errors like the following in the event log on node2:

     

    * Content Distributor master seems to have died. Trying to become master.

     

    * configuration: No response from the config server. Will wait 5 seconds for config server to respond before proceeding.

     

    * column::state: This indexer is not allowed to become master. Waiting for master.

     

    Once Node1 comes back online, the errors disappear and the crawl will complete.

     

    Is what I'm seeing to be expected with this sample deployment file?

     

     

    Wednesday, June 22, 2011 6:14 PM
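
For readers following along, the two-row layout described in the question can be sketched roughly as below. This is a minimal illustration, not the actual sample file: the host names, the namespace URI, and the exact set of elements are assumptions reconstructed from the general shape of a FS4SP deployment.xml, and `summarize` is a hypothetical helper. It only shows which host carries the admin role and which search row is marked primary or secondary for indexing.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-host layout; host names, namespace and element set are
# assumptions for illustration, not a copy of deployment.sample.multi1.xml.
DEPLOYMENT_XML = """
<deployment xmlns="http://www.microsoft.com/enterprisesearch">
  <host name="node1.example.com">
    <admin />
    <query />
    <content-distributor />
    <indexing-dispatcher />
    <document-processor processes="4" />
    <searchengine row="0" column="0" />
  </host>
  <host name="node2.example.com">
    <query />
    <content-distributor />
    <indexing-dispatcher />
    <document-processor processes="4" />
    <searchengine row="1" column="0" />
  </host>
  <searchcluster>
    <row id="0" index="primary" search="true" />
    <row id="1" index="secondary" search="true" />
  </searchcluster>
</deployment>
"""

NS = {"d": "http://www.microsoft.com/enterprisesearch"}


def summarize(xml_text):
    """Map each host name to its component roles and its row's index role."""
    root = ET.fromstring(xml_text.strip())
    # Row id -> "primary" / "secondary" / other value of the index attribute.
    rows = {r.get("id"): r.get("index")
            for r in root.findall("d:searchcluster/d:row", NS)}
    summary = {}
    for host in root.findall("d:host", NS):
        # Strip the "{namespace}" prefix from each child tag.
        roles = [child.tag.split("}")[1] for child in host]
        engine = host.find("d:searchengine", NS)
        index_role = rows.get(engine.get("row")) if engine is not None else None
        summary[host.get("name")] = {"roles": roles, "index": index_role}
    return summary


if __name__ == "__main__":
    for name, info in summarize(DEPLOYMENT_XML).items():
        print(name, info["index"], ",".join(info["roles"]))
```

In this sketch only node1 carries the admin role, and node2's search row is the secondary indexer row, which is the arrangement the rest of the thread discusses.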

Answers

  • Hi,

    I think this is the documentation you are looking for: Replace a failed backup indexer server

    To quote from the page:

    "If the farm deployment includes a backup indexer row, you must manually change the deployment configuration if a non-recoverable server error occurs on the backup indexer server."

    Hope this helps.

    Regards

     


    Thomas Svensen | Microsoft Enterprise Search Practice
    • Proposed as answer by Thomas Svensen (Moderator) Wednesday, June 29, 2011 1:28 PM
    • Marked as answer by nayr Sunday, August 14, 2011 2:01 AM
    Wednesday, June 29, 2011 1:28 PM
    Moderator

All replies

  • Hello nayr,

    I think this is actually expected, as the admin service is down. When you take out node1, you lose the nameservice, which is needed for communication between the nodes. You can try stopping just the content distributor, indexer and indexing dispatcher processes on node 1 to see if node 2 will take over crawling.

    The admin server is a single point of failure in FAST for SharePoint, and will affect the system. In FAST ESP it was possible to have redundancy on the nameservice as well, but I haven't tried this with FAST for SharePoint. 

    Regards,
    Mikael Svenson 


    Search Enthusiast - MCTS SharePoint/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    • Proposed as answer by Thomas Svensen (Moderator) Friday, June 24, 2011 8:17 AM
    • Unproposed as answer by nayr Monday, June 27, 2011 12:04 PM
    Thursday, June 23, 2011 7:58 PM
  •  

    Hi Mikael, thanks for the tips. It looks like the crawl will complete when the content distributor and indexing dispatcher processes are stopped on node 1 (the primary with the admin role). When the indexer process is stopped on node 1, the same thing happens as before: the indexer on node 2 never picks up, and I start to see the following event messages on node 2:

     

    * column::state: Column master is gone

    * column::state: This indexer is not allowed to become master. Waiting for master.

     

    -Ryan

    Friday, June 24, 2011 2:11 PM
  • Your problem is that you are bringing your admin node (config) down. If you have admin running but the index server (node 1) down, you should fail over to the other node (2).
    SharePoint MVP, Microsoft VTSP, http://www.arcovis.com
    Friday, June 24, 2011 8:24 PM
  • Hi Natalya, on the admin server (node 1), all of the processes are running including the Config Server.  The issue is when I manually stop the indexer process via the nctrl command on node 1 - the index role never picks up on node 2 (where the indexer process is running).
    • Edited by nayr Tuesday, June 28, 2011 10:49 AM Typo - wrong node
    Monday, June 27, 2011 12:04 PM
  • Hi Natalya, on the admin server (node 1), all of the processes are running including the Config Server.  The issue is when I manually stop the indexer process via the nctrl command on node 2 - the index role never picks up on node 2 (where the indexer process is running).

    Nayr,

    When you issue "nctrl stop indexer" on node 2, then of course node 2 will not pick up :) Did you just write it wrong, that you stopped indexer on node 1, and node 2 would not pick up?

    -m 


    Search Enthusiast - MCTS SharePoint/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    Monday, June 27, 2011 6:58 PM
  • Sorry, I mistyped: when the indexer fails on node 1, the indexer on node 2 never picks up.

    It sounds like what I'm seeing is not correct and the indexer should be picking up. I don't see any errors in the SharePoint logs, so I'll open a ticket with Support and report back.

    Tuesday, June 28, 2011 10:51 AM
  • Hi Thomas, I reviewed TechNet gg482021 (Replace a failed primary indexer server), and it appears my expectation that the index role fails over automatically when a row contains a backup/secondary index is incorrect. 

     

    Just to confirm, is the following statement correct: FS4SP does not support automatic failover of the indexer role when a backup indexer is defined on a separate row. The failover of the index role must be performed manually using the instructions in TechNet article gg482021.

    Wednesday, June 29, 2011 7:37 PM
  • This is confirmed, and the documentation was updated in June with the detailed steps. Hopefully we will get automatic switching in the future. I had somewhat assumed the same as you did :)

    Regards,
    Mikael Svenson 


    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    Saturday, July 16, 2011 5:39 PM
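
The behaviour reported in this thread can be condensed into a small toy model. This is only a sketch of the observations above, not product documentation: `crawl_can_proceed` and its parameters are hypothetical names, and the rules encode just two findings from the thread (the admin node is a single point of failure, and the backup indexer does not promote itself to master).

```python
def crawl_can_proceed(admin_up, primary_indexer_up):
    """Toy model of the crawl behaviour reported in this thread."""
    # The admin node hosts the nameservice and config server. With it down,
    # node 2 logged "No response from the config server".
    if not admin_up:
        return False
    # With the primary indexer stopped, node 2 logged "This indexer is not
    # allowed to become master", so indexing waits for a manual change.
    return primary_indexer_up


if __name__ == "__main__":
    scenarios = [
        ("both nodes healthy", True, True),
        ("node 1 (admin) offline", False, False),
        ("only the indexer stopped on node 1", True, False),
    ]
    for label, admin_up, indexer_up in scenarios:
        state = "proceeds" if crawl_can_proceed(admin_up, indexer_up) else "stalls"
        print(f"{label}: crawl {state}")
```

Stopping only the content distributor and indexing dispatcher on node 1 is not modelled here, since the crawl was reported to complete in that case.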