File share witness is in failed state in Windows Server 2012 R2

  • Question

  • Hello,

    Server: Windows Server 2012 R2, running in Azure.

    What could cause the file share witness to be in a failed state?

    We found the following in the event logs:

    09-04-2019 7:25:35 PM Warning xxxxx 1562 Microsoft-Windows-FailoverClustering File Share Witness Resource NT AUTHORITY\SYSTEM File share witness resource 'File Share Witness' failed a periodic health check on file share '\\xxxx'. Please ensure that file share '\\xxxx' exists and is accessible by the cluster.  
    09-04-2019 7:25:35 PM Error xxxxxx 1069 Microsoft-Windows-FailoverClustering Resource Control Manager NT AUTHORITY\SYSTEM Cluster resource 'File Share Witness' of type 'File Share Witness' in clustered role 'Cluster Group' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.  
    09-04-2019 7:25:35 PM Critical xxxx  1564 Microsoft-Windows-FailoverClustering File Share Witness Resource NT AUTHORITY\SYSTEM File share witness resource 'File Share Witness' failed to arbitrate for the file share '\\cluster-fsw\agilitysql-fsw'. Please ensure that file share xxxxxxx' exists and is accessible by the cluster.  
    09-04-2019 7:25:35 PM Error xxxx 1069 Microsoft-Windows-FailoverClustering Resource Control Manager NT AUTHORITY\SYSTEM Cluster resource 'File Share Witness' of type 'File Share Witness' in clustered role 'Cluster Group' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.  
    09-04-2019 7:25:35 PM Error xxxx 1205 Microsoft-Windows-FailoverClustering Resource Control Manager NT AUTHORITY\SYSTEM The Cluster service failed to bring clustered role 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.  

    Cluster log:

    ::2019/09/04-19:25:29.136 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.136 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.242 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.242 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.244 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.244 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.357 INFO  [API] s_ApiGetQuorumResource final status 0.
    00000c2c.00002cd8::2019/09/04-19:25:29.357 INFO  [API] s_ApiGetQuorumResource final status 0.
    000010a0.000030ac::2019/09/04-19:25:33.541 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6f6e37ed-4875-4ea5-85de-e3e88b6377f1:Netbios
    00000fbc.00001b1c::2019/09/04-19:25:35.252 ERR   [RES] File Share Witness <File Share Witness>: Failed to create or open directory \\cluster-fsw\agilitysql-fsw\69d6aab1-c506-4b9a-9cb5-7b2e53180562, error 53.
    00000fbc.00001b1c::2019/09/04-19:25:35.252 ERR   [RES] File Share Witness <File Share Witness>: Failed to validate an access to the active share \\cluster-fsw\agilitysql-fsw\69d6aab1-c506-4b9a-9cb5-7b2e53180562 with 53.
    00000fbc.00001b1c::2019/09/04-19:25:35.257 WARN  [RHS] Resource File Share Witness IsAlive has indicated failure.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'File Share Witness', gen(0) result 1/0.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] Res File Share Witness: Online -> ProcessingFailure( StateUnknown )
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] TransitionToState(File Share Witness) Online-->ProcessingFailure.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (Cluster Group, Online --> Pending)
    00000c2c.00002f00::2019/09/04-19:25:35.258 ERR   [RCM] rcm::RcmResource::HandleFailure: (File Share Witness)
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [QUORUM] Node 2: PostRelease for 69d6aab1-c506-4b9a-9cb5-7b2e53180562
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [QUORUM] Node 2: quorum is not owned by anyone
    00000c2c.00000e24::2019/09/04-19:25:35.258 INFO  [RCM] ignored non-local state Pending for group Cluster Group
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] resource File Share Witness: failure count: 0, restartAction: 0 persistentState: 1.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] numDependents is zero, auto-returning true
    00000c2c.00002cd8::2019/09/04-19:25:35.258 INFO  [GUM] Node 2: Processing RequestLock 2:404
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] Will queue immediate restart (500 milliseconds) of File Share Witness after terminate is complete.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] Res File Share Witness: ProcessingFailure -> WaitingToTerminate( DelayRestartingResource )
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] TransitionToState(File Share Witness) ProcessingFailure-->[WaitingToTerminate to DelayRestartingResource].
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] Res File Share Witness: [WaitingToTerminate to DelayRestartingResource] -> Terminating( DelayRestartingResource )
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] TransitionToState(File Share Witness) [WaitingToTerminate to DelayRestartingResource]-->[Terminating to DelayRestartingResource].
    00000fbc.00001b1c::2019/09/04-19:25:35.258 INFO  [RES] File Share Witness <File Share Witness>: Terminating resource ...
    00000fbc.00001b1c::2019/09/04-19:25:35.258 INFO  [RES] File Share Witness <File Share Witness>: Resource is offline.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] HandleMonitorReply: TERMINATERESOURCE for 'File Share Witness', gen(1) result 0/0.
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] Res File Share Witness: [Terminating to DelayRestartingResource] -> DelayRestartingResource( StateUnknown )
    00000c2c.00002f00::2019/09/04-19:25:35.258 INFO  [RCM] TransitionToState(File Share Witness) [Terminating to DelayRestartingResource]-->DelayRestartingResource.
    00000c2c.00002f00::2019/09/04-19:25:35.258 WARN  [RCM] Queueing immediate delay restart of resource File Share Witness in 500 ms.
    00000c2c.0000209c::2019/09/04-19:25:35.258 INFO  [GUM] Node 2: Processing GrantLock to 2 (sent by 1 gumid: 6461)
    00000c2c.00002cd8::2019/09/04-19:25:35.259 INFO  [GUM] Node 2: executing request locally, gumId:6462, my action: qm/set-node-weight, # of updates: 1
    00000c2c.00002cd8::2019/09/04-19:25:35.259 WARN  [QUORUM] Node 2: weight adjustment not performed. Cannot go below weight count 3 in a hybrid configuration with 2+ nodes
    00000c2c.00002cd8::2019/09/04-19:25:35.259 INFO  [DM] An empty single transaction is cancelled 134:134:74205+1::0
    00000c2c.00002cd8::2019/09/04-19:25:35.768 INFO  [RCM] Delay-restarting File Share Witness and any waiting dependents.
    00000c2c.00002cd8::2019/09/04-19:25:35.768 INFO  [RCM] Res File Share Witness: DelayRestartingResource -> OnlineCallIssued( StateUnknown )
    00000c2c.00002cd8::2019/09/04-19:25:35.768 INFO  [RCM] TransitionToState(File Share Witness) DelayRestartingResource-->OnlineCallIssued.
    00000c2c.00002cd8::2019/09/04-19:25:35.768 INFO  rcm::RcmResource::OnlineWorker[RCM] Issuing Arbitrate(File Share Witness) to RHS.
    00000fbc.00001b1c::2019/09/04-19:25:35.768 INFO  [RHS] Enqueuing Arbitrate call.
    00000fbc.00001b1c::2019/09/04-19:25:35.768 INFO  [RHS] Waiting for Arbitrate call to be dequeued.
    00000fbc.00001ac8::2019/09/04-19:25:35.768 INFO  [RES] File Share Witness <File Share Witness>: Beginning arbitration ...
    00000fbc.00001ac8::2019/09/04-19:25:35.769 ERR   [RES] File Share Witness <File Share Witness>: Failed to create or open directory \\cluster-fsw\agilitysql-fsw\69d6aab1-c506-4b9a-9cb5-7b2e53180562, error 53.
    00000fbc.00001ac8::2019/09/04-19:25:35.769 ERR   [RES] File Share Witness <File Share Witness>: Failed to arbitrate for \\cluster-fsw\agilitysql-fsw\69d6aab1-c506-4b9a-9cb5-7b2e53180562 with 53.
    00000fbc.00001b1c::2019/09/04-19:25:35.779 INFO  [RHS] Waiting for Arbitrate call to be completed.
    00000c2c.00002cd8::2019/09/04-19:25:35.787 ERR   [RCM] Arbitrating resource 'File Share Witness' returned error 53
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] Res File Share Witness: OnlineCallIssued -> ProcessingFailure( StateUnknown )
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] TransitionToState(File Share Witness) OnlineCallIssued-->ProcessingFailure.
    00000c2c.00002cd8::2019/09/04-19:25:35.787 ERR   [RCM] rcm::RcmResource::HandleFailure: (File Share Witness)
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [QUORUM] Node 2: PostRelease for 69d6aab1-c506-4b9a-9cb5-7b2e53180562
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [QUORUM] Node 2: quorum is not owned by anyone
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] resource File Share Witness: failure count: 1, restartAction: 2 persistentState: 1.
    00000c2c.000025cc::2019/09/04-19:25:35.787 INFO  [GUM] Node 2: executing request locally, gumId:6463, my action: qm/set-node-weight, # of updates: 1
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] numDependents is zero, auto-returning true
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] Resource File Share Witness is causing group Cluster Group to failover.
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] rcm::RcmGroup::Failover: (Cluster Group)
    00000c2c.000025cc::2019/09/04-19:25:35.787 WARN  [QUORUM] Node 2: weight adjustment not performed. Cannot go below weight count 3 in a hybrid configuration with 2+ nodes
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] time since last failure is greater than failover period; resetting failoverCount to 0.
    00000c2c.00002cd8::2019/09/04-19:25:35.787 WARN  [RCM] Failing over group Cluster Group, failoverCount 1, last time 2019/07/26-07:01:08.916.
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] This node is director, making placement decision myself
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter NodeDownFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter NodeShuttingDownFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter CurrentNodeFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] removing banned candidate: NodeCandidate(2) banCode: CurrentNodeFilter (5016)
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter PausedNodeFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.000025cc::2019/09/04-19:25:35.787 INFO  [DM] An empty single transaction is cancelled 134:134:74205+1::0
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter PossibleOwnerFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM] Group Cluster Group: done going through resources, returning true
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter QueuingPerviouslyRejectedFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter PreferredOwnerWaitFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter AntiAffinityFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00002cd8::2019/09/04-19:25:35.787 INFO  [RCM-plcmt] applying filter StmFilter to group Cluster Group moveType=MoveType::Failover
    00000c2c.00001df8::2019/09/04-19:25:35.788 INFO  [RCM] applying timely STM connectivity snapshot for group Cluster Group
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  [RCM-plcmt] filters reduced candidate list for group Cluster Group to size 1, no need to run rankers
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  [RCM-plcmt] placement manager result:  grp=Cluster Group moveType=MoveType::Failover, node=1
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  MTimer(GetPlacementAsDirector): [Total: 0 ms ( 0 s )]
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  [RCM] Res File Share Witness: ProcessingFailure -> WaitingToTerminate( Failed )
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  [RCM] TransitionToState(File Share Witness) ProcessingFailure-->[WaitingToTerminate to Failed].
    00000c2c.00001df8::2019/09/04-19:25:35.788 INFO  [RCM] rcm::RcmGroup::FailoverWorker: (Cluster Group)
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  [RCM] Res File Share Witness: [WaitingToTerminate to Failed] -> Terminating( Failed )
    00000c2c.00002cd8::2019/09/04-19:25:35.788 INFO  [RCM] TransitionToState(File Share Witness) [WaitingToTerminate to Failed]-->[Terminating to Failed].
    00000fbc.00001b1c::2019/09/04-19:25:35.788 INFO  [RHS] not calling terminate() since File Share Witness has already been terminated.
    00000c2c.00002f00::2019/09/04-19:25:35.788 INFO  [RCM] HandleMonitorReply: TERMINATERESOURCE for 'File Share Witness', gen(2) result 0/0.
    00000c2c.00002f00::2019/09/04-19:25:35.788 INFO  [RCM] Res File Share Witness: [Terminating to Failed] -> Failed( StateUnknown )
    00000c2c.00002f00::2019/09/04-19:25:35.788 INFO  [RCM] TransitionToState(File Share Witness) [Terminating to Failed]-->Failed.
    00000c2c.00002f00::2019/09/04-19:25:35.788 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (Cluster Group, Pending --> Failed)
    00000c2c.00000e24::2019/09/04-19:25:35.788 INFO  [RCM] ignored non-local state Failed for group Cluster Group
    00000c2c.00001b24::2019/09/04-19:25:35.788 INFO  [RCM] moved 0 tasks from staging set to task set.  TaskSetSize=0
    00000c2c.00001b24::2019/09/04-19:25:35.788 INFO  [RCM] rcm::RcmPriorityManager::StartGroups: [RCM] done, executed 0 tasks
    00000c2c.00001df8::2019/09/04-19:25:35.839 INFO  [RCM] ordering failover of group Cluster Group to node 1
    00000c2c.00001df8::2019/09/04-19:25:35.839 INFO  [RCM] rcm::RcmGroup::Move: (Cluster Group, 1, MoveType::Failover)
    00000c2c.00001df8::2019/09/04-19:25:35.839 INFO  [RCM] rcm::RcmGroup::Move: Bringing group 'Cluster Group' offline first...
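
The ERR lines in a log like this are easy to lose among the INFO state transitions. The following is a minimal Python sketch, not an official tool, that pulls out the ERR lines carrying a numeric code and maps the code to its Win32 meaning. The function name `summarize_errors` and the small lookup table are illustrative assumptions. The three codes listed are standard Win32 values, and 53 (ERROR_BAD_NETPATH, "the network path was not found") is the one that appears throughout the excerpt above.

```python
import re

# Small lookup of standard Win32 error codes relevant to share access.
# Illustrative subset only; extend as needed.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED: access is denied",
    53: "ERROR_BAD_NETPATH: the network path was not found",
    67: "ERROR_BAD_NET_NAME: the network name cannot be found",
}

# Cluster log line layout: <pid>.<tid>::<date>-<time> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)

# Matches a trailing "error 53." or "with 53." in the message text.
CODE_RE = re.compile(r"(?:error|with)\s+(\d+)\.?\s*$")


def summarize_errors(log_text):
    """Return (timestamp, code, meaning, message) for each ERR line with a code."""
    rows = []
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line or line.group("level") != "ERR":
            continue
        code_match = CODE_RE.search(line.group("msg"))
        if not code_match:
            continue  # ERR line without a numeric code, e.g. HandleFailure
        code = int(code_match.group(1))
        meaning = WIN32_ERRORS.get(code, "unknown code")
        rows.append((line.group("ts"), code, meaning, line.group("msg")))
    return rows


# Three lines copied from the cluster log excerpt in this post.
SAMPLE = r"""
00000fbc.00001b1c::2019/09/04-19:25:35.252 ERR   [RES] File Share Witness <File Share Witness>: Failed to create or open directory \\cluster-fsw\agilitysql-fsw\69d6aab1-c506-4b9a-9cb5-7b2e53180562, error 53.
00000fbc.00001b1c::2019/09/04-19:25:35.257 WARN  [RHS] Resource File Share Witness IsAlive has indicated failure.
00000c2c.00002cd8::2019/09/04-19:25:35.787 ERR   [RCM] Arbitrating resource 'File Share Witness' returned error 53
"""

if __name__ == "__main__":
    for ts, code, meaning, _msg in summarize_errors(SAMPLE):
        print(ts, code, meaning)
```

Running the same function over the full cluster log text should show whether error 53 is the only code involved or whether access-denied failures (5) also appear, which would point toward permissions rather than network reachability.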


    swathi

    Tuesday, September 17, 2019 8:49 PM

All replies

  • Hi Swatsaswath,

    Are you running SQL Server on an Azure VM? This issue needs to go to Azure Support to run some diagnostics and see if there is an underlying infrastructure issue.

    Do you have an Azure Support Plan? If so, please open a support request. If not, please send your Azure Subscription ID to AzCommunity. We will return instructions to have a one-time support request created to investigate this specific issue.

    Regards,

    Mike

    Friday, September 20, 2019 1:53 AM
    Moderator
  • Thank you for the reply. I have asked the ticket owner to check Azure disk performance.

    swathi

    Friday, September 20, 2019 7:19 PM
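
Error 53 in the cluster log is the Win32 code ERROR_BAD_NETPATH ("the network path was not found"), so one check that may be worth running from each cluster node is whether the share host resolves and accepts connections on the SMB port (TCP 445). The following is a minimal Python sketch under those assumptions. The function name `probe_share_host` is hypothetical, and a successful probe only shows that the port is reachable, not that the cluster account has permission on the share.

```python
import socket


def probe_share_host(host, port=445, timeout=3.0):
    """Check name resolution and a TCP connection to the SMB port of a host."""
    result = {
        "host": host,
        "port": port,
        "addresses": [],
        "reachable": False,
        "error": None,
    }

    # Step 1: can the name be resolved at all?
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"name resolution failed: {exc}"
        return result
    result["addresses"] = sorted({info[4][0] for info in infos})

    # Step 2: does anything accept a TCP connection on the port?
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result
```

For the share in this thread the call would be `probe_share_host("cluster-fsw")`, using the host name that appears in the cluster log. A name resolution failure points toward DNS, while a connect failure points toward the witness server being down or a firewall or network security rule blocking port 445.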