SMBClient Errors: 30611 followed by multiple 30906

    Question

  • Hello,

    I have researched online but could not find anything relevant. A Hyper-V-managed Windows Server 2016 instance is crashing and logging multiple events of:

    "The IO operation at logical block address 0x###### for Disk 0 "

    On the Hyper-V host, after the VM has crashed, there has been a consistent pattern: an SMBClient 30611 error of:

    "

    Failed to reconnect a persistent handle.

    Error: The account is not authorized to login from this station.

    FileId: 0x200000E0265CBEE:0x20E000000A9
    CreateGUID: {b3d6066e-563c-11e8-a949-0002c937dda1}
    Path: \networked\path\to\instance.vhdx


    Reason: 201

    Previous reconnect error: STATUS_SUCCESS
    Previous reconnect reason: The reason is not specified

    Guidance:
    A persistent handle allows transparent failover on Windows File Server clusters. This event has many causes and does not always indicate an issue with SMB. Review online documentation for troubleshooting information.

    "

    Followed by several 30906 errors:

    "

    A request on persistent/resilient handle failed because the handle was invalid or it exceeded the timeout.

    Status: The transport connection is now disconnected.

    Type: Write (and Read)
    Path: \networked\path\to\instance.vhdx
    Restart count: 0

    Guidance:
    After retrying a request on a Continuously Available (Persistent) handle or a Resilient handle, the client was unable to reconnect the handle. This event is the result of a handle recovery failure. Review other events for more details.

    "

    Then the server crashed. If anyone has ideas, or could point me in a direction to recover more logs, that would be super.

    Thanks!

    Sunday, May 13, 2018 7:27 PM

All replies

  • Hi,

    Based on my knowledge, error 30611 usually means the SMB client failed to resume a continuously available (CA) handle on a CA file share resource. And "Error: The account is not authorized to login from this station" may happen when the client connects to another storage location.

    May I ask whether there is any external storage connected to your Hyper-V host?

    And where is your VM located: on a local drive on the host, or on external storage connected to the Hyper-V host?

    Event 153, "The IO operation at logical block address 0x###### for Disk 0 was retried", is an error associated with the storage subsystem. Please run chkdsk on the volume. Have you run it to see if it finds and corrects any errors?

    https://blogs.msdn.microsoft.com/ntdebugging/2013/04/30/interpreting-event-153-errors/

    https://social.technet.microsoft.com/Forums/en-US/e11ba3c3-a383-4c99-9702-0c7de692a19f/the-io-operation-at-logical-block-address-for-disk-was-retried?forum=winserverfiles

    In addition, please try to move all the VM related data to another drive and check the results.
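    A minimal sketch of the disk checks suggested above, run from an elevated prompt inside the guest (the drive letter C: is an assumption; substitute the volume behind the failing logical block address):

    ```powershell
    # Read-only online scan: reports NTFS corruption without taking the volume offline
    chkdsk C: /scan

    # If corruption is found, schedule a repairing pass (/f fixes errors, /r also
    # locates bad sectors; may require a reboot if the volume is in use)
    chkdsk C: /f /r

    # PowerShell equivalents from the in-box Storage module
    Repair-Volume -DriveLetter C -Scan
    Repair-Volume -DriveLetter C -OfflineScanAndFix
    ```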

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Monday, May 14, 2018 6:28 AM
    Moderator
  • Hello,

    The vhdx is located on our S2D cluster for our Azure Pack environment. The cluster is used by over 50 guests, and our logs show that it is only this VM that is logging IO retry events. So I am leaning toward something being wrong with this domain controller VM.

    Is there something s2d specific I could check that would be related?

    Thanks!
    Tuesday, May 15, 2018 4:08 PM
  • Hi,

    You may run Get-StorageHealthReport to check the health of the storage subsystem.

    https://docs.microsoft.com/en-us/powershell/module/storage/get-storagehealthreport?view=win10-ps

    And also run Get-VirtualDisk | fl, Get-PhysicalDisk | fl, and Get-StoragePool to check whether the S2D pool is in a healthy state.
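    As a rough one-pass version of the checks above, run on any node of the S2D cluster (cmdlet names are from the in-box Storage module; the Cluster* filter is an assumption for a clustered storage subsystem):

    ```powershell
    # Health report and outstanding faults for the clustered storage subsystem
    Get-StorageSubSystem Cluster* | Get-StorageHealthReport
    Get-StorageSubSystem Cluster* | Debug-StorageSubSystem

    # Per-object health: anything other than Healthy/OK is worth chasing
    Get-StoragePool -IsPrimordial $false |
        Select-Object FriendlyName, HealthStatus, OperationalStatus
    Get-VirtualDisk  | Select-Object FriendlyName, HealthStatus, OperationalStatus
    Get-PhysicalDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus, Usage
    ```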

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Wednesday, May 16, 2018 1:43 AM
    Moderator
  • Hello,

    Failover cluster manager shows that all the disks in the pool are healthy.

    Thursday, May 17, 2018 1:19 PM
  • Hi,
    > our logs show that it is only this VM that is logging IO retry events. So I am leaning toward something being wrong with this domain controller VM.

    May I ask what the current status of this VM is?

    If you restart the VM, does it run in a healthy state?

    As mentioned above, event 30611 can occur if the volume for the file share does not have the Resume Key filter attached, or for similar reasons.

    Applications with continuously available file handles opened on file shares on the affected volumes will have those handles closed. The application administrator will need to take recovery action to reestablish them. This could include restarting virtual machines, reattaching databases, or restarting the application.

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Friday, May 18, 2018 1:24 AM
    Moderator
    The current status is that it is powered off, but post-crash it was not booted into safe mode.

    Post crash/boot everything was "working", but the VM spat out additional IO retries with no consistency: 5 days apart, then 2 days apart, at different times of the day.

    All the applications and services were restarted when the VM powered on, yes? What do you mean by reattaching databases? The VM uses a networked SQL Server.



    Monday, May 21, 2018 1:35 PM
    We think it has something to do with the vhdx and not with our S2D cluster, since otherwise other VMs would be having this issue as well.

    Have any ideas that I could look into?
    Tuesday, May 29, 2018 9:53 PM
    We are seeing 30906 events on our Hyper-V servers as well; however, our storage is located on SOFS clusters. VMs hang and cannot fail over to another host because the configuration .xml file for the VMs is locked. Only turning off the host solves the issue, since that frees the locked .xml.

    Have you been able to solve it or find a workaround for it? What steps did you take ?


    Jan

    Friday, July 20, 2018 12:42 PM
    We are seeing VMs on our Hyper-V hosts crash with SMB error 30611 followed by 30906 as well. Our storage is on SOFS.

    VMs hang and cannot be migrated off. The only solution is to restart the Hyper-V host so the hanging VMs can be started on other nodes.

    Any ideas would be greatly appreciated.

    Thanks

    Qui 

    Monday, August 27, 2018 3:51 AM
  • Hello Qui,

    We solved the issue by turning SMB Multichannel off on the Hyper-V hosts:

    Set-SmbClientConfiguration -EnableMultiChannel $false

    After changing this setting we did not experience the problem anymore.

    Microsoft will release a hotfix for the issue in September (it needs to be installed on the SOFS).
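    For anyone applying this workaround, a sketch of checking and toggling the setting per Hyper-V host (disabling multichannel limits SMB to one connection per session, so expect a throughput hit; revert once the hotfix is on the SOFS nodes):

    ```powershell
    # Check whether SMB Multichannel is currently enabled on this client
    Get-SmbClientConfiguration | Select-Object EnableMultiChannel

    # Workaround: disable it (applies to new SMB sessions; no reboot needed)
    Set-SmbClientConfiguration -EnableMultiChannel $false

    # Revert after the hotfix is installed on the SOFS side
    Set-SmbClientConfiguration -EnableMultiChannel $true
    ```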

    Hope this helps!


    Jan

    Monday, August 27, 2018 9:11 AM
  • Hi Jan,

    thanks for the tip, will try it out.

    do you know why multichannel is causing the problem?

    Thanks again

    Qui

    Tuesday, August 28, 2018 4:51 AM
  • Would you happen to have the MS link describing this issue?
    Wednesday, August 29, 2018 8:24 PM
    There's a bug in the SMB header ChannelSequence handling. A public fix will be released on August 18.

    I cannot elaborate on this but will keep you posted when the hotfix is publicly available.


    Jan


    Thursday, August 30, 2018 7:00 AM
  • Do you mean September 18 ?
    Thursday, August 30, 2018 7:28 PM
  • Yes, sorry, September 18 it is.

    Jan

    Friday, August 31, 2018 5:47 AM
  • So now can you say what it is?
    Thursday, September 20, 2018 4:09 PM
  • I am interested in this SMB bug as well. After googling around I was unable to find anything specific regarding a bug or an upcoming hotfix.
    Sunday, September 23, 2018 4:36 PM