SMBClient Errors: 30611 followed by multiple 30906

    Question

  • Hello,

    I have researched online but could not find anything relevant. A Hyper-V managed Windows Server 2016 instance is crashing and throwing multiple events of:

    "The IO operation at logical block address 0x###### for Disk 0 "

    On the Hyper-V host, after the VM has crashed, there has been a pattern of an SMBClient 30611 error:

    "

    Failed to reconnect a persistent handle.

    Error: The account is not authorized to login from this station.

    FileId: 0x200000E0265CBEE:0x20E000000A9
    CreateGUID: {b3d6066e-563c-11e8-a949-0002c937dda1}
    Path: \networked\path\to\instance.vhdx


    Reason: 201

    Previous reconnect error: STATUS_SUCCESS
    Previous reconnect reason: The reason is not specified

    Guidance:
    A persistent handle allows transparent failover on Windows File Server clusters. This event has many causes and does not always indicate an issue with SMB. Review online documentation for troubleshooting information.

    "

    Followed by several 30906 errors:

    "

    A request on persistent/resilient handle failed because the handle was invalid or it exceeded the timeout.

    Status: The transport connection is now disconnected.

    Type: Write (and Read)
    Path: \networked\path\to\instance.vhdx
    Restart count: 0

    Guidance:
    After retrying a request on a Continuously Available (Persistent) handle or a Resilient handle, the client was unable to reconnect the handle. This event is the result of a handle recovery failure. Review other events for more details.

    "

    Then the server crashed. If someone has any ideas, or could point me in a direction to recover more logs, that would be super.
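
    For reference, a minimal sketch of pulling the related events from the host (the channel name Microsoft-Windows-SmbClient/Connectivity and the event count are assumptions; adjust to whichever SMBClient channel the 30611/30906 events land in for you):

    # Most recent SMB client reconnect / handle-recovery events on the Hyper-V host
    Get-WinEvent -FilterHashtable @{
        LogName = 'Microsoft-Windows-SmbClient/Connectivity'   # assumed channel for 30611/30906
        Id      = 30611, 30906
    } -MaxEvents 50 |
        Select-Object TimeCreated, Id, Message |
        Format-List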

    Thanks!

    Sunday, May 13, 2018 19:27

All replies

  • Hi,

    Based on my knowledge, error 30611 usually means the SMB client failed to resume a continuously available (CA) handle on a CA file share. The message "Error: The account is not authorized to login from this station" may appear when the client connects to a different storage location.

    May I ask whether there is any external storage connected to your Hyper-V host?

    And where is your VM located: on a local drive on the host, or on external storage connected to the Hyper-V host?

    Event 153 “The IO operation at logical block address 0x###### for Disk 0” is an error associated with the storage subsystem. Please run chkdsk on the volume to see whether it finds and corrects any errors; a sketch follows after the links below.

    https://blogs.msdn.microsoft.com/ntdebugging/2013/04/30/interpreting-event-153-errors/

    https://social.technet.microsoft.com/Forums/en-US/e11ba3c3-a383-4c99-9702-0c7de692a19f/the-io-operation-at-logical-block-address-for-disk-was-retried?forum=winserverfiles
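
    A minimal sketch of those disk checks from inside the guest, assuming the affected volume is C: (Repair-Volume is the Storage module counterpart of chkdsk; adjust the drive letter to your environment):

    # Read-only scan first; only spot-fix if the scan reports problems
    Repair-Volume -DriveLetter C -Scan      # roughly equivalent to chkdsk /scan
    Repair-Volume -DriveLetter C -SpotFix   # briefly takes the volume offline to fix what the scan flagged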

    In addition, please try to move all the VM-related data to another drive and check the results.

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Monday, May 14, 2018 06:28
    Moderator
  • Hello,

    The VHDX is located on our S2D cluster for our Azure Pack environment. The cluster is used by over 50 guests, and our logs show that only this VM is logging the IO retry events. So I am leaning toward something being wrong with this Domain Controller VM.

    Is there something S2D-specific I could check that would be related?

    Thanks!
    Tuesday, May 15, 2018 16:08
  • Hi,

    You may run Get-StorageHealthReport to check the health of the storage subsystem.

    https://docs.microsoft.com/en-us/powershell/module/storage/get-storagehealthreport?view=win10-ps

    Also run Get-VirtualDisk | FL, Get-PhysicalDisk | FL, and Get-StoragePool to check whether the S2D storage is in a healthy state (a sketch follows below).
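
    A minimal sketch of those checks, run on one of the S2D cluster nodes (the property selections are only suggestions):

    # Health report for the clustered storage subsystem
    Get-StorageSubSystem -FriendlyName 'Cluster*' | Get-StorageHealthReport

    # Overall health of the pool, virtual disks, and physical disks
    Get-StoragePool -IsPrimordial $false | Format-List FriendlyName, HealthStatus, OperationalStatus
    Get-VirtualDisk | Format-List FriendlyName, HealthStatus, OperationalStatus
    Get-PhysicalDisk | Format-Table FriendlyName, HealthStatus, OperationalStatus, Usage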

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Wednesday, May 16, 2018 01:43
    Moderator
  • Hello,

    Failover Cluster Manager shows that all the disks in the pool are healthy.

    Thursday, May 17, 2018 13:19
  • Hi,
    > our logs show that only this VM is logging the IO retry events. So I am leaning toward something being wrong with this Domain Controller VM.

    May I ask what the current status of this VM is?

    If you restart the VM, does it run in a healthy state?

    As noted above, event 30611 can occur if the volume hosting the file share does not have the Resume Key filter attached, among other reasons.

    Applications with continuously available file handles opened on file shares on the affected volumes will have those handles closed. The application administrator will need to take recovery action to reestablish them. This could include restarting virtual machines (see the sketch below), reattaching databases, or restarting the application.
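
    A minimal sketch of the VM restart from the Hyper-V host, assuming the affected guest is named 'DC01' (a placeholder; substitute the name of your domain controller VM):

    # Restart the affected guest so its handles on the share are reopened
    Stop-VM -Name 'DC01' -Force     # use -TurnOff instead if the guest no longer responds
    Start-VM -Name 'DC01'
    Get-VM -Name 'DC01' | Select-Object Name, State, Status, Uptime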

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Friday, May 18, 2018 01:24
    Moderator
  • The current status is that it is powered off, but after the crash it was not booted into safe mode.

    Post crash/boot everything was "working", but the VM spat out additional IO retries with no consistency: 5 days apart, then 2 days apart, at different times of the day.

    All the applications and services were 'restarted' when the VM powered on, yes? What do you mean by reattaching databases? The VM uses a networked SQL Server.



    Monday, May 21, 2018 13:35
  • We think it has something to do with the VHDX and not with our S2D cluster, as other VMs would otherwise be having this issue.

    Do you have any ideas I could look into?
    Tuesday, May 29, 2018 21:53
  • We are seeing 30906 events on our Hyper-V servers as well; however, our storage is located on SOFS clusters. The VMs hang and cannot fail over to another host because the configuration .xml file for the VMs is locked. Only turning off the host resolves the issue, since that frees the locked .xml (a sketch of checking the open handle on the SOFS side follows below).
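
    A minimal sketch of inspecting that lock from the SOFS side (the path filter is illustrative and should be adjusted to wherever the VM configuration files live; forcibly closing a handle can cause data loss, so treat the last line with care):

    # On the SOFS node that owns the share: who is still holding the VM configuration file open?
    Get-SmbOpenFile |
        Where-Object { $_.Path -like '*\Virtual Machines\*.xml' } |
        Select-Object ClientComputerName, ClientUserName, Path, SessionId

    # If you decide to release a stuck handle manually:
    # Get-SmbOpenFile | Where-Object { $_.Path -like '*\Virtual Machines\*.xml' } | Close-SmbOpenFile -Force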

    Have you been able to solve it or find a workaround for it? What steps did you take?


    Jan

    Friday, July 20, 2018 12:42
  • We are seeing VMs on our Hyper-V hosts crash with SMB error 30611 followed by 30906 as well. Our storage is on SOFS.

    The VMs hang and cannot be migrated off. The only solution is to restart the Hyper-V host so that the hanging VMs can be started on other nodes.

    Any solution would be greatly appreciated.

    Thanks

    Qui 

     
    Monday, August 27, 2018 03:51
  • Hello Qui,

    We solved the issue by turning SMB Multichannel off on the Hyper-V hosts:

    Set-SmbClientConfiguration -EnableMultiChannel $false

    After changing this setting we did not experience the problem anymore.

    Microsoft will release a hotfix for the issue in September (it needs to be installed on the SOFS).
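
    In case it is useful, a quick way to confirm the change on each host, and to revert it later (Get-SmbClientConfiguration is the matching query cmdlet; the commented line assumes you will want Multichannel back once the hotfix is in place):

    Get-SmbClientConfiguration | Select-Object EnableMultiChannel   # should report False after the change

    # To revert once the SOFS-side hotfix is installed:
    # Set-SmbClientConfiguration -EnableMultiChannel $true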

    Hope this helps!


    Jan

    Monday, August 27, 2018 09:11
  • Hi Jan,

    Thanks for the tip, I will try it out.

    Do you know why Multichannel is causing the problem?

    Thanks again

    Qui

    Tuesday, August 28, 2018 04:51
  • Would you happen to have the MS link describing this issue?
    Wednesday, August 29, 2018 20:24
  • There's a bug in the SMB header ChannelSequence field. A public fix will be released on August 18.

    I cannot elaborate on this but will keep you posted when the hotfix is publicly available.


    Jan


    Thursday, August 30, 2018 07:00
  • Do you mean September 18?
    Thursday, August 30, 2018 19:28
  • Yes, sorry, September 18 it is.

    Jan

    Friday, August 31, 2018 05:47
  • So now can you say what it is?
    Thursday, September 20, 2018 16:09
  • I am interested in this SMB bug as well. After googling around I was unable to find anything specific regarding a bug or an upcoming hotfix.
    Sunday, September 23, 2018 16:36