NIC causes VMs to get in a stuck state?

  • Question

  • I've noticed an issue with some of my virtual machines running on a particular host after that host was fully patched with Windows updates and the HPE drivers/firmware from the 8.1 HPE service pack.

    The VMs get stuck in a stopping state for some reason. The host suddenly tries to move all the VMs off it, but the majority of them get completely stuck in a stopping state, which results in my having to hard reboot the server.

    At first I thought this may have been a Windows update issue with a bad patch, but after looking closer I think this is an issue with the HPE 554FLB adapter we are using:

    2/12/2018:

    The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
    Reason: The network driver did not respond to an OID request in a timely fashion.
    This network interface has reset 1 time(s) since it was last initialized.

    2/1/2018:

    The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
    Reason: The network driver did not respond to an OID request in a timely fashion.
    This network interface has reset 4 time(s) since it was last initialized.
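    In case it helps anyone tally these resets from exported event text, here is a minimal Python sketch. It assumes the warnings were copied out of Event Viewer as plain text with the same wording as the two messages above; the names `RESET_RE`, `parse_resets`, and `SAMPLE` are my own and not part of any HPE or Microsoft tool, and other drivers may word the message differently.

```python
import re

# Pattern for the NDIS reset warning text quoted in this post.
# Assumption: the wording matches the two messages above exactly.
RESET_RE = re.compile(
    r'The network interface "(?P<nic>[^"]+)" has begun resetting\..*?'
    r'Reason: (?P<reason>[^\n]+)\n.*?'
    r'has reset (?P<count>\d+) time\(s\)',
    re.DOTALL,
)


def parse_resets(text):
    """Return a list of (nic, reason, count) tuples found in `text`."""
    return [
        (m.group("nic"), m.group("reason").strip(), int(m.group("count")))
        for m in RESET_RE.finditer(text)
    ]


# Sample input: the 2/1/2018 warning from this post, pasted as plain text.
SAMPLE = '''The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
Reason: The network driver did not respond to an OID request in a timely fashion.
This network interface has reset 4 time(s) since it was last initialized.
'''

if __name__ == "__main__":
    for nic, reason, count in parse_resets(SAMPLE):
        print(f"{nic}: {count} reset(s); {reason}")
```

    Running this over the text exported from each node would show whether the resets are confined to the patched host, as I suspect.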

    According to the event viewer, Hyper-V was running fine with no issues until this warning occurred. The warning was then followed by an error for each VM on the host:

    A component on the server did not respond in a timely fashion. This caused the cluster resource 'SCVMM ********************' (resource type 'Virtual Machine', DLL 'vmclusres.dll') to exceed its time-out threshold. As part of cluster health detection, recovery actions will be taken. The cluster will try to automatically recover by terminating and restarting the Resource Hosting Subsystem (RHS) process that is running this resource. Verify that the underlying infrastructure (such as storage, networking, or services) that are associated with the resource are functioning correctly.

     

    The firmware for the 554FLB is 11.1.183.23, and the driver on Server 2016 Datacenter is 11.1.196.4.

    Note: This is the only node in the cluster that is up to date with HPE drivers/firmware and Windows patches, because we wanted to test it first. It is also the only node having this issue.


    Tuesday, February 13, 2018 3:18 PM

All replies

  • Hi ,

    Thanks for your posting here.

    Since we have no HPE drivers/firmware to test with in our lab, I would suggest you post on the HP forum for better answers. If you gather further information in the meantime, please post it here, which will help us understand and analyze this issue more thoroughly.

    Other members of this forum are also welcome to share their experience with this scenario.

    Best Regards,

    Candy



    Wednesday, February 14, 2018 6:30 AM