I've noticed an issue with some of my virtual machines running on a particular host after that host was fully patched with Windows updates and the HPE drivers/firmware from the 8.1 HPE service pack.
The VMs get stuck in a stopping state. The host suddenly tries to move all of its VMs off, but the majority of them get completely
stuck in a stopping state, which forces me to hard reboot the server.
At first I thought this may have been a Windows update issue with a bad patch, but after looking closer I think this is an issue with the HPE 554FLB adapter we are using:
2/12/2018:
The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
Reason: The network driver did not respond to an OID request in a timely fashion.
This network interface has reset 1 time(s) since it was last initialized.
2/1/2018:
The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
Reason: The network driver did not respond to an OID request in a timely fashion.
This network interface has reset 4 time(s) since it was last initialized.
According to the Event Viewer, Hyper-V was running fine until this warning occurred. The warning was then followed by an error for each VM on the host:
A component on the server did not respond in a timely fashion. This caused the cluster resource 'SCVMM ********************' (resource type 'Virtual Machine', DLL 'vmclusres.dll') to exceed its time-out
threshold. As part of cluster health detection, recovery actions will be taken. The cluster will try to automatically recover by terminating and restarting the Resource Hosting Subsystem (RHS) process that is running this resource. Verify that the underlying
infrastructure (such as storage, networking, or services) that are associated with the resource are functioning correctly.
The firmware for the 554FLB is 11.1.183.23 and the driver on Server 2016 Datacenter is 11.1.196.4.
Note: This is the only node in the cluster that is up to date with HPE drivers/firmware and Windows patches, since we wanted to test the updates on it first. It
is also the only node having this issue.
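In case it helps anyone compare nodes, here is a minimal Python sketch that pulls the adapter name, reason, and reset count out of the NDIS warning text quoted above. It assumes the warnings have been exported as plain text in exactly the wording shown in this post. The names `parse_reset_events` and `SAMPLE` are made up for illustration, and this is not an HPE or Microsoft tool.

```python
import re

# Matches the NDIS reset warning exactly as it is worded in this post.
# Assumption: the exported text keeps the "Reason:" and "has reset N time(s)"
# lines directly after the "has begun resetting" line.
RESET_RE = re.compile(
    r'The network interface "(?P<adapter>[^"]+)" has begun resetting\.'
    r'.*?Reason: (?P<reason>[^\n]+?)\s*'
    r'This network interface has reset (?P<count>\d+) time\(s\)',
    re.DOTALL,
)


def parse_reset_events(text):
    """Return one dict per reset warning found in the exported text."""
    events = []
    for match in RESET_RE.finditer(text):
        events.append({
            "adapter": match.group("adapter"),
            "reason": match.group("reason").strip(),
            "count": int(match.group("count")),
        })
    return events


# The two warnings from this post, copied as plain text.
SAMPLE = '''2/12/2018:
The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
Reason: The network driver did not respond to an OID request in a timely fashion.
This network interface has reset 1 time(s) since it was last initialized.
2/1/2018:
The network interface "Emulex OneConnect 554FLB, Ethernet NDIS Miniport Driver" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets.
Reason: The network driver did not respond to an OID request in a timely fashion.
This network interface has reset 4 time(s) since it was last initialized.
'''

if __name__ == "__main__":
    for event in parse_reset_events(SAMPLE):
        print(event["adapter"], "|", event["reason"], "|", event["count"])
```

Running the same parser over text exported from the unpatched nodes would show whether they log any resets at all, which is the comparison that matters here.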