We have a Windows Server 2012 Hyper-V host with two VMs: SBS 2011 and Windows 7.
Since this Tuesday the CPU usage has been a constant 100% on both VMs. Because backups were failing due to the high CPU usage, we rebooted the SBS 2011 server, but even after a reboot/shutdown SBS is at a constant 100% usage.
We disabled all non-MS services, Exchange, SQL and other memory-hungry services, but the CPU usage is still at 100%. The only time it goes down is when we stop all MS and non-MS services except the bare minimum that SBS needs to run properly.
Even then, the moment we open an MMC console or an application on the server, the CPU goes to 10%, stays there for a long time and then goes down again.
But with all SBS services running it never drops; it stays at 100% even after a couple of hours.
Hyper-V integration services are up to date.
Any advice is greatly appreciated.
- Edited by PCS-Support Friday, August 16, 2013 7:51 AM
When the CPU usage is at a high level, you can use Resource Monitor to find which process is causing it and check for related issues.
Using Resource Monitor to Troubleshoot Windows Performance Issues Part 1
Hope this helps.
Thanks for your reply.
The issue is that it's not just one process/service or one group of them. For example, initially it was the Symantec Mail Security related processes/services, so we disabled those services, but after a reboot a new set of processes/services was using the CPU cycles. If we disable those, then other processes or services take over. Sometimes even Task Manager uses 30-40% of CPU. Also, it's never a single process or service; it's always at least 3-4 services collectively using all available CPU cycles.
We managed to bring the CPU usage down to 50-60 percent by disabling all non-essentials (SQL instances, Windows Search, Windows Internal Database etc.) on the SBS server VM as well as on the Hyper-V server, but as mentioned earlier, whenever a new MMC console or application is opened, or even when the Start button is clicked on the SBS server, CPU usage goes to 100%.
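To show that no single process is the culprit, the per-process CPU readings taken from Resource Monitor or Task Manager at intervals could be summarized along these lines. This is only a minimal Python sketch: the function names, the 50% threshold and the sample figures (`svc_a` etc.) are made up for illustration and are not measurements from this server.

```python
from collections import defaultdict


def summarize_cpu_samples(samples):
    """Summarize a list of samples, each a dict of process name -> CPU percent.

    Returns (average percent per process, how often each process was the top consumer).
    """
    totals = defaultdict(float)
    top_counts = defaultdict(int)
    for sample in samples:
        if not sample:
            continue
        for name, percent in sample.items():
            totals[name] += percent
        # The process with the highest reading in this interval.
        top = max(sample, key=sample.get)
        top_counts[top] += 1
    count = len(samples)
    averages = {name: total / count for name, total in totals.items()}
    return averages, dict(top_counts)


def has_stable_hog(samples, share=0.5):
    """True if one process is the top consumer in more than `share` of the samples."""
    if not samples:
        return False
    _, top_counts = summarize_cpu_samples(samples)
    return max(top_counts.values(), default=0) / len(samples) > share


# Hypothetical readings: a different process leads in every interval.
readings = [
    {"svc_a": 60.0, "svc_b": 30.0},
    {"svc_b": 55.0, "svc_c": 40.0},
    {"svc_c": 70.0, "svc_d": 25.0},
]
print(has_stable_hog(readings))
```

If `has_stable_hog` stays false over many samples while total usage is pegged, that matches what we are seeing: the load moves between processes rather than belonging to one of them.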
Based on my experience, it may be caused by a virus, by contention on common system file resources, or by bad blocks on the system disk. Could you post some detailed system logs from when the issue occurs?
There is no particular time frame; if we enable all services on the SBS and Hyper-V, it keeps using CPU indefinitely.
There is nothing unusual in any of the event logs.
At this stage it could be related to common system file resource contention, given the nature of the issue: it is impacting all VMs, and the CPU load goes down when some services are disabled.
Any thoughts on how to get more details, since there is not much help from the Event Viewer?
The host has 32 GB of RAM: 16 GB is assigned to the SBS VM and 4 GB to the Windows 7 VM.
Memory is not allocated dynamically.
The SBS VM has fixed VHD drives (C is attached to IDE, 200 GB in size with 100 GB free; D is attached to the SCSI controller, 400 GB in size with 200 GB free). The Windows 7 VM has one differencing VHDX drive.
The CPU on the host is an Intel Xeon E5-2430 at 2.20 GHz.
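As a quick sanity check on the memory side, the static assignments above can be added up to see what is left for the host. A small Python sketch using only the figures quoted here; the function name is made up, and it ignores Hyper-V's own per-VM overhead.

```python
def host_memory_headroom_gb(host_gb, vm_assignments_gb):
    """Return the RAM (in GB) left for the host after static VM assignments."""
    return host_gb - sum(vm_assignments_gb.values())


# Figures from this post: 32 GB host, 16 GB for SBS, 4 GB for Windows 7.
assignments = {"SBS 2011": 16, "Windows 7": 4}
print(host_memory_headroom_gb(32, assignments))
```

With 20 GB assigned statically, 12 GB remains for the host, so the host does not look overcommitted on memory.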
Your problem is one of two things.
Firstly, I smell a virus on your server/VMs. I would power off the existing VMs, then quickly create a fresh Win7 VM on Hyper-V, power it on and monitor the CPU utilization.
Secondly, I also think it could be a hardware problem. Check that your physical server is not running too hot and that the internal fan is still functioning (this might be the problem).
Dell Server Administrator has not detected any hardware issues since the very first day this issue occurred.
Also, I don't think this is an infection, because since Friday afternoon everything appears to be back to normal. There are no CPU spikes and no constant CPU usage. We enabled all the disabled services on the SBS server and rebooted the other VM, and there have been no issues so far.
We have yet to reboot the SBS VM and the Hyper-V host itself, but we just want to keep the SBS running as it is for the next few days and check the state.
We are having almost exactly the same problem, including using a Dell server. The server is a Dell T320 with an H710 caching RAID controller and 15k SAS drives.
My vote is for a hardware problem. We have the 'luxury' of having had this same box used as a 2008R2 Hyper-V host with a SQL server VM which had slowness issues but nothing that we could put our finger on. We tested and tested some more with no discernible hardware problems.
Finally we were under so much pressure to do something that we swapped roles with another server, a T310 which was just running SBS 2011 physically: we put Server 2012 with the Hyper-V role on the T320, along with SBS and the Terminal Server as VMs, then reformatted the T310 and installed SQL Server on it. Now the server that had SBS on it before is running SQL with no issues, while the T320 which had issues with SQL is now having issues as described above with SBS. Even the RDS server is using 70% of CPU at rest.
It doesn't matter which service is stopped; the others expand to fill the void. It is like removing a balloon from a box of balloons! You still have a full box...
It feels like something might be wrong with the processor or the Hypervisor bit (ring 0) implementation. I am at the point of calling Dell and exercising the Lemon Law. We all know how easy that will be.
I just wish there was some way to troubleshoot a problem like this.
Many thanks for letting everyone know your experience. I can understand your frustration; we went through the same.
Ours is also a Dell, a PowerEdge T420, so that is a common denominator in this issue.
The server in question belongs to one of our clients and we did not have a similarly specced server, so we could not perform the test you did, but it's really intriguing that even after a format and installation of a completely different OS the same issues remain. I too believe it's hardware related, but I guess it could be a combination of hardware and Windows 2012.
We were about to call MS for support, but the issue suddenly vanished, and SBS and the other VM have been running fine for a couple of weeks now.
As you mentioned, I wish there were a better way to troubleshoot issues like this, especially on servers in production.
I am experiencing the exact same issue. In my case the host and guests are 2008R2. All was running fine until last week, when users began to mention slow performance in some of the virtual machines. I have four virtual machines (all 2008R2 except for one XP machine, but that was not running during this testing). At first it was just one machine (an Exchange 2010 server). I suspected a virus or outside attack, but firewall blocking and scans have ruled that out.
I opened a ticket with MS, but specific to the one virtual machine. They don't have an answer yet, but asked me several times if it was limited to the one machine -- at the time it seemed to be, but now a second virtual machine is experiencing the same. As Dhanushka said, the high CPU processes seem irrelevant -- you kill or disable one and another, seemingly legitimate process takes its place.
I restored from a backup to a VMware Workstation machine and it was fine (10% utilization). I restored the same backup to my Hyper-V host and again got 100% utilization. I am confident this is something in Hyper-V, possibly driver or update related. If anyone has any more information, these are production machines and any help is appreciated.
Sorry to hear that you are experiencing the same nightmare as myself.
Is the physical server a Dell? If the answer is yes, then we may have a common denominator. Also, is the firmware on the Dell server up to date? If you can, get support from the hardware vendor and check whether a firmware bug in the RAID/SAS/SCSI components could be causing the problem.
I do not have any instructions or advice, because until now we don't know how it occurred or how it got resolved. We managed to bring the system to a workable state by disabling all but the absolute minimum of critical system services on both the host and the guests. I don't know if that's a possibility in your situation or if you have already tried that without any success.
But unlike myself you are in a better position since you have an open ticket with MS. Do let us know how things proceed and the actions taken to resolve the issue.
Best of Luck.
Yes, in fact I am running on a Dell PowerEdge R520. Three of the four virtual machines have been running for almost one full year, however. The problem only occurred fairly recently. I need to check -- it is possible that I recently made some updates -- I will check into that. This weekend will be my big troubleshooting weekend - I'll need to get it stabilized somehow by then.
I believe I have solved my problem. Yesterday I stopped my virtual clients and focused on the host - a Dell PowerEdge R520 with a PERC H310 mini RAID controller and dual Broadcom NICs. There were no significant errors in the Windows logs nor any in the Dell OpenManage Administrator. Although they were not terribly out of date, I chose to upgrade the BIOS, NIC Firmware, NIC Drivers, PERC Firmware and PERC Drivers. In addition, although they were up to date, I reinstalled the Chipset drivers.
After two restarts I fired up one of my virtual clients -- a 2008R2 Server running SQL Server 2008R2. It was running at about 60% CPU before the issue, but was sluggish. Idle CPU usage is now averaging about 2%.
I then started my Exchange server (2008R2/2010) which had been pegging at 100% with all services enabled. It is now averaging about 3% idle. Lastly, I started my 2008R2 Remote Desktop Server which had also been running at 90 to 100% and was extremely slow. Also using about 2% now.
So, after 24 hours, it is looking like we are back to normal. Unfortunately, I could not afford the downtime to check the fixes one by one. My guess is it was either the RAID (PERC) firmware or drivers or maybe the NIC firmware/drivers. I also don't know why it started... possibly a Microsoft update that was incompatible with a Dell driver? My suggestion is to ensure you have the latest firmware *and* drivers for your Hyper-V host. You might also try reinstalling the Integration Services on the client, but in my case it seemed to be host-based.
- Proposed as answer by MrSeanK Sunday, September 22, 2013 5:03 PM
Great to hear, and thanks for sharing your experience.
I too believe that there may be an incompatibility between Windows and the Dell firmware for this issue to occur in the first place, but having said that, how my issue got resolved without any firmware update is a complete mystery.
I had very similar issues and I can also say that this was a fix for me. Dell T420. I did everything but the RAID (BIOS, chipset, network, firmware/drivers).
I've had this issue ongoing for a while on my T620. We did the iDRAC update, BIOS update... worked for 3 weeks or so and bam, high CPU on the virtual machines.
Last night, Dell had me go into the BIOS and change the Performance Per Watt (DAPC) to just PERFORMANCE and the tech was almost certain this will totally resolve it. Will let you know.
Hello. Not sure if you saw my post from Sunday, September 22, 2013 5:02 PM above, but my issue was fixed after the stated updates. In my case I did not change any BIOS settings, but did update the BIOS -- which could have been the fix. If you don't want to go one by one, I would update all the drivers and firmware I listed, as we've never seen the problem come back. Good luck and let us know how you come out!
I've experienced this issue on several servers and changing that setting had no effect. We had noticed that if we drained the flea voltage by pulling the power cords and holding the power button for 20-30 seconds the servers would be fine for a month or so. Our problems cleared up for good after upgrading the iDRAC firmware to 1.57.57 or higher. See http://www.dell.com/support/article/us/en/04/SLN290366/EN which seems to corroborate our findings.
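If you have several servers to check against that 1.57.57 threshold, note that comparing the version strings as plain text gives wrong answers (for example "1.9.0" sorts after "1.57.57" as text). A minimal Python sketch of a numeric comparison follows; the function names are made up, and it assumes the version is purely dotted numbers like the one quoted above.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.57.57' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum="1.57.57"):
    """True if the installed version is at or above the minimum, compared numerically."""
    return parse_version(installed) >= parse_version(minimum)


# Hypothetical installed versions, not readings from real servers.
for version in ["1.56.55", "1.57.57", "1.9.0", "2.10.10"]:
    print(version, meets_minimum(version))
```

How you read the installed firmware version off each server is up to you; this only covers the comparison step.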