Hi fellow geeks,
I've got my hands on an HP DL580 G7 with 128 GB of RAM, 2x Xeon E7 8837 2.67 GHz, 5 SAS 146 GB 15K disks, Windows 2012 Standard edition and SQL Server 2012 SP1.
Disk 1 of bay 1 and disk 1 of bay 2 are mirrored together and hold the OS.
Disk 2 of bay 1 and disk 2 of bay 2 are mirrored and assigned to the page file (system managed).
There is another disk that is free and used as temporary file storage.
Two QLogic HBAs with 4 paths provide 10 500 GB of storage from an EMC SAN.
So it begins:
Everything is working fine and normally.
Then some XYZ application says SQL is down. Pinging the SQL server sometimes gets replies.
When I try to connect over Remote Desktop, it prompts for credentials and (SOMETIMES) even loads the blue screen that says it is loading the profile, but it gets stuck there, with the cursor moving and the waiting circle spinning. Nothing happens, and it stays like that until a manual reboot.
After a reboot everything comes back to normal, but the most agonizing thing about it is that there is not a single entry in Event Viewer. There is nothing in the iLO events either, and no error of any other sort in HP System Management.
This happens from time to time. Maybe today and tomorrow, maybe the next time is next week, and occurrences can be as much as 40 days apart.
There's nothing to tell me why this happens.
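Since the server itself logs nothing, one option is to probe it from another machine so the start of a hang at least gets a timestamp. The following is only a minimal sketch: it assumes the SQL instance listens on the default TCP port 1433, and the host name `sqlhost` in the usage note is a placeholder, not something from this setup. It separates "answers ping" from "accepts SQL connections", which matches the symptom described above.

```python
import socket
from datetime import datetime


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_line(host, port=1433, timeout=3.0):
    """Build one timestamped log line describing the port state."""
    state = "open" if tcp_port_open(host, port, timeout) else "NOT RESPONDING"
    return f"{datetime.now().isoformat(timespec='seconds')} {host}:{port} {state}"
```

Calling `print(probe_line("sqlhost"))` once a minute from a scheduled task on a second machine would give a rough record of when the port stopped answering.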
I've replaced the server with another server with the exact same configuration, but this happened to that one as well.
You're crying for me, right? Wait, there's more: I have two of these servers.
Yes, by that I mean I've replaced two brand new servers with two other brand new servers. This happens to all of them.
Firmware, drivers, HP applications, SQL and Windows are all up to date, and the power option is set to High Performance.
I'm going crazy over this.
Does anyone have any idea where this problem lies? Any idea whether this is a problem with SQL, Windows or even hardware?
Please advise me on what course of action I should take.
After that, what is everyone's opinion on installing ESXi on these servers and making each a dedicated host, with all of its resources serving only one VM (W2012-SQL2012), so that I can use the HA and DRS capabilities?
I've seen a similar thing on one of my test 2012 servers. In my case svchost.exe and VsTskMgr.exe (McAfee) were utilising a huge amount of memory. Again a reboot sorted it, and I've only seen it once, so I haven't investigated too much yet, I'm afraid. I was guessing a WMI leak was the likely culprit. Do you have any monitoring on these servers? Was memory utilisation increasing? If not, it may be worth setting something up (even if it's just capturing some perfmon data) before the next one.
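As a rough illustration of what could be done with captured perfmon data, here is a small sketch that reads a counter log and estimates whether available memory is trending downwards. It assumes the data was logged as two-column CSV (timestamp, value), for example with something like `typeperf "\Memory\Available MBytes" -si 60 -o mem.csv`; the function names are made up for this example, and a simple least-squares slope is only one way to spot a leak.

```python
import csv
import io


def parse_counter_csv(text):
    """Extract the numeric samples from two-column counter CSV text."""
    values = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # blank lines and one-column status messages
        try:
            values.append(float(row[1]))
        except ValueError:
            continue  # header row, or a sample with no numeric value
    return values


def slope_per_sample(values):
    """Least-squares slope of the samples against their index."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

A clearly negative slope for "Available MBytes" over several hours would support the leak theory; a flat line right up to the hang would point elsewhere.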
A quick search turned these up which may be worthwhile installing:
Regarding your ESXi question: if you have the physical resources and licenses/cash available, it would be worthwhile for these features. A few points are worth considering, though:
- HA will not kick in if VMware Tools is still responding, even if the server/application isn't
- DRS will only kick in if the ESX host is low on resources and there is a better candidate in the vSphere cluster, again not if the application is not responding
- I see you have a fair amount of CPU/RAM; you would need another server with that much resource free for HA/DRS to work
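The last point can be sanity-checked with simple arithmetic. The sketch below is a deliberate simplification and not how vSphere admission control actually calculates capacity: it just lists which hosts have enough free RAM and CPU to take the VM. The host names and all figures in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_ram_gb: float
    free_cpu_ghz: float


def failover_candidates(vm_ram_gb, vm_cpu_ghz, hosts):
    """Return the names of hosts with enough free RAM and CPU for the VM."""
    return [
        h.name
        for h in hosts
        if h.free_ram_gb >= vm_ram_gb and h.free_cpu_ghz >= vm_cpu_ghz
    ]
```

For example, `failover_candidates(120, 30, [Host("esx-a", 64, 40), Host("esx-b", 128, 45)])` returns `["esx-b"]`. If the list comes back empty, there is nowhere for the VM to restart.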
Hope this is of use to you!
Thank you for your reply, but I don't think that is my issue.
We have ManageEngine Applications Manager monitoring our servers (this one as well),
but we had this issue even before installing the monitoring services, and it doesn't explain why there are no new event log entries.
I'll check that this isn't happening here as well. Thanks anyway.