Logon process freezes on “Welcome screen”

  • Question

  • Hello!

    I have a two-node Windows Server 2008 DE x64 cluster, which consists of two HP ProLiant DL360 G5 servers and an HP MSA2000fc Fibre Channel array. It hosts a Hyper-V cluster (6 guests), 3 clustered SQL Server instances (a default SQL 2005 SE x64 instance and two SQL 2008 SE x64 named instances), and a File Server cluster. Approximately twice a month the following happens: when I try to log on (through Remote Desktop or the local console), the logon process freezes on the “Welcome screen”. I usually move the resources to the second node using the “Failover Cluster Management” console and reset the server. After the reset, the system log contains the message “The previous system shutdown at … on … was unexpected”, but the date and time it reports are two or more days in the past. Between that reported “unexpected shutdown” and the moment I reset the server, the system kept functioning from the clients' perspective.

     I think there is some problem with Hyper-V.

    Could you please help me to identify and resolve this issue?

    Aleksandrs

    Saturday, April 11, 2009 4:29 PM

All replies

  • So your configuration seems like it is fairly normal - just some clustered SQL and some VM guests. Are there any other errors showing up in the log before this event which are common to every time this occurs? There has to be a common thread to all of this; without more info we'd be taking a stab in the dark. So if you can give us some relevant log entries, we may be able to help more.
    Allan Hirt Blog: http://www.sqlha.com/blog Author: Pro SQL Server 2008 Failover Clustering (Apress - due out June, 2009)
    Saturday, April 11, 2009 11:45 PM
  • The situation is similar on both cluster nodes:

    ·         In the System log, the latest message is: Level = Error; Source = DistributedCOM; Event ID = 10010; “The server {73E709EA-5D93-4B2E-BBB0-99B7938DA9E4} did not register with DCOM within the required timeout.”

    ·         In the Application log, the latest message is: Level = Error; Source = WMI; Event ID = 10; “Event filter with query "select * from __instancemodificationevent within 30 where targetinstance isa 'Win32_PerfFormattedData_PerfDisk_LogicalDisk' and targetinstance.PercentFreeSpace < 1 and targetinstance.Name != '_Total'" could not be reactivated in namespace "//./root/CIMV2" because of error 0x80041002. Events cannot be delivered through this filter until the problem is corrected.”
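
    For readers decoding the error value in the WMI event above: 0x80041002 is an HRESULT, and splitting it into its bit fields gives facility 4 (FACILITY_ITF) with code 0x1002, which the WMI headers name WBEM_E_NOT_FOUND. A minimal Python sketch of that decoding follows. The helper names `decode_hresult` and `describe` and the small lookup table are illustrative assumptions, not part of any Microsoft tool.

```python
# Illustrative helpers (hypothetical names): decode an HRESULT into bit fields.

# A few WMI status codes as named in the WMI headers (WBEMSTATUS).
WBEM_STATUS_NAMES = {
    0x80041001: "WBEM_E_FAILED",
    0x80041002: "WBEM_E_NOT_FOUND",
    0x80041003: "WBEM_E_ACCESS_DENIED",
}


def decode_hresult(hresult):
    """Split a 32-bit HRESULT into severity, facility and code."""
    value = hresult & 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility field
        "code": value & 0xFFFF,             # low 16 bits
    }


def describe(hresult):
    """Format the decoded fields, adding a symbolic name when known."""
    value = hresult & 0xFFFFFFFF
    fields = decode_hresult(value)
    name = WBEM_STATUS_NAMES.get(value, "unknown")
    return "0x{:08X}: facility={} code=0x{:04X} ({})".format(
        value, fields["facility"], fields["code"], name
    )


if __name__ == "__main__":
    print(describe(0x80041002))
```

    A "not found" status would be consistent with the Win32_PerfFormattedData_PerfDisk_LogicalDisk class being unavailable when the filter was reactivated, although the log entry alone does not prove that.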

    I also noticed unusual behavior while installing one of the SQL 2008 instances. Setup reported an error while it was compiling MOF files. While investigating WMI, I found the error “no more threads could be created”. I restarted both nodes and repeated the installation, and everything went fine.

    Since the existing applications (SQL instances, Hyper-V guests) continue functioning, it seems that the servers (OS) simply cannot create new processes or threads.

    I can send any log file or other necessary information.

    Aleksandrs

    Sunday, April 12, 2009 8:02 PM
  • What it sounds like to me could be a few things potentially:
    1. Not sure if you have it configured or not, but MSDTC uses COM, so that may or may not be playing into things
    2. It sounds like something is trying to grab perf info (what do you use for monitoring?), and it's failing. So SQL and/or your VMs are consuming resources and not giving them up. What are you doing in your VMs and SQL? Are you using things like database mirroring?

    It sounds like your servers may be potentially underpowered and/or you have some configuration tweaks to make to have everything function properly.
    Allan Hirt Blog: http://www.sqlha.com/blog Author: Pro SQL Server 2008 Failover Clustering (Apress - due out June, 2009)
    Sunday, April 12, 2009 11:08 PM
  • Hello!

    1.       I have clustered MSDTC

    2.       I have MOM 2005 SP1 for monitoring, SCDPM 2007 SP1 for backup and SCVMM 2008 for virtual machine management.

    3.       SQL mirroring is not configured.

    4.       What I am doing in my VMs and SQL:

    a.       “SQLC1CLUSTER” SQL 2008 Standard Edition x64 named instance – 1C accounting application database (in production);

    b.      “SQLSPSCLUSTER” SQL 2008 Standard Edition x64 named instance – MOSS 2007 database (not actually used yet);

    c.       SQL 2005 SP3 Standard Edition x64 default instance – BizTalk 2006 R2 database (not actually used yet);

    d.      Hyper-V Guest1: Windows 2008 x32 domain controller (in production);

    e.      Hyper-V Guest2: Windows 2008 x32 – Terminal Services Session Broker and licensing (in production);

    f.        Hyper-V Guest3: Windows 2008 x64 – MOSS 2007 SP1 x64 application (not actually used yet);

    g.       Hyper-V Guest4: Windows 2003 x32 – BizTalk 2006 R2 application (not actually used yet);

    h.      Hyper-V Guest5: Windows 2008 x64 – SQL 2008 x64 Reporting with MOSS integration (not actually used yet);

    i.         Hyper-V Guest6: Windows 2008 x32 – IIS (not actually used yet);

    I have not done any configuration tweaks - all was installed and configured using Microsoft manuals and guides only.

    Aleksandrs
    Monday, April 13, 2009 6:32 PM
  • If you are having thread problems, it sounds like possible processor contention. You may want to look at the Process\Thread Count counter for _Total vs. the same counter for each process on the system to see what's consuming everything.

    Just curious - how much memory do you have in the nodes? There might be memory contention as well but it sounds more like processor to me.


    Allan Hirt Blog: http://www.sqlha.com/blog Author: Pro SQL Server 2008 Failover Clustering (Apress - due out June, 2009)
    Tuesday, April 14, 2009 1:41 AM
  • I have 32 GB of RAM on each node. Each SQL instance has a maximum memory limit of 4 GB. When all resources are on one node, 16 GB of RAM is used.

    About threads: there is nothing unusual. The two processes with the highest thread counts are:

    1.       \\CLUSTERNODE1\Process(hpsmhd#1)\Thread Count  = 257

    2.       \\CLUSTERNODE1\Process(System)\Thread Count = 160

    The total thread count is 1757. Summing the per-process thread counts gives a similar number.
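
    The comparison discussed here (per-process thread counts against the _Total value) can be sketched in Python as below. The three sample values are the ones quoted in this post. The _Total counter path is written in the same form as the other two and is an assumption, as are the function names.

```python
import re

# Matches paths such as \\HOST\Process(instance)\Thread Count
COUNTER_PATH = re.compile(
    r"^\\\\(?P<host>[^\\]+)\\Process\((?P<instance>[^)]+)\)\\Thread Count$"
)

# Values quoted in the thread; the _Total path form is assumed.
SAMPLES = {
    r"\\CLUSTERNODE1\Process(hpsmhd#1)\Thread Count": 257,
    r"\\CLUSTERNODE1\Process(System)\Thread Count": 160,
    r"\\CLUSTERNODE1\Process(_Total)\Thread Count": 1757,
}


def parse_samples(samples):
    """Return ({instance: threads}, total) from counter path/value pairs."""
    per_process, total = {}, None
    for path, value in samples.items():
        match = COUNTER_PATH.match(path)
        if match is None:
            continue  # not a Process\Thread Count path
        if match.group("instance") == "_Total":
            total = int(value)
        else:
            per_process[match.group("instance")] = int(value)
    return per_process, total


def rank_thread_counts(samples):
    """Per-process thread counts, largest first (the _Total row is excluded)."""
    per_process, _ = parse_samples(samples)
    return sorted(per_process.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    _, total = parse_samples(SAMPLES)
    for instance, threads in rank_thread_counts(SAMPLES):
        print("{:<12} {:>5} ({:.1%} of _Total)".format(instance, threads, threads / total))
```

    Fed with a full counter export instead of three rows, the same ranking would show whether any single process dominates the thread count.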

    I have no idea what to do. I think I will open an incident with Microsoft as a Gold Partner.

    Allan, thank you for your assistance!

    Best regards,

    Aleksandrs

    Tuesday, April 14, 2009 9:29 AM
  • Even with 32GB of RAM, you're running quite a bit of stuff there, and quite a bit of it is unused. This just seems like a classic capacity management issue.
    Allan Hirt Blog: http://www.sqlha.com/blog Author: Pro SQL Server 2008 Failover Clustering (Apress - due out June, 2009)
    Tuesday, April 14, 2009 9:46 PM
  • I'm having what seems to be exactly the same problem as you...

    Memory usage just climbs, wmiprvse.exe just keeps heading north.  Also on HP equipment too, but no clusters; just a simple SBS box.  Can I ask if Microsoft had any useful information for you?  (I'm not a Gold Partner!)

    I get exactly the same errors as you, and the frequency of the hang is quite irregular: anywhere from a couple of days to a few weeks.

    Any info you have on the fix would be greatly appreciated; I'm starting to get really annoyed with this one!

    Cheers,

    Gerard

    Monday, May 4, 2009 3:07 AM
  • Hi,

    Did you check that you are using the latest HP drivers on those machines? HP releases hotfixes for its drivers (you can subscribe to them for free).

    Aleksandrs, did you also check the disk IOPS? You can easily check this in Hyper-V as well.

    Your configuration has a lot of SQL databases, so I assume you need a lot of IOPS. There are several good sites where you can find information about calculating disk I/O.
    Here are two links that can help you.

    http://msexchangeteam.com/archive/2004/11/03/251743.aspx

    http://www.databasejournal.com/features/oracle/article.php/3731961/Measuring-Disk-IOA-Vendor-View.htm


    Greetings, Robert Smit http://fiberman.spaces.live.com/
    Tuesday, May 5, 2009 10:50 AM
    Moderator
  • We are having a similar situation: a three-node cluster that hosts 55 Hyper-V sessions. DPM was our backup. Shortly after DPM starts its protection, it causes the active node to hang. DPM tech support has suggested breaking our backup into smaller protection groups. We currently have one protection group that protects all 55 sessions. We are changing it to 5 smaller jobs of 11 sessions each, staged at different times. We are also replacing the four 74 GB hard drives that the OS and page file reside on with five 500 GB drives, as our page file is extremely large. We have 160 GB of RAM in each node of the cluster; the active node uses about 74% of that physical memory, and the page file is around 110 GB.
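
    The regrouping described above (55 sessions split into 5 groups of 11, started at different times) can be sketched in Python as follows. This only illustrates the arithmetic of the split. It does not call DPM, and the VM names, the first start time and the one-hour stagger are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta


def plan_protection_groups(vm_names, group_count, first_start, stagger):
    """Split vm_names into group_count near-equal groups with staggered starts."""
    size, extra = divmod(len(vm_names), group_count)
    groups, index = [], 0
    for n in range(group_count):
        take = size + (1 if n < extra else 0)  # spread any remainder over the first groups
        groups.append({
            "name": "group-{}".format(n + 1),
            "start": first_start + n * stagger,
            "members": vm_names[index:index + take],
        })
        index += take
    return groups


if __name__ == "__main__":
    vms = ["vm-{:02d}".format(i) for i in range(1, 56)]  # 55 hypothetical VM names
    plan = plan_protection_groups(
        vms,
        group_count=5,
        first_start=datetime(2009, 12, 14, 22, 0),  # hypothetical start time
        stagger=timedelta(hours=1),                 # hypothetical gap between jobs
    )
    for group in plan:
        print(group["name"], group["start"].strftime("%H:%M"), len(group["members"]))
```

    How far apart the jobs need to be depends on how long each group takes to back up, which has to be measured on the real cluster.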
    Timothy Doerr
    Monday, December 14, 2009 4:33 PM
  • Hello!

     

    My problem was resolved after installing Windows Server 2008 SP2. But I think the Microsoft recommendation to break the DPM backup into smaller protection groups is correct.


    Aleksandrs
    Monday, December 14, 2009 7:00 PM
  • I was having the same problem when I logged into our Hyper-V hosts using Remote Desktop.  We also use DPM 2010 to back up our Hyper-V environment.  I ran it by a Hyper-V support professional who was working with me on a private hotfix for Hyper-V at the time.  He gave me this link, which described exactly our issue.  http://support.microsoft.com/kb/982210
    Thursday, January 13, 2011 7:46 PM