Error 2912 No more threads

    Question

  • Hi folks,

     

    Is this a known error? One of my hosts is showing as "Needs Attention" with this error alongside the Refresh Host - System Job Command:

    Error (2912)

    An internal error has occurred trying to contact an agent on the hyperv1.ms.domain.local server.

    (No more threads can be created in the system (0x800700A4))

    Recommended Action

    Ensure the agent is installed and running. Ensure the WS-Management service is installed and running, then restart the agent.

     

    There's no pattern as to when this shows up in the VMM console, whether the console is running on my local workstation or on the VMM08 server. The host in question is a Server 2008 Enterprise box with 8 CPU cores, 20GB RAM, 300GB disk ... no errors in the event logs on either the host or the VMM server.

     

    I've seen that error (0x800700A4) in Google searches, but I don't see any circumstances that could possibly relate to this environment. If I stop the WinRM service (which also stops the VMM Agent), restart the WMI service and all its dependent services, and then start the VMM Agent again, the host shows up just fine. The Hyper-V and VMM servers are fully patched.
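For readers searching on the code itself: 0x800700A4 is a standard HRESULT that wraps a Win32 error. The sketch below splits it into its fields, which shows that the low 16 bits are 164 (ERROR_MAX_THRDS_REACHED), the same "No more threads can be created in the system" text that VMM prints. The function name and the small lookup table are illustrative only, not part of any VMM or Windows tooling.

```python
# Minimal sketch: split a Windows HRESULT into its standard bit fields.
# decode_hresult and KNOWN_WIN32_CODES are illustrative names, not a real API.

FACILITY_WIN32 = 7  # facility value used when an HRESULT wraps a Win32 error

# Only the code from this thread is listed here.
KNOWN_WIN32_CODES = {164: "ERROR_MAX_THRDS_REACHED"}


def decode_hresult(hresult: int) -> dict:
    """Return the failure bit, facility, and code of a 32-bit HRESULT."""
    value = hresult & 0xFFFFFFFF          # normalise negative/signed inputs
    code = value & 0xFFFF                 # bits 0-15: error code
    return {
        "failure": bool(value >> 31),     # bit 31: 1 means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility
        "code": code,
        "name": KNOWN_WIN32_CODES.get(code),
    }


if __name__ == "__main__":
    print(decode_hresult(0x800700A4))
```

Because the facility is Win32 (7), the error comes from the operating system rather than from VMM itself, which fits the observation that restarting WMI on the host clears it.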

     

    Any clues?

     

    Thanks

    Tony

     

     

     

    Monday, September 15, 2008 5:42 PM

Answers

  • Hi Tony,

    I somewhat suspected this. If the problem is caused by saturation of WinRM, it should appear when multiple jobs are running. You could try starting several host-level jobs at once, such as creating new VMs, cloning, updating properties, attaching ISO files, and migrating VMs. Each of these adds management overhead and requires communication about the status of the job, so if you have any sort of weakness in this area it will become apparent.

    -matt
    Wednesday, September 17, 2008 2:39 PM

All replies

  • Hi Tony,

     

    This sounds like it could be related to performance. How does performance on the box look? Occasionally WinRM will not be able to get any traction if the box is overloaded. Based on the specs you described, the most likely candidate is your disk. Can you please describe how your disks are configured? The second possibility would be your network, and the third your processor.

     

    -matt

    Monday, September 15, 2008 6:41 PM
  • Hi Matt,

     

    Thanks for the note.

     

    Performance on the box is good - ran performance logging on it for the week last week, and there aren't any bottlenecks.

     

    2 x Quad 2.8GHz Xeon CPUs

    20GB ECC RAM

    2 x 72GB 15k RPM SCSI U320 Disks as RAID1 for O/S

    6 x 72GB 15k RPM SCSI U320 Disks as RAID5 plus Hot Spare for Data

    4 x 1Gbps NICs

    Roles: Hyper-V

    Features: SNMP

     

    The five virtual machines on this host are extremely low priority machines - one development server, two developer test servers, one deprecated server running Lyris list manager, and one server with WSS 3.0 for an IT blog which doesn't even get one post per day.

    The five VMs have one CPU assigned each, and as a whole they use 6.5GB of RAM total. 215GB of the 340GB hard drive space is available. All five of these VMs were migrated from a smaller server running Virtual Server 2005.

     

    Obviously with SCVMM08 being beta we wouldn't be managing any major servers with it, and I'm hoping this is something we can get sorted by the RTM version (like the frequent console crashes, etc), because I *really* like SCVMM08!

     

    Further thoughts? Appreciate the help.

     

    Thanks

    Tony

     

    Monday, September 15, 2008 8:02 PM
  • Hi Tony,

    You shouldn't be seeing any performance issues.  How are the 4 x 1Gbps interfaces configured?  Are you doing any teaming, or are you splitting the load over them?  How is all that handled?

    Other than that, it looks like it might just be saturation of the protocol.  Are you seeing the WinRM (WS-Management) service failing or restarting?

    -matt
    Monday, September 15, 2008 8:13 PM
  • Hi again Matt,

     

    The four NICs have been teamed using 802.3ad, with their TOE and Large Send Offload disabled. The performance of the VMs themselves has been stellar when users have been connected, and I never see the WMI or WinRM services indicating any kind of error.

     

    The thing that's odd is that I have another server with lower hardware specs (16GB, 8 CPU cores, 200GB disk, 2 x 1Gbps NICs) and six busier VMs, and not a single problem with it.

     

    Thanks

    Tony

     

    Monday, September 15, 2008 8:56 PM
  • Hi Tony,

    I am sorry I was not being clear (I am a bit scatter-brained today).  When I said that the WS-Management (WinRM) protocol might be saturating, I meant on the SCVMM server, not the VM host.  Can you describe the hardware on your SCVMM server as well as your environment, including how many hosts and how many VMs you have?


    -matt
    Monday, September 15, 2008 9:05 PM
  • Hi Matt,

     

    Ah! No problem.

     

    The SCVMM server is virtual, and living on the host that is not having this problem. The SCVMM server is single processor, 2GB RAM, 2 x 60GB virtual disks, 1 x FE NIC.

    The event log for the SCVMM server shows no errors or warnings related to WinRM, WMI, or VMM, except for the VMM service crashes that sometimes happen (the service is set to restart on failure).

    We have a total of 2 hosts and 11 VMs. All VMs at this point in time are development, testing, or lab orientated, in other words only a handful of users each at any given time.

     

    Thanks

    Tony

     

    Monday, September 15, 2008 9:23 PM
  •  

    One other thing ... if it's the SCVMM server that's having the issue, how come restarting WinRM/WMI/HyperV on the problem server clears up the status? Hm.

     

    Thanks

    Tony

     

    Monday, September 15, 2008 9:26 PM
  • Hi Tony,

    Can you reproduce the issue?  In other words does this happen only on certain actions?
    Also, please describe the hardware of the host that the VMM server runs on.
    How does CPU utilization on the SCVMM server look?

    -matt
    Monday, September 15, 2008 9:55 PM
  • Good morning Matt, and thanks for your continued assistance.

     

    I can't reproduce the issue at will, as I don't know what triggers it. For example three times yesterday when I started the VMM console the server showed up as Needing Attention, but just now when I opened the console for the first time this morning the server shows as OK.

     

    The host the VMM server is on is a 2 x Quad Core 2.8GHz, 12GB RAM, 400GB HDD, Dual Gb NICs, fairly quiet VMs.

     

    I'll run a performance trace on the SCVMM server this morning and email back in four hours to let you know about the CPU.

     

    Thanks

    Tony

     

     

    Tuesday, September 16, 2008 1:05 PM
  • Hi Tony,

    What happens when you force a Refresh-VMHost on that host?  Do you know how to use PowerShell?

    -matt
    Tuesday, September 16, 2008 4:40 PM
  • Hi Matt,

     

    CPU usage on the SCVMM server for the last four hours has averaged a little over 1%, with spikes up to 10% usage.

     

    When the server is showing as Needs Attention and I try refreshing the host I usually get:

    Error (2606)

    Unable to perform the job because one or more of the selected objects are locked by another job.

    Recommended Action

    To find out which job is locking the object, in the Jobs view, group by Status, and find the running or canceling job for the object. When the job is complete, try again.

     

    But there is no active job listed in the Jobs view.

     

    I'm comfortable with PowerShell, but lazy, and tend to opt for the UI :)  If you have any other suggestions for the next time the machine is in the Needs Attention state, I'll be happy to try them out.

     

    Thanks

    Tony

     

    Tuesday, September 16, 2008 5:17 PM
  • And of course this morning there's no "Attention Needed" here either. It's making a liar out of me ...

     

     

    Wednesday, September 17, 2008 12:25 PM
  • Hi Tony,

    I somewhat suspected this. If the problem is caused by saturation of WinRM, it should appear when multiple jobs are running. You could try starting several host-level jobs at once, such as creating new VMs, cloning, updating properties, attaching ISO files, and migrating VMs. Each of these adds management overhead and requires communication about the status of the job, so if you have any sort of weakness in this area it will become apparent.

    -matt
    Wednesday, September 17, 2008 2:39 PM
  • I have not found the root cause, but I did find a workaround that quickly brings the Hyper-V host back to a functional state.

    When I see this error:

    Error (2912)
    An internal error has occurred trying to contact an agent on the servername.domain.com server.
    (No more threads can be created in the system (0x800700A4))
    Recommended Action
    Ensure the agent is installed and running. Ensure the WS-Management service is installed and running, then restart the agent.

    I go to Services and restart the "Windows Management Instrumentation" service. Restarting this service also restarts the following services (if they are on your server):

    • Hyper V Virtual Machine Management
    • Virtual Machine Manager Agent
    • Hyper-V Image Management Service
    • Hyper-V Networking Management Service
    • IP Helper
    • EMC PowerPath Service 5.1.2
    • SMS Agent Host

    Because of the kinds of services that are also restarted, I would suggest doing this with caution if the Hyper-V host is in production, and sending users a notification about the temporary outage. The outage is short, but its length depends on how long certain services take to start.

    Because the Hyper-V Image Management Service is restarted, users will not be able to remote control a virtual machine, or may be kicked out of their VM remote control session. And if you do this while connected to the Hyper-V host over Terminal Services, you may lose TS connectivity momentarily.
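The restart described above can also be scripted. The following is only a sketch, not a tested procedure: it builds the sequence of `net stop` / `net start` commands and prints them by default, so nothing is touched unless you switch off the dry run. The short service names in EXAMPLE_DEPENDENTS are assumptions mapped from the display names listed above; confirm them with `sc query` on your own host, and add any third-party services (such as the EMC PowerPath or SMS Agent Host services mentioned above) yourself.

```python
import subprocess

# ASSUMPTION: short service names for the display names in the post above.
# Verify each one on your host before running with dry_run=False.
EXAMPLE_DEPENDENTS = [
    "iphlpsvc",   # IP Helper
    "vmms",       # Hyper-V Virtual Machine Management
    "vhdsvc",     # Hyper-V Image Management Service
    "nvspwmi",    # Hyper-V Networking Management Service
    "VMMAgent",   # Virtual Machine Manager Agent
]


def build_restart_plan(dependents):
    """Return the commands that restart WMI and then start its dependents."""
    plan = [
        ["net", "stop", "winmgmt", "/y"],   # /y also stops dependent services
        ["net", "start", "winmgmt"],
    ]
    plan += [["net", "start", name] for name in dependents]
    return plan


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in plan:
        print(" ".join(command))
        if not dry_run:
            # check=False: 'net start' returns non-zero if already running
            subprocess.run(command, check=False)


if __name__ == "__main__":
    run_plan(build_restart_plan(EXAMPLE_DEPENDENTS), dry_run=True)
```

Run it from an elevated prompt on the Hyper-V host, and keep the caution above in mind: stopping WMI interrupts VM remote control sessions for as long as the dependent services take to come back.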

    Friday, January 30, 2009 8:09 PM
  • Hi there,

    Maybe you have already figured it out by now.

    I just found the article below:

    http://support.microsoft.com/kb/981314

    It describes a WMI class that leaks memory, which may be related to the issue you are encountering.

    Regards,

    Marius


    Virtual Orienteering - Find your way!

    Monday, November 10, 2014 9:28 AM