Specific Management Server causes Linux heartbeat failures

  • Question

  • Hi all.

    We have an environment with 2 management servers. Whenever one particular management server is added to a Unix/Linux resource pool, any Unix/Linux client server that attempts to connect to that server begins to suffer heartbeat failures. Here are the steps taken and the information gathered so far:

    • Current SCOM environment is SCOM 1807. The Linux servers are Ubuntu 14 and 16. Management packs are v7.7.1136.0.
    • Certificates have been exchanged between all SCOM servers. This has been double-checked.
    • Putting the offending management server into its own resource pool and using it to install the SCOM agent and configure a Linux/Unix server results in the server being configured and discovered, but after about 5 minutes the heartbeat begins to fail.
    • Moving the monitored client server from the situation described above to a resource pool without the offending management server results in the client heartbeating again.  Moving it back to the original resource pool with the broken management server will cause heartbeats to fail again.
    • The problem management server has its firewall turned off and it also has a bunch of Windows client agents using it as their primary management server without issue.  I can also ping from the Linux/Unix servers to this problem management server without issue.
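A side note on the ping test above: ping only exercises ICMP, and it was run from the Linux side. The management server reaches the Unix/Linux agent over TCP port 1270 (the port in the WS-Man URL used in the winrm test), so a TCP-level check run from the problem management server toward the Linux host is more telling. A minimal sketch, assuming Python is available on the management server; the function name and the host name in the comment are hypothetical placeholders:

```python
import socket


def can_connect(host: str, port: int = 1270, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return False


# Example call (placeholder host name, not from the original post):
# can_connect("linux-host.example.com")
```

This only confirms that the port is reachable. It says nothing about certificates or authentication, which the winrm test covers.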

    Could you help me?

    I already tried

    winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_Agent?__cimnamespace=root/scx -username:<UNIX/Linux user> -password:<UNIX/Linux password> -r:https://<UNIX/Linux server>:1270/wsman -auth:basic -encoding:utf-8

    and it worked with no issues.

    Thanks!

    Monday, September 9, 2019 7:19 PM

Answers

  • Hi Alex,

    As you suggested, I created 3 resource pools, to test with both servers together and with each one on its own.

    Now, it is working using any Resource Pool!

    It is weird, but as I was working with Microsoft in a support case, I had to:

    * Reset the proxy - "netsh winhttp reset proxy"

    * Restart the Microsoft Monitoring Agent service.

    I believe those steps solved the issue :)
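Before resetting, it can be useful to see whether a management server actually has a WinHTTP proxy set, and to compare the working and non-working servers. `netsh winhttp show proxy` prints the current setting. The sketch below is one hedged way to interpret that output in Python. It assumes English-language netsh output, where the no-proxy case is reported as "Direct access (no proxy server)."; the function names are hypothetical.

```python
import subprocess


def winhttp_proxy_configured(netsh_output: str) -> bool:
    """Return True unless the output reports direct access (no proxy).

    Assumption: English-language output of `netsh winhttp show proxy`.
    """
    return "direct access" not in netsh_output.lower()


def read_winhttp_proxy() -> str:
    """Run `netsh winhttp show proxy` (Windows only) and return its text."""
    result = subprocess.run(
        ["netsh", "winhttp", "show", "proxy"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# On a management server (Windows):
# print(winhttp_proxy_configured(read_winhttp_proxy()))
```

If the problem server reports a proxy and the healthy one does not, that is consistent with the reset described above having made the difference, although the thread itself does not confirm the root cause.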

    Thank you very much for your help.

    Wednesday, September 11, 2019 10:46 AM

All replies

  • Hi,

    Based on the above description, there does not seem to be a network or authentication problem. Are the offending management server and the other, working management servers in the same resource pool, or does each belong to a dedicated resource pool?

    You could try adding the offending management server to the same resource pool as the working management servers to see if it works.

    Hope the above information helps.

    Regards,

    Alex Zhu
    Tuesday, September 10, 2019 1:03 AM