Management server is critical and no alerts are generated in the console

    Question

  • I am new to SCOM 2016. While troubleshooting it, I found that the management server is in a critical state. I have attached screenshots of the unhealthy MS.


    I have done a lot of research on this, but with no success.

    • Edited by admin07 Monday, May 7, 2018 7:59 AM information
    Monday, May 7, 2018 7:57 AM

All replies

  • Hi,

    can you please post some more information about why the Management Server is gray? You can click on it -> open the Health Explorer -> navigate to the monitor at the bottom (the one which triggers the state change). After you click on it, you will see the State Change events in the right pane. What do you see there when you click on the last state change?

    Can you post a screenshot?

    Please also do the following:

    Open the event viewer on your management server and check what kind of events are logged. Please post the related ones here.

    What alerts have been logged with regard to the Management Group? Posting those will also help.

    Many thanks in advance!

    Regards,


    (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Monday, May 7, 2018 8:21 AM
    Moderator
  • Hi Stoyan,

    I have configured only one management server in my environment. I am posting the screenshots you asked for above.


    • Edited by admin07 Monday, May 7, 2018 9:44 AM
    Monday, May 7, 2018 9:42 AM
  • Hey,

    this looks suspiciously clean. Let's do the following:

    1. Empty the Health Service cache:

    - Stop the service "Microsoft Monitoring Agent" (Health Service).
    - Navigate to "X:\Program Files\Microsoft System Center 2016\Operations Manager\Server" and delete the folder "Health Service State"
    - Start the service and the folder will be recreated. 
    - The cache will then be reinitialized.
    - Check again for errors in the event log. What do you see there? Did the server turn green afterwards?

    2. Restart the SDK service

    Please do a restart of the SDK service and take a look at the events afterwards. Any new events?
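
The two steps above can be sketched in Python as a rough outline, not a tested procedure. It assumes the Windows service short names `HealthService` (displayed as "Microsoft Monitoring Agent") and `OMSDK` (displayed as "System Center Data Access Service"); check them with `sc query` first. It also renames the cache folder instead of deleting it, which is a deliberate deviation so the old cache can be restored. The function names are made up for this illustration.

```python
import subprocess
import time
from pathlib import Path
from typing import Optional

# Assumed service short names; verify them on your server before use.
HEALTH_SERVICE = "HealthService"  # "Microsoft Monitoring Agent"
SDK_SERVICE = "OMSDK"             # "System Center Data Access Service"


def set_aside_cache(server_dir: str) -> Optional[Path]:
    """Rename 'Health Service State' so it is rebuilt on the next start.

    Renaming (rather than deleting) keeps the old cache for a rollback.
    Returns the new path, or None if the folder does not exist.
    """
    cache = Path(server_dir) / "Health Service State"
    if not cache.is_dir():
        return None
    backup = cache.with_name(f"Health Service State.old-{int(time.time())}")
    cache.rename(backup)
    return backup


def reset_health_service_cache(server_dir: str) -> None:
    """Stop the agent, set the cache aside, start the agent, restart the SDK."""
    subprocess.run(["net", "stop", HEALTH_SERVICE], check=True)
    set_aside_cache(server_dir)
    subprocess.run(["net", "start", HEALTH_SERVICE], check=True)
    # Step 2 from the post: restart the SDK (Data Access) service.
    subprocess.run(["net", "stop", SDK_SERVICE], check=True)
    subprocess.run(["net", "start", SDK_SERVICE], check=True)
```

`reset_health_service_cache` would need an elevated prompt on the management server and the real install path (the `X:\...\Server` folder mentioned in step 1) as its argument.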

    Thanks. Regards,




    Monday, May 7, 2018 9:55 AM
    Moderator
  • Hi Stoyan,

    I have followed all these steps, but they did not solve my issue.


    • Edited by admin07 Monday, May 7, 2018 10:00 AM
    Monday, May 7, 2018 10:00 AM
  • Hi,

    what about the events logged after the restart of:

    1. The Health Service
    2. The SDK

    Can you please post those as well?

    Regards,



    Monday, May 7, 2018 10:09 AM
    Moderator
  • Hey,

    I have followed the instructions once again and am posting the event log here.



    • Edited by admin07 Monday, May 7, 2018 10:19 AM
    Monday, May 7, 2018 10:10 AM
  • Hi,

    Are there any error events in the logs? Please filter for error events.

    And what does your environment look like: how many Windows servers, Linux servers and network devices are you monitoring?

    Since you have only one MS, the reporting instance and everything else are running on it.

    And is your DB/DW connected, healthy and up as well?


    Cheers, Gourav Please remember to mark the replies as answers if it helped.

    Monday, May 7, 2018 11:05 AM
  • Hi,

    it seems that there is some issue with the SDK service account - "The data access service account might not have the required permissions".

    Can you please ensure the SDK service account has local administrative rights on the management server.

    In addition, it should have been made a member of the sdk_users role on the OpsMgr database:

    System Center Configuration Service and System Center Data Access Service account

    Please post back with the outcome. 

    Regards,



    Monday, May 7, 2018 11:17 AM
    Moderator
  • Hi,

    I need to monitor around 100 network devices and 20 Windows servers. I am not monitoring any Linux systems.

    The DB/DW is connected and healthy.

    Here are the screenshots from my SCOM.

    Amit


    • Edited by admin07 Monday, May 7, 2018 11:22 AM
    Monday, May 7, 2018 11:22 AM
  • I am also sending you screenshots of the error that I get on the database server.



    • Edited by admin07 Monday, May 7, 2018 11:25 AM
    Monday, May 7, 2018 11:25 AM
  • Seems this server is not able to communicate with the MS. Make sure it is not in Pending Management.

    And let me know the status of this server in the Monitoring pane: critical, healthy or warning?

    Also, are the MS and DB in the same domain or in different domains?


    Cheers, Gourav


    • Edited by GouravIN Monday, May 7, 2018 12:23 PM
    Monday, May 7, 2018 11:49 AM
  • This is not the main cause... that server has event ID 20070 on it.

    Cheers, Gourav

    Monday, May 7, 2018 11:50 AM
  • Hi,

    I have given the SDK service account local administrator rights on both the MS and the database server.


    Mandal Amit

    Monday, May 7, 2018 2:05 PM
  • Hi Amit,

    As per best practice, all three services should run under a domain account that has admin rights on the SCOM servers.

    And is your server (the DB, I guess) that has this 20070 event ID showing as healthy in the SCOM console?


    Cheers, Gourav

    Tuesday, May 8, 2018 4:03 AM
  • Hi Gourav,

    All the services are using the Local System account, and I have given the domain administrator account local admin rights on both the SCOM server and the DB server.


    Amit


    • Edited by admin07 Tuesday, May 8, 2018 4:16 AM
    Tuesday, May 8, 2018 4:15 AM
  • Thanks Amit. Did this issue appear after this modification or earlier?

    If the issue occurred after the change, then I think you missed some steps: permissions also need to be granted on the DB/DW, in the console and in SCOM. I suggest you follow this link

    It should shed some light on the issue and hopefully fix it, if the issue really falls into this category.


    Cheers, Gourav

    Tuesday, May 8, 2018 4:35 AM
  • Thanks Gourav,

    Both the SCOM server and the database server have been using a local account for all services since deployment.

    Do I need to change all the services to a domain account?

    When changing the service accounts, do I also need to configure the user mapping for Operations Manager in SQL?



    Mandal Amit

    Tuesday, May 8, 2018 5:07 AM
  • Yeah Amit, that's why I focused on it.

    Once we change the accounts, we need to map the user roles on the SCOM side as well as on the database side, because these accounts are responsible for all communication from SCOM to the DB/DW using Kerberos or X.509 authentication.

    So check the link I provided above and map all the rights in your SCOM.

    Hope this helps :)


    Cheers, Gourav


    • Edited by GouravIN Tuesday, May 8, 2018 5:21 AM
    Tuesday, May 8, 2018 5:21 AM
  • Thanks Gourav,

    I will configure all the services accordingly and let you know.

    Thanks once again.


    Mandal Amit

    Tuesday, May 8, 2018 5:23 AM
  • No need for thanks, sir.

    Yeah, do update us on this.


    Cheers, Gourav

    Tuesday, May 8, 2018 5:52 AM
  • Hi Gourav,

    Due to urgent work I had to go to my hometown, so I was unable to post an update on the SCOM issue.

    The management server is still in a critical state, and the health state of the management group functions is greyed out.

    Please find the screenshots herewith.


    Mandal Amit

    Tuesday, May 15, 2018 5:58 AM
  • Hi Amit,

    Are there any error events on the MS and DB/DW? Apart from this, I suggest you stop all three services, rename the Health Service State folder on the MS and clear the SCOM console cache as well. You can refer to these links:

    http://systemcenter.no/?p=215

    https://om2012.wordpress.com/2013/06/01/opsmgr-2012-how-to-clear-the-cache-in-the-opsmgr-console/ 


    Cheers, Gourav

    Tuesday, May 15, 2018 6:29 AM
  • Hi Gourav,

    I have followed all the steps suggested above, but with no success. I also did not find any errors on my DB/DW.

    If possible, I can give you remote access to my PC.


    Mandal Amit


    • Edited by admin07 Tuesday, May 15, 2018 7:34 AM
    Tuesday, May 15, 2018 7:12 AM
  • Hi Gourav,

    I have also seen this error on the Operations Manager server.


    Mandal Amit

    Tuesday, May 15, 2018 7:49 AM
  • Hi Amit,

    Can you please confirm whether the MS was healthy earlier?

    Have you configured/mapped the account in the DB and DW? I also suggest you revert the services to the local account and then make the changes one by one: first map and configure the account in the DB/DW, then perform the next step.


    Cheers, Gourav

    Tuesday, May 15, 2018 8:07 AM
  • Hi,

    this looks more like a consequence of the issue than a cause. Can you please right-click on the event log and set a filter to display only errors? Please post the other errors you see in the log after applying the filter. Let's see what else is there.
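
The same errors-only filter can also be applied from the command line. Here is a minimal Python sketch that wraps the built-in `wevtutil` tool. It assumes the log is named "Operations Manager" (the usual name on a management server) and that Level 2 is the Error level. The function names are made up for this illustration.

```python
import subprocess
from typing import List


def build_error_query(log_name: str = "Operations Manager", count: int = 20) -> List[str]:
    """Build a wevtutil command that returns the newest Error-level events."""
    return [
        "wevtutil", "qe", log_name,
        "/q:*[System[(Level=2)]]",  # XPath filter: Level 2 = Error
        f"/c:{count}",              # maximum number of events to return
        "/rd:true",                 # read newest events first
        "/f:text",                  # plain-text output instead of XML
    ]


def recent_errors(log_name: str = "Operations Manager", count: int = 20) -> str:
    """Run the query (Windows only) and return the rendered events as text."""
    result = subprocess.run(
        build_error_query(log_name, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `print(recent_errors(count=10))` on the management server gives text that can be pasted into a reply instead of a screenshot.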

    Thanks and regards,



    Tuesday, May 15, 2018 8:21 AM
    Moderator
  • Hi Gourav,

    The MS was in the same state earlier as well.

    Yes, I have configured/mapped the account in the DB and DW as well.


    Mandal Amit

    Tuesday, May 15, 2018 8:23 AM
  • As Stoyan suggested, please look only at the error events. Also check the Health Explorer of the MS and see if you find something. And one more thing: please check the connectivity between the MS and the DB/DW, in case a firewall or something similar is blocking it.

    Cheers, Gourav

    Tuesday, May 15, 2018 8:34 AM
  • Hi Gourav, I get error event ID 26319 when I recalculate the All Management Servers Resource Pool under the management group functions.

    Please find the screenshots of the error that I get.


    Mandal Amit

    Tuesday, May 15, 2018 10:17 AM
  • Hi Amit,

    If I am not wrong, you have only one MS, right?

    Cheers, Gourav


    • Edited by GouravIN Tuesday, May 15, 2018 10:30 AM
    Tuesday, May 15, 2018 10:29 AM
  • Yes Gourav,

    I do have only one MS.


    Mandal Amit

    Tuesday, May 15, 2018 10:37 AM
  • Amit,

    Is there a firewall between the DB/DW and the MS? And are you sure communication is happening within the expected time?

    Also, is your MS built on a VM?


    Cheers, Gourav

    Tuesday, May 15, 2018 10:41 AM
  • Yes, there is a firewall between the DB/DW and the MS.

    My MS and database are built on different VMs.

    Yes, I have checked, and communication is happening within the expected time. Please tell me how I can verify that my SCOM is communicating with the database, so that I can send you screenshots for your information.


    Mandal Amit

    Tuesday, May 15, 2018 10:51 AM
  • Finally, it looks like I am getting close to it.

    You can try turning off the firewalls between the SCOM management server and the DB/DW endpoint to make sure connectivity is working first, and then turn one firewall back on (with the necessary ports opened).

    And run the command below in CMD on the MS:

    telnet DBservername 1433
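
If the Telnet client is not installed on the MS, the same reachability test can be sketched in Python. This only checks that a TCP connection can be opened. It says nothing about SQL logins or permissions. Port 1433 is the SQL Server default and a named instance may listen elsewhere. The function name `can_connect` is made up for this illustration.

```python
import socket


def can_connect(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `can_connect("DBservername")` run on the MS should return `True` when the firewall lets the SQL port through.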


    Cheers, Gourav

    Tuesday, May 15, 2018 10:55 AM
  • Hi Gourav,

    As you suggested, I tested it, and I can telnet to the DB server port from the MS.

    I have done everything, but with no success.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 7:21 AM
    Wednesday, May 16, 2018 6:22 AM
  • It looks like everything is fine now:

    • You have mapped the accounts in the DB/DW
    • The MS can telnet to the DB
    • The DB is up and running, and so is the MS
    • No error events on the MS except 26319

    Now could you please confirm the ping response from the MS to the DB? How long does the first ping request take? Is it more than 1 second, or is it instant after hitting the Enter key?


    Cheers, Gourav

    Wednesday, May 16, 2018 7:25 AM
  • It takes 1 ms after hitting the Enter key.

    There are no errors on the SCOM server.

    I'm attaching screenshots of the DB server.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 7:39 AM
    Wednesday, May 16, 2018 7:32 AM
  • All looks fine. Could you please try to repair the Operations Manager installation?

    Before doing this, take a backup of the MPs and a snapshot backup of the server.


    Cheers, Gourav

    Wednesday, May 16, 2018 7:42 AM
  • Hi Gourav,

    Should I run the repair from the Operations Manager installation media?


    Mandal Amit

    Wednesday, May 16, 2018 7:59 AM
  • Yeah, open the SCOM installation media on the MS and run the setup. You will get a repair option, like this:

    Step 1:

    Step 2:

    Note: run this with an account that is an admin on the server as well as an admin in SCOM, and that also has sysadmin on the DB (precisely, a domain account with all of these rights).


    Cheers, Gourav

    Wednesday, May 16, 2018 8:13 AM
  • Hi Gourav, I have repaired the Operations Manager installation, but with no success.

    The issue is not solved yet, and I have not received any new notifications for the last two weeks.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 11:00 AM
    Wednesday, May 16, 2018 10:47 AM
  • Oops Amit...

    <<<The issue are not solved yet and did not get new notification since last two weeks.>>> What are you referring to here? Sorry, I did not get it.


    Cheers, Gourav

    Wednesday, May 16, 2018 11:05 AM
  • The issue of the management server being in a critical state is not solved, and new state changes of my monitored network devices are not showing up in the alerts.

    Mandal Amit

    Wednesday, May 16, 2018 11:08 AM
  • Please find the screenshot of the active alerts on my SCOM server.

    I need to solve this issue asap.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 11:15 AM
    Wednesday, May 16, 2018 11:13 AM
  • Could you please confirm that all the services are running on the MS, including the SQL services and SSRS?

    If yes, delete the Health Service State folder after stopping the services and check whether a new config file appears after some time.


    Cheers, Gourav

    Wednesday, May 16, 2018 11:27 AM
  • Hi Gourav,

    The Health Service State folder gets created after stopping the Microsoft Monitoring Agent and the management configuration service, but the config service file is not created. I am posting the screenshots herewith.


    Mandal Amit

    Wednesday, May 16, 2018 11:45 AM
  • Hi Gourav, if you are free, I can give you remote access to my server. My email ID is amit.mndl@outlook.com. You can email me for further information.

    Mandal Amit

    Wednesday, May 16, 2018 11:47 AM
  • Amit,

    Hold on! These files are not deleted and recreated when the services restart, and you should not delete them either (see my screenshot below).

     

    You only need to check the file "opsmgrConnector.config"; it should be updated and recreated.

    To check this, look in the folder below:

    C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Health Service State\Connector Configuration Cache\yourManagementgroupname
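
To see whether the connector config was actually recreated, a small Python sketch can list matching files with their age. The search pattern `OpsMgrConnector.config*` is an assumption about the exact file name and extension, and the folder to search is whatever "Health Service State" path your installation uses (the path above refers to 2012 R2). The function name is made up for this illustration.

```python
import time
from pathlib import Path
from typing import List, Tuple


def find_connector_configs(
    state_dir: str, pattern: str = "OpsMgrConnector.config*"
) -> List[Tuple[Path, float]]:
    """Return (path, age_in_minutes) for each matching file, newest first."""
    now = time.time()
    results = []
    # rglob searches state_dir and all of its subfolders.
    for path in Path(state_dir).rglob(pattern):
        if path.is_file():
            age_minutes = (now - path.stat().st_mtime) / 60.0
            results.append((path, age_minutes))
    return sorted(results, key=lambda item: item[1])
```

An empty result after the services have been running for a while would match what is described in this thread: the folder is rebuilt, but no configuration arrives from the management group.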

     


    Cheers, Gourav

    Wednesday, May 16, 2018 3:04 PM
  • Hi Gourav, as per your guidance I searched for the config file but did not find it in the management group folder.


    Mandal Amit

    Thursday, May 17, 2018 6:04 AM