Management server is critical and no alerts are generated in console

    Question

  • I am new to SCOM 2016. While troubleshooting SCOM 2016, I found the management server in a critical state. I have attached screenshots of the unhealthy MS herewith.


    I have done a lot of research on this, but with no success.

    • Edited by admin07 Monday, May 7, 2018 7:59 information
    Monday, May 7, 2018 7:57

All replies

  • Hi,

    can you please post some more information about why the Management Server is gray? You can click on it -> open the Health Explorer -> navigate to the monitor at the bottom (the one that triggers the state change). After you click on it, the right pane will show the State Change Events. What do you see there when you click on the last state change?

    Can you post a screenshot?

    Please also do the following:

    Open the event viewer on your management server and check what kind of events are logged. Please post the related ones here.

    What alerts have been logged in regards to the Management Group? Posting those will help also. 

    Many thanks in advance!

    Regards,


    (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Monday, May 7, 2018 8:21
    Moderator
  • Hi Stoyan,

    I have configured only one management server in my environment. I am posting the screenshots you asked for above.


    • Edited by admin07 Monday, May 7, 2018 9:44
    Monday, May 7, 2018 9:42
  • Hey,

    this looks suspiciously clean. Let's do the following:

    1. Empty the Health Service cache:

    - Stop the service "Microsoft Monitoring Agent" (Health Service).
    - Navigate to "X:\Program Files\Microsoft System Center 2016\Operations Manager\Server" and delete the folder "Health Service State".
    - Start the service and the folder will be recreated.
    - The cache will then be reinitialized.
    - Check again for errors in the event log. What do you see there? Did the server turn green afterwards?

    2. Restart the SDK service

    Please do a restart of the SDK service and take a look at the events afterwards. Any new events?
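
A minimal Python sketch of step 1 (stop the agent, delete the cache folder, start the agent again), for anyone who wants to script it. It rests on assumptions not stated in the thread: that "HealthService" is the short service name behind the "Microsoft Monitoring Agent" display name, and that `net stop` / `net start` are run from an elevated prompt on the management server. The function name and the injectable `service_control` parameter are illustrative only.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List

# Assumption: short service name of "Microsoft Monitoring Agent".
# Confirm it on your server (for example with `sc query`) before use.
SERVICE_NAME = "HealthService"
CACHE_FOLDER = "Health Service State"


def run_net(action: str, service: str) -> None:
    """Run `net stop <service>` or `net start <service>` (Windows only)."""
    subprocess.run(["net", action, service], check=True)


def clear_health_service_cache(
    server_dir: Path,
    service_control: Callable[[str, str], None] = run_net,
) -> List[str]:
    """Stop the agent, delete the cache folder, then start the agent.

    Returns the steps that were performed, in order.
    """
    steps: List[str] = []
    cache = Path(server_dir) / CACHE_FOLDER

    service_control("stop", SERVICE_NAME)
    steps.append("stopped")
    try:
        if cache.is_dir():
            shutil.rmtree(cache)
            steps.append("deleted")
    finally:
        # Always try to bring the service back, even if the delete failed.
        service_control("start", SERVICE_NAME)
        steps.append("started")
    return steps
```

On the management server this would be called with the "Server" folder from the path above, keeping in mind that "X:" is a placeholder for the actual installation drive. The service recreates the folder on start, so the script does not create it itself.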

    Thanks. Regards,


    Stoyan Chalakov


    Monday, May 7, 2018 9:55
    Moderator
  • Hi Stoyan,

    I have followed all these steps, but they did not solve my issue.


    • Edited by admin07 Monday, May 7, 2018 10:00
    Monday, May 7, 2018 10:00
  • Hi,

    What about the events logged after restarting:

    1. The Health Service
    2. The SDK

    Can you please post those as well?

    Regards,


    Stoyan Chalakov

    Monday, May 7, 2018 10:09
    Moderator
  • Hey,

    I have followed the instructions once again and am posting the event log here.



    • Edited by admin07 Monday, May 7, 2018 10:19
    Monday, May 7, 2018 10:10
  • Hi,

    Are there any error events in the logs? Please filter for error events.

    Also, what does your environment look like: how many Windows servers, Linux servers and network devices are you monitoring?

    Since you have only one MS, I assume the reporting instance and everything else run on it.

    And are your DB and DW connected, healthy and up as well?


    Cheers, Gourav Please remember to mark the replies as answers if it helped.

    Monday, May 7, 2018 11:05
  • Hi,

    it seems that there is an issue with the SDK service account: "The data access service account might not have the required permissions".

    Can you please ensure the SDK service account has local administrative rights on the management server?

    In addition, it should have been made a member of the sdk_users role on the OpsMgr database:

    System Center Configuration Service and System Center Data Access Service account

    Please post back with the outcome.

    Regards,


    Stoyan Chalakov

    Monday, May 7, 2018 11:17
    Moderator
  • Hi,

    I need to monitor around 100 network devices and 20 Windows servers; I am not monitoring any Linux systems.

    The DB/DW is connected and healthy.

    Here are the screenshots from my SCOM.

    Amit


    • Edited by admin07 Monday, May 7, 2018 11:22
    Monday, May 7, 2018 11:22
  • I am also sending you screenshots of the error that I get on the database.



    • Edited by admin07 Monday, May 7, 2018 11:25
    Monday, May 7, 2018 11:25
  • It seems this server is not able to communicate with the MS. Make sure it is not in Pending Management.

    And let me know the status of this server in the Monitoring pane: critical, healthy or warning?

    Also, are the MS and DB in the same domain or in different domains?


    Cheers, Gourav


    • Edited by GouravIN Monday, May 7, 2018 12:23
    Monday, May 7, 2018 11:49
  • This is not a major cause. The screenshot shows event ID 20070.

    Cheers, Gourav

    Monday, May 7, 2018 11:50
  • Hi,

    I have given the SDK service account local administrator rights on both the MS and the database server.


    Mandal Amit

    Monday, May 7, 2018 14:05
  • Hi Amit,

    As per best practice, all three services should run under a domain account that has admin rights on the SCOM servers.

    And is the server (the DB, I guess) that logs this event ID 20070 showing as healthy in the SCOM console?


    Cheers, Gourav

    Tuesday, May 8, 2018 4:03
  • Hi Gaurav,

    All the services are using the Local System account, and I have given the domain administrator account local admin rights on both the SCOM server and the DB server.


    Amit


    • Edited by admin07 Tuesday, May 8, 2018 4:16
    Tuesday, May 8, 2018 4:15
  • Thanks Amit. Did this issue appear after this modification or earlier?

    If the issue occurred after this change, I think you missed some steps: we need to grant permissions on the DB/DW, in the console and in SCOM as well. I suggest you follow this link

    It should shed some light on the issue and hopefully fix it, if the issue really falls into this category.


    Cheers, Gourav

    Tuesday, May 8, 2018 4:35
  • Thanks Gaurav, 

    Both the SCOM server and the database server have used a local account for all services since deployment.

    Do I need to change all the services to a domain account?

    When changing the service account, do I also need to configure the user mapping for the Operations Manager database in SQL Server?



    Mandal Amit

    Tuesday, May 8, 2018 5:07
  • Yes Amit, that is why I focused on it.

    Once we change the account, we need to map the user roles on the SCOM side as well as on the database side, because these accounts are responsible for all communication from SCOM to the DB/DW using Kerberos or X.509 authentication.

    So check the link I provided above and map all the rights in your SCOM.

    Hope this helps :)


    Cheers, Gourav


    • Edited by GouravIN Tuesday, May 8, 2018 5:21
    Tuesday, May 8, 2018 5:21
  • Thanks Gaurav,

    I will configure all the services accordingly and keep you informed.

    Thanks once again.


    Mandal Amit

    Tuesday, May 8, 2018 5:23
  • No need for thanks, sir.

    Yes, please do update us on this.


    Cheers, Gourav

    Tuesday, May 8, 2018 5:52
  • Hi Gaurav,

    Due to some sudden work I had to go to my hometown, so I was unable to post an update on the SCOM issue.

    The management server is still in a critical state, and the health state of the management group functions is greyed out.

    Please find the screenshots herewith.


    Mandal Amit

    Tuesday, May 15, 2018 5:58
  • Hi Amit,

    Are there any error events on the MS and the DB/DW? Apart from this, I suggest stopping all three services, renaming the Health Service State folder on the MS, and clearing the cache of the SCOM console as well. You can refer to these links:

    http://systemcenter.no/?p=215

    https://om2012.wordpress.com/2013/06/01/opsmgr-2012-how-to-clear-the-cache-in-the-opsmgr-console/


    Cheers, Gourav

    Tuesday, May 15, 2018 6:29
  • Hi Gaurav,

    I have followed all the steps suggested above, but with no success. I also did not find any errors on my DB/DW.

    If possible, I can give you remote access to my PC.


    Mandal Amit


    • Edited by admin07 Tuesday, May 15, 2018 7:34
    Tuesday, May 15, 2018 7:12
  • Hi Gaurav, 

    I have also seen this error on the Operations Manager server.


    Mandal Amit

    Tuesday, May 15, 2018 7:49
  • Hi Amit,

    Please confirm: was the MS healthy earlier?

    Have you configured and mapped the account in the DB and DW? I suggest reverting the changes back to the local account and then making changes one at a time: first map and configure the account in the DB/DW, then perform the next step.


    Cheers, Gourav

    Tuesday, May 15, 2018 8:07
  • Hi,

    this looks more like a consequence of the issue than a cause. Can you please right-click the event log and set a filter to display only errors? Please post the other errors you see in the log after applying the filter. Let's see what else is there.
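
If the log has been saved from Event Viewer as a CSV file, the same "errors only" filter can be sketched in Python. This is only an illustration under assumptions: it expects a header row with a "Level" column, and the `error_events` helper and the "Message" key used for extra, unnamed columns are hypothetical names, not part of SCOM or Event Viewer.

```python
import csv
import io
from typing import Dict, Iterable, List


def error_events(csv_text: str, levels: Iterable[str] = ("Error",)) -> List[Dict]:
    """Return only the rows whose Level column matches one of `levels`.

    Assumes the CSV has a header row containing a "Level" column.
    Any trailing columns without a header are collected under "Message".
    """
    wanted = {level.lower() for level in levels}
    reader = csv.DictReader(io.StringIO(csv_text), restkey="Message")
    return [
        row
        for row in reader
        if (row.get("Level") or "").strip().lower() in wanted
    ]
```

Reading the exported file into a string and passing it to `error_events` gives a short list that is easier to paste into a reply than a full log; the file encoding may need adjusting for your export.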

    Thanks and regards,


    Stoyan Chalakov

    Tuesday, May 15, 2018 8:21
    Moderator
  • Hi Gaurav,

    The MS was in the same state earlier as well.

    Yes, I have configured and mapped the account in the DB and DW as well.


    Mandal Amit

    Tuesday, May 15, 2018 8:23
  • As Stoyan suggested, please look only at the error events. Also check the Health Explorer of the MS and see whether it shows anything. One more thing: please check the connectivity between the MS and the DB/DW, in case a firewall or something similar is blocking it.

    Cheers, Gourav

    Tuesday, May 15, 2018 8:34
  • Hi Gaurav, I get error event ID 26319 when I recalculate the All Management Servers Resource Pool under the management group functions.

    Please find the screenshots of the error that I get.


    Mandal Amit

    Tuesday, May 15, 2018 10:17
  • Hi Amit,

    If I am not wrong, you have only one MS, right?

    Cheers, Gourav


    • Edited by GouravIN Tuesday, May 15, 2018 10:30
    Tuesday, May 15, 2018 10:29
  • Yes Gaurav,

    I have only one MS.


    Mandal Amit

    Tuesday, May 15, 2018 10:37
  • Amit, 

    Is there a firewall between the DB/DW and the MS, and are you sure communication is happening within the expected time?

    Also, is your MS built on a VM?


    Cheers, Gourav

    Tuesday, May 15, 2018 10:41
  • Yes, there is a firewall between the DB/DW and the MS.

    My MS and database are built on different VMs.

    Yes, I have checked, and communication is happening within the expected time. Please tell me how I can verify that my SCOM server is communicating with the database, so that I can send you screenshots.


    Mandal Amit

    Tuesday, May 15, 2018 10:51
  • Finally, it looks like I am getting close to it.

    "You can try turning off the firewalls between the SCOM Management Server and the DB/DW endpoint to ensure connectivity is working first, and then turn one firewall on (with the necessary ports opened)."

    And run the command below from a command prompt on the MS:

    telnet DBservername 1433
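
Where the Telnet client is not installed on the management server, the same reachability check can be approximated with a few lines of Python. This is a sketch with a hypothetical helper name, `can_connect`; port 1433 is taken from the command above, and a named SQL instance or a custom port would need a different value.

```python
import socket


def can_connect(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A `True` result only shows that the port is reachable through the firewall. It says nothing about whether the SCOM service accounts can actually log in to the databases.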


    Cheers, Gourav

    Tuesday, May 15, 2018 10:55
  • Hi Gaurav,

    As you suggested, I checked, and I can telnet to the DB server port from the MS.

    I have done everything, but with no success.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 7:21
    Wednesday, May 16, 2018 6:22
  • It looks like everything is fine now:

    • you have mapped the accounts in the DB/DW
    • the MS can telnet to the DB
    • the DB is up and running, and so is the MS
    • there are no error events on the MS except 26319

    Now, could you please confirm the ping response from the MS to the DB? How long does the first ping reply take: more than 1 second, or is it instant after hitting the Enter key?


    Cheers, Gourav

    Wednesday, May 16, 2018 7:25
  • It takes 1 ms after hitting the Enter key.

    There are no errors on the SCOM server.

    I am attaching screenshots from the DB server.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 7:39
    Wednesday, May 16, 2018 7:32
  • Everything looks fine. Could you please try to repair the Operations Manager installation?

    Before doing this, take a backup of the management packs and a snapshot backup of the server.


    Cheers, Gourav

    Wednesday, May 16, 2018 7:42
  • Hi Gaurav,

    Should I run the repair from the Operations Manager installation media?


    Mandal Amit

    Wednesday, May 16, 2018 7:59
  • Yes, open the SCOM installation media on the MS and run the setup. You will get a repair option, as I do here:

    Step 1:-

    Step 2:-

    Note: run this with an account that is an administrator on the server and in SCOM and that also has sysadmin on the DB (precisely, a domain account with all of these rights).


    Cheers, Gourav

    Wednesday, May 16, 2018 8:13
  • Hi Gaurav, I have repaired the Operations Manager installation, but with no success.

    The issue is not solved yet, and I have not received any new notifications for the last two weeks.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 11:00
    Wednesday, May 16, 2018 10:47
  • Oops, Amit...

    <<<The issue is not solved yet, and I have not received any new notifications for the last two weeks.>>> Sorry, I did not understand what you are referring to here.


    Cheers, Gourav

    Wednesday, May 16, 2018 11:05
  • The issue of the management server being in a critical state is not solved, and new state changes of my monitored network devices are not showing up as alerts.

    Mandal Amit

    Wednesday, May 16, 2018 11:08
  • Please find the screenshot of the active alerts on my SCOM server.

    I need to solve this issue ASAP.


    Mandal Amit


    • Edited by admin07 Wednesday, May 16, 2018 11:15
    Wednesday, May 16, 2018 11:13
  • Could you please confirm that all the services are running on the MS, including the SQL services and SSRS?

    If yes, stop the services, delete the Health Service State folder, and check whether a new config file is created after some time.


    Cheers, Gourav

    Wednesday, May 16, 2018 11:27
  • Hi Gaurav,

    The Health Service State folder is recreated after restarting the Microsoft Monitoring Agent and the Management Configuration service, but the config file is not created. I am posting the screenshots herewith.


    Mandal Amit

    Wednesday, May 16, 2018 11:45
  • Hi Gaurav, if you are free, I can give you remote access to my server. My email address is amit.mndl@outlook.com. You can email me for further information.

    Mandal Amit

    Wednesday, May 16, 2018 11:47
  • Amit,

    Hold on! These files are not deleted and recreated when the services restart, so do not delete them yourself either (see my screenshot below).

     

    You only need to check the file "opsmgrConnector.config"; it should be updated and recreated.

    To check this, look in the following folder:

    C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Health Service State\Connector Configuration Cache\yourManagementgroupname
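
To see whether the connector config file exists and when it was last written, a small Python sketch can walk the Health Service State folder. The helper name `find_connector_configs` and the case-insensitive pattern are assumptions for illustration; the exact file name and extension may differ from the one quoted above, which is why the pattern ends in a wildcard. The path above is from a 2012 R2 install; earlier in this thread the SCOM 2016 folder was given as "Microsoft System Center 2016".

```python
import fnmatch
from datetime import datetime
from pathlib import Path
from typing import List, Tuple


def find_connector_configs(
    state_dir: Path, pattern: str = "opsmgrconnector.config*"
) -> List[Tuple[Path, datetime]]:
    """Return (path, last-modified time) for matching files, newest first.

    The match is case-insensitive and searches all subfolders of state_dir.
    """
    hits: List[Tuple[Path, datetime]] = []
    for path in Path(state_dir).rglob("*"):
        if path.is_file() and fnmatch.fnmatch(path.name.lower(), pattern.lower()):
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            hits.append((path, modified))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

An empty result after the services have been running for a while would suggest that the management server is not receiving configuration, which matches the symptom being chased here.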

     


    Cheers, Gourav

    Wednesday, May 16, 2018 15:04
  • Hi Gaurav, as per your guidance I searched for the config file but did not find it in the management group folder.


    Mandal Amit

    Thursday, May 17, 2018 6:04