SCOM 2007 R2 generating false alerts for services on Exchange Cluster

  • Question

  • Hi,

    We have SCOM 2007 R2 configured in our environment and have imported the latest MP for Exchange (2003 and 2007) monitoring. For the last couple of days, we have been getting alerts like this:

    Alert: Mail Submission Service Stopped

    Source: SCC Cluster (<server name>)

    Path: <server name>

    Last modified by: System

    Last modified time: 4/12/2010 10:13:17 PM

    Alert description: The Mail Submission Service on Exchange server <server name> has stopped running

     

    This kind of alert is raised for all the services on the cluster node. When we check the state of the services, all of them are running fine. Please help me resolve these false alerts.

     

    Thank you

    RBhise

     

    Friday, May 7, 2010 2:24 PM

Answers

  • What are you seeing in the agent logs for these servers at the time of the alert?  Have you looked at the monitor to see what it is looking for?  Is agent proxy enabled on all your Exchange servers?  Is cluster event log replication turned off?  (I am not sure whether SCC clusters are actual Windows clusters; I will find out soon, as we are moving to Exchange 2007.)

    To troubleshoot this "bogus" alert you first need to understand what the monitor does.  Then you have to understand how these boxes are built.  You also have to comb through the Operations Manager event log for odd issues.

    Do you have a test lab with SCOM and Exchange 2007 where you can actually break things to try to replicate this issue?  Are the alerts stale (old)?  What is the latency on the alert?  Is this a monitor or a rule?

    More detailed information about your situation is needed in order to really give you some good guidance.

     


    Regards, Blake Email: mengottohotmail.com Blog: http://discussitnow.spaces.live.com
    • Marked as answer by Yog Li Friday, May 14, 2010 9:55 AM
    Wednesday, May 12, 2010 5:43 PM
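
One way to put a number on the "is the alert stale?" question above is to compare the alert's timestamp with the time the service state was checked by hand. The following is only a minimal sketch: the function names and the 15-minute threshold are hypothetical, and the timestamp layout is assumed to be the US month/day format shown in the alert text in the question.

```python
from datetime import datetime, timedelta

# Timestamp layout as it appears in the alert text, e.g. "4/12/2010 10:13:17 PM".
# Assumption: US month/day order; adjust if the console uses another locale.
ALERT_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def alert_latency(alert_time: str, checked_time: str) -> timedelta:
    """Return how much time passed between the alert and the manual check."""
    raised = datetime.strptime(alert_time, ALERT_TIME_FORMAT)
    checked = datetime.strptime(checked_time, ALERT_TIME_FORMAT)
    return checked - raised


def is_stale(alert_time: str, checked_time: str,
             threshold: timedelta = timedelta(minutes=15)) -> bool:
    """Hypothetical rule of thumb: an alert older than `threshold` is stale."""
    return alert_latency(alert_time, checked_time) > threshold
```

If the gap between the alert time and the moment the services were seen running is large, the alert may describe an earlier state of the node rather than the current one, which is worth ruling out before digging into the monitor itself.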

All replies

  • What changed in the last couple of days?
    Microsoft Corporation
    Monday, May 10, 2010 2:47 PM
  •  Hi RBhise,

    As this thread has been quiet for a while, we assume that the issue has been resolved. At this time, we will mark it as "Answered" as the previous steps should be helpful for many similar scenarios.

    In addition, we’d love to hear your feedback about the solution. By sharing your experience you can help other community members facing similar problems.

    Thanks,


    Yog Li - MSFT
    Friday, May 14, 2010 10:05 AM
  • Hello Yog Li,

    Is it mandatory to disable event log replication between cluster nodes? I believe this is happening because event log replication is overloading the cluster nodes, so they do not respond in a timely manner to the monitor that checks service status on the cluster.

    Please suggest.

    -HRP


    Thanks & regards, H R Parikh
    Friday, May 14, 2010 1:15 PM
  • Yes, if you are using SCOM or MOM to monitor a cluster you need to disable cluster event log replication. Otherwise you will run into odd issues: event rules will report "Error XYZ occurred on MYCLUSTERN1" from the MYCLUSTERN2 event log and cause major confusion for operators.
    Regards, Blake Email: mengottohotmail.com Blog: http://discussitnow.spaces.live.com
    Monday, May 17, 2010 12:17 AM
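
To illustrate the confusion described above, here is a small sketch that flags events whose recorded computer name differs from the node whose log they were read from. The `EventRecord` type and its fields are hypothetical and exist only for this illustration; they are not part of SCOM or of any management pack.

```python
from typing import List, NamedTuple


class EventRecord(NamedTuple):
    """Hypothetical, simplified view of one event collected by an agent."""
    log_host: str     # node whose event log the entry was read from
    source_host: str  # computer name recorded inside the event itself
    message: str


def replicated_events(events: List[EventRecord]) -> List[EventRecord]:
    """Return the events that were logged on one node but describe another."""
    return [
        e for e in events
        if e.log_host.lower() != e.source_host.lower()
    ]


sample = [
    EventRecord("MYCLUSTERN2", "MYCLUSTERN1", "Error XYZ occurred"),
    EventRecord("MYCLUSTERN2", "MYCLUSTERN2", "Service started"),
]
suspects = replicated_events(sample)  # only the first record is flagged
```

With replication turned on, an event rule running against MYCLUSTERN2 can match the first record even though the problem happened on MYCLUSTERN1, which is the mismatch operators end up chasing.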
  • Hello Blake,

    We have now disabled (turned off) event log replication between the Exchange cluster nodes, but we are still getting false alerts about services being stopped. Any other suggestions?

    Thanks

    HRP


    Thanks & regards, H R Parikh
    Tuesday, May 25, 2010 7:11 AM
  • Have you downloaded the latest Cluster Management Pack?  If so, there was an early update where you had to REMOVE the original Clustering Management Pack before upgrading to the new one.  You might try deleting the old Management Pack and importing the latest.
    Thursday, May 27, 2010 10:34 PM
  • Folks, I have a similar problem. I am using SCOM 07 with Exchange 2007, and we continuously get critical errors on the Exchange mailbox passive node (two servers in the cluster) showing the FCSmonitor State and FCScontroller services stopped. We manually do a NetStat to restart the services, but they keep getting stopped automatically after a period of time (maybe because it is a passive node), and we end up in this loop. Need your urgent help on this.
    Wednesday, June 2, 2010 7:22 AM