SP1 Monitor...Log Files, Text Log, Repeated Event Detection

  • Wednesday, June 03, 2009 4:05 AM
     
     
    Hi,
    I am trying to monitor for text in a generic log file (not an Event Viewer log), but I get stuck at two points:
    building the Repeated Event Expression, and choosing the Counting Mode on the next screen.

    So the scenario is like this:
    I have a series of files.... XYZ.0, XYZ.1, XYZ.2 etc up to XYZ.9 on one server, Server1.

    Within the latest of those files (XYZ.....) I want to find the text "Failed to establish a connection to SQL Server." as it is written to the file.
    If it appears 3 times within 1 minute, an alert should be triggered.

    Can anyone please point out how to do these steps, or a website where the details are discussed?
    Here is the full line from one of the text files that needs scanning....

    05/25 13:23:08.13 INFPWAA0 sqlagt:select2 3480 ERROR sqlclass.c 668 Failed to establish a connection to SQL Server. ReasonUnknown Error

    I am thinking from the above that the Parameter Name = 9 (or maybe it should be 668??), Operator = Contains, and Value = Failed to establish a connection to SQL Server.

    Please correct any errors, my brain is fried trying to figure it out.....
    Thank you,
    John Bradshaw
    • Edited by bradje Wednesday, June 03, 2009 4:26 AM

All Replies

  • Wednesday, June 03, 2009 4:40 AM
    Moderator
     
     
  • Wednesday, June 03, 2009 10:35 AM
    Moderator
     
     
    As in the doc mentioned by Alexey, you use the generic log file provider. Therefore the whole line is parameter 1, and you should search on a substring instead of using different parameters.

    HTH

    Greetz,

    Arie de Haan
    MVP SCOM
    This posting is provided "AS IS" with no guarantees, warranties, rights etc.
  • Friday, June 05, 2009 2:12 AM
     
     
    Umm, Thx for the replies, but the link seems like Chinese to me. Sorry to be so thick.

    So, under the Repeated Event Expression, what should I put in the
    Parameter Name field
    Operator Field
    Value field
    if I want to search for "Failed to establish a connection to SQL Server"?
    Thx,
    John Bradshaw
  • Friday, June 05, 2009 2:18 AM
    Moderator
     
     Answered
    Parameter Name field: Params/Param[1]
    Operator: Contains
    Value: Failed to establish a connection to SQL Server
    Directory: C:\LogFilesDir\
    Pattern: XYZ.*
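
A rough Python sketch of what these settings mean, for anyone trying to picture it. This is only a model, not how Operations Manager is implemented: the function names are made up for illustration, the case-sensitive comparison is an assumption, and `fnmatchcase` is used as a stand-in for the Pattern wildcard, which may not behave identically.

```python
from fnmatch import fnmatchcase

# Sample line copied from the original question.
LINE = ("05/25 13:23:08.13 INFPWAA0 sqlagt:select2 3480 ERROR sqlclass.c 668 "
        "Failed to establish a connection to SQL Server. ReasonUnknown Error")
VALUE = "Failed to establish a connection to SQL Server"


def line_matches(line: str, value: str) -> bool:
    """Model of 'Params/Param[1] Contains value'.

    The generic log provider hands over the whole line as parameter 1,
    so the check is a plain substring search on the full line.
    (Case-sensitive here; that is an assumption.)
    """
    param_1 = line.rstrip("\r\n")
    return value in param_1


def file_matches(name: str, pattern: str = "XYZ.*") -> bool:
    """Model of the Pattern field, approximated with a shell-style wildcard."""
    return fnmatchcase(name, pattern)


if __name__ == "__main__":
    print(line_matches(LINE, VALUE))   # substring search on the whole line
    print(LINE.split()[8])             # what "field 9" would be if the line were split on spaces
    print([n for n in ("XYZ.0", "XYZ.9", "ABC.1") if file_matches(n)])
```

The second print shows why a parameter number such as 9 does not help here: the line is not split into fields, and even if it were split on spaces, the ninth field would hold only the first word of the message.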

    http://OpsMgr.ru/
    • Marked As Answer by bradje Friday, June 05, 2009 3:45 AM
  • Friday, June 05, 2009 3:44 AM
     
     
    Fantastic...thankyou very much
    John Bradshaw
  • Friday, June 05, 2009 8:49 PM
    Moderator
     
     
    Hi,
     
    I did a lot of tests with Ops Mgr and log files; take a look at http://contoso.se/blog/?p=281
  • Tuesday, June 09, 2009 6:34 AM
     
     
    Thx again guys....I still cannot get the alerts working for this one.

    My files are called stdlog.1, stdlog.2 and stdlog.3 etc up to stdlog.9
    The text I want to alert on is     Failed to establish a connection to SQL Server

    So I have set up the Directory on a remote server that I know I have access to.
    The pattern is  stdlog.*
    The Repeated Event Expression is:
    Params/Param[1]    Contains    Failed to establish a connection to SQL Server

    For the Repeated Event Detection I have set Trigger on Count = 2, based on an interval of 5 min.

    Health:
    Manual Event Raised = Healthy
    Repeated Event Raised = Critical


    Then I open up a text file, copy the line (Failed to establish a connection to SQL Server) 2 or 3 times, and save the file each time I add a copy of the text.
    No alert comes through.....

    Sorry to be so dense on this, but what am I doing wrong??
    Thankyou

    John Bradshaw
  • Tuesday, June 09, 2009 8:36 AM
    Moderator
     
     Answered
    Example (tested in my lab :)):

    <UnitMonitor ID="UIGeneratedMonitor8df552054ce74c7ba65107c38d45e56b" Accessibility="Public" Enabled="true" Target="Windows!Microsoft.Windows.Server.2003.Computer" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="ApplicationLog!System.ApplicationLog.GenericLog.RepeatedEventTimer2StateMonitorType" ConfirmDelivery="true">
      <Category>Custom</Category>
      <AlertSettings AlertMessage="UIGeneratedMonitor8df552054ce74c7ba65107c38d45e56b_AlertMessageResourceID">
        <AlertOnState>Error</AlertOnState>
        <AutoResolve>true</AutoResolve>
        <AlertPriority>Normal</AlertPriority>
        <AlertSeverity>Error</AlertSeverity>
        <AlertParameters>
          <AlertParameter1>MyTEXT</AlertParameter1>
        </AlertParameters>
      </AlertSettings>
      <OperationalStates>
        <OperationalState ID="UIGeneratedOpStateId4a29ce43dc4a443c96d7f53c9cdc63f1" MonitorTypeStateID="TimerEventRaised" HealthState="Success" />
        <OperationalState ID="UIGeneratedOpStateIde97a908116de496b8fc6034e967ed43c" MonitorTypeStateID="RepeatedEventRaised" HealthState="Error" />
      </OperationalStates>
      <Configuration>
        <RepeatedLogFileDirectory>c:\logDir</RepeatedLogFileDirectory>
        <RepeatedLogFilePattern>log.*</RepeatedLogFilePattern>
        <RepeatedLogIsUTF8>false</RepeatedLogIsUTF8>
        <RepeatedExpression>
          <RegExExpression>
            <ValueExpression>
              <XPathQuery Type="String">Params/Param[1]</XPathQuery>
            </ValueExpression>
            <Operator>ContainsSubstring</Operator>
            <Pattern>Test warning message</Pattern>
          </RegExExpression>
        </RepeatedExpression>
        <Consolidator>
          <ConsolidationProperties />
          <TimeControl>
            <WithinTimeSchedule>
              <Interval>120</Interval>
            </WithinTimeSchedule>
          </TimeControl>
          <CountingCondition>
            <Count>3</Count>
            <CountMode>OnNewItemTestOutputRestart_OnTimerRestart</CountMode>
          </CountingCondition>
        </Consolidator>
        <TimerWaitInSeconds>120</TimerWaitInSeconds>
      </Configuration>
    </UnitMonitor>

    http://OpsMgr.ru/
    • Marked As Answer by bradje Tuesday, June 09, 2009 8:32 PM
  • Tuesday, June 09, 2009 5:59 PM
     
    I wonder if the log provider doesn't like it when a file is opened in Notepad for adding events. (I know this was true in MOM 2005.) It could also be that since you're adding the line 2-3 times AT THE SAME TIME in Notepad, it's not seeing each one as a separate event.
    Try either running echo 3 times (with a delay between runs):
    echo "Failed to establish a connection to SQL Server" >> mylog 

    or open notepad, put it in once, file/save/exit then repeat the operation a couple more times with a delay of 10-20 sec in between.
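
If you would rather script the test than do it by hand, here is a minimal sketch of the same idea in Python: append the line once, close the file, wait, and repeat. The function name, the default delay, and the `stdlog.1` path in the usage line are just illustrations, not anything Operations Manager requires.

```python
import time
from pathlib import Path

MESSAGE = "Failed to establish a connection to SQL Server"


def append_test_lines(log_path, message=MESSAGE, repeats=3, delay_seconds=15.0):
    """Append `message` to the log `repeats` times, pausing between writes.

    The file is opened and closed for every single line, so each write
    reaches the disk as its own separate append.
    Returns the number of lines written.
    """
    path = Path(log_path)
    for i in range(repeats):
        with path.open("a", encoding="ascii") as handle:
            handle.write(message + "\n")
        if i < repeats - 1:
            time.sleep(delay_seconds)  # give the monitor time to see each line
    return repeats


if __name__ == "__main__":
    # Hypothetical path: point this at a test file in the monitored directory.
    append_test_lines(r"C:\LogFilesDir\stdlog.1")
```

Keep the total run time (repeats times delay) shorter than the interval configured on the monitor, otherwise the count will never be reached inside one window.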

    "Fear disturbs your concentration"
  • Tuesday, June 09, 2009 8:32 PM
     
     
    Thankyou so much...Working now!! :)
    John Bradshaw
  • Tuesday, June 09, 2009 8:59 PM
     
     
    Oh, and if anyone was wondering, Mike Eisenstein posted the following as an explanation for the various alerting triggers. It was very helpful:

    "if the monitor detects 20 errors in 15 minutes i want an alert to be raised"

    1. Trigger-on-timer is similar to the MOM 2005 consolidation rule. It will raise a single alert (monitor state change) for any number of events in the defined time period.

    2. The Trigger-on-count/Time Interval option will start the clock when you get the first event. If you get 19 more in the following 15 minutes it will change the state of the monitor and raise an alert. At the end of 15 minutes, the alarm with the repeat count will be raised. In this case, if you get an event at 1:12, 10 more events at 1:26, then 12 more at 1:28, no alert will be raised because the timer started at 1:12 and you didn't get 20 before 1:27.

    3. Trigger-on-count/Fixed-simple-recurring-schedule will arbitrarily start the 15 minute timer and look for 20 events. If you "synchronize" the simple schedule on the hour, you could look for 20 events between 2:00 and 2:15, 2:15-2:30, 2:30-2:45, etc.

    4. Trigger-on-count, Sliding is like "density" of events. If in any 15-minute window you get your 20 events, you'll get the state change. So for the example in (2) above, you would get an alert because between 1:26 and 1:28 (only two minutes), you got 22 events.
    http://www.eggheadcafe.com/conversation.aspx?messageid=31598576&threadid=31598574
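
To make the difference between options 2 and 4 concrete, here is a small Python model of the two counting modes, run against the 1:12 / 1:26 / 1:28 example from the explanation. It is a simplified sketch of the behaviour as described in the quote, not the actual consolidator logic, and all the function names are invented for illustration.

```python
from collections import deque


def minutes(hhmm: str) -> int:
    """Convert 'H:MM' to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def alerts_interval_from_first_event(times, count, window):
    """Option 2: the clock starts at the first event of each window."""
    start = None
    seen = 0
    for t in times:
        # An event arriving after the window has expired starts a new window.
        if start is None or t >= start + window:
            start, seen = t, 0
        seen += 1
        if seen >= count:
            return True
    return False


def alerts_sliding(times, count, window):
    """Option 4: alert if any `window`-minute span holds `count` events."""
    recent = deque()
    for t in times:
        recent.append(t)
        # Drop events that have fallen out of the window ending at t.
        while recent[0] <= t - window:
            recent.popleft()
        if len(recent) >= count:
            return True
    return False


# One event at 1:12, ten at 1:26, twelve at 1:28 (times must be sorted).
EVENTS = [minutes("1:12")] + [minutes("1:26")] * 10 + [minutes("1:28")] * 12

if __name__ == "__main__":
    print(alerts_interval_from_first_event(EVENTS, count=20, window=15))
    print(alerts_sliding(EVENTS, count=20, window=15))
```

In this model the first mode never reaches 20 inside a single window (11 events before 1:27, then 12 in the window that starts at 1:28), while the sliding mode sees 22 events between 1:26 and 1:28, which agrees with the outcome described in points 2 and 4.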

    JB