Remote DPM monitoring possibilities

  • Question

  • Hello.

    We have a small DPM 2012 SP1 installation (single server, 8 agents). Since our environment is heavily heterogeneous, we use Nagios for alerting (though the core question is really technology-agnostic). We would like to monitor DPM status remotely, but I can't find a decent way to do so. The only approach I can think of (not a very good one) is to install an SSH daemon on the Windows server and use it to run a PowerShell script (by the way, is there a simple way to return the status shown as "Data Source Health" in the "Protection" tab?), but that's not a very comfortable solution. Is there a better one?


    For the sake of precision - I want to be able to monitor DPM health from a computer _not running Windows_.
    • Edited by Mr. Invalid Tuesday, June 17, 2014 1:25 PM added info
    Tuesday, June 17, 2014 1:16 PM

All replies

  • We use OpsView, which is based on Nagios. We run a PowerShell script on the DPM server that checks alerts. Here is an example of how to set it up. Our OpsView server runs CentOS. The setup only requires some basic changes in Nagios. We use similar scripts to check the health of the DPM pool and other aspects. PowerShell is your friend here.

    In Nagios, create a service check such as "check_nrpe -t 30 -H $HOSTADDRESS$ -c check_dpm".
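
    In plain Nagios, that check could be defined roughly as follows. This is only a sketch: the command name check_nrpe_dpm, the host name dpm01 and the generic-service template are placeholders rather than anything from our setup, and $USER1$ is assumed to point at the plugin directory, as it does in the stock sample configuration.

    ```cfg
    # Hypothetical command wrapping check_nrpe with a 30-second timeout
    define command {
        command_name    check_nrpe_dpm
        command_line    $USER1$/check_nrpe -t 30 -H $HOSTADDRESS$ -c check_dpm
    }

    # Hypothetical service attached to a host called dpm01
    define service {
        use                     generic-service
        host_name               dpm01
        service_description     DPM Alerts
        check_command           check_nrpe_dpm
    }
    ```

    The 30-second timeout matters because the script on the DPM server takes a while to collect alerts before it answers.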

    The check requires the NRPE agent to be installed on the DPM server, and you will need to enable the CheckExternalScripts.dll module. This is done by uncommenting that DLL in the nsc.ini file for OpsView.

    Add an "[External Scripts]" section to your NRPE configuration, such as the following.

    [External Scripts]
    check_dpm=cmd /c echo scripts\check_dpm.ps1; exit($lastexitcode) | powershell.exe -command -
    

    Add a file named check_dpm.ps1 to the scripts subdirectory. Our script just looks for errors and returns the appropriate response to Nagios.

    # Load the DPM cmdlets and connect to the local DPM server
    Import-Module DataProtectionManager
    $dpmserver = Connect-DPMServer (& hostname) -WarningAction SilentlyContinue
    # Refresh the alert list and wait 15 seconds for it to populate
    $dpmserver.AlertController.RefreshAlerts()
    Wait-Event -Timeout 15
    $alerts = @($dpmserver.AlertController.ActiveAlerts.Values)
    # Count active alerts by severity
    $warnings = ($alerts | Where-Object { $_.Severity -eq "Warning" } | Measure-Object).Count
    $errors = ($alerts | Where-Object { $_.Severity -eq "Error" } | Measure-Object).Count
    # Map the counts to Nagios exit codes: 2 = critical, 1 = warning, 0 = OK
    if ($errors -ne 0) {
        $status = "Error - One or more error conditions exist and backups need to be checked."
        $returnCode = 2
    }
    elseif ($warnings -ne 0) {
        $status = "Warning - One or more warning conditions exist. Log onto DPM server to check the status."
        $returnCode = 1
    }
    else {
        $status = "OK"
        $returnCode = 0
    }
    # The status line is what Nagios displays; the exit code sets the service state
    Write-Host $status
    exit $returnCode


    Rob

    • Proposed as answer by ip-rob Tuesday, June 17, 2014 5:16 PM
    Tuesday, June 17, 2014 5:15 PM
  • It seems I forgot the login details I used when creating my profile here, so I can't respond from the account I used before. :-/

    Anyway, as I left work yesterday, I remembered NSClient++, which we're already using, so the NRPE suggestion is a perfectly sane solution. Thank you for the script. It seems to do what I needed.

    Wednesday, June 18, 2014 8:42 AM
  • Thanks for the script. Works on DPM 2012 R2.
    Monday, August 24, 2015 9:36 AM
  • Thanks for the script!

    That is very useful.
    This week I had a problem where no recovery points were being created, yet DPM reported no error.

    I'm currently thinking about how to monitor this based on the time elapsed since the last recovery point was created.
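
    A rough, untested sketch of such a check, in the same style as the script above. The 26-hour threshold ($maxAgeHours) is an arbitrary example value, and I am assuming that recovery points expose their timestamp as RepresentedPointInTime, so please verify that (and whether it is local time or UTC) on your DPM version before relying on it.

    ```powershell
    # Sketch only: flag datasources whose newest recovery point is too old.
    Import-Module DataProtectionManager

    # Hypothetical threshold; adjust it to your backup schedule
    $maxAgeHours = 26
    $cutoff = (Get-Date).AddHours(-$maxAgeHours)
    $stale = @()

    foreach ($pg in Get-DPMProtectionGroup -DPMServerName $env:COMPUTERNAME) {
        foreach ($ds in Get-DPMDatasource -ProtectionGroup $pg) {
            # Newest recovery point for this datasource, if any exist
            # (RepresentedPointInTime is an assumed property name)
            $latest = Get-DPMRecoveryPoint -Datasource $ds |
                Sort-Object RepresentedPointInTime -Descending |
                Select-Object -First 1
            if (-not $latest -or $latest.RepresentedPointInTime -lt $cutoff) {
                $stale += $ds.Name
            }
        }
    }

    # Report in Nagios terms: exit code 2 = critical, 0 = OK
    if ($stale.Count -gt 0) {
        Write-Host ("CRITICAL - no recent recovery point for: " + ($stale -join ", "))
        exit 2
    }
    Write-Host "OK - all datasources have a recovery point newer than $maxAgeHours hours"
    exit 0
    ```

    It could be hooked into NRPE the same way as check_dpm.ps1, as a second entry under "[External Scripts]".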



    Jesse

    Note: Posts are provided without warranty.



    • Edited by Jesse Reppin Wednesday, December 23, 2015 8:13 AM
    Wednesday, December 23, 2015 8:13 AM