Backups not running at scheduled times but manual consistency checks work

  • Question

  • I wasn't sure where to post this, as the problem affects everything, but we mostly back up Hyper-V. Nothing has backed up since 14th August, and no alerts appeared in the monitoring area during that period to indicate that backups weren't running.

    What I did see was a warning alert that the database had reached its 2GB size threshold. Even though there was enough disk space on the database server, it looks like DPM stopped trying to write to the database once this limit was reached. The DPM Alerts Windows event log had no events since the 17th, and the DPM Backup Events log had none since the 14th. As this is a configurable threshold, I increased it to 3GB and ran a consistency check on some VMs that hadn't backed up since 14th August. These backups worked, and new events were written to the DPM event logs as normal.

    I expected recovery points to be created on schedule after this, but although the consistency checks I start manually work, DPM doesn't appear to even attempt a backup at the times scheduled in the PG config.

    Wednesday, August 26, 2015 10:14 AM

Answers

  • Further investigation revealed that another application installed on the DPM server host had created a folder called 'Program' in the root of the C drive. This folder broke the launching of applications under Program Files: any path to an executable under Program Files that was not wrapped in quotes resolved first to C:\Program, which is not a valid Win32 executable. That is what caused our SQL Agent jobs to fail.

    Although wrapping the path to TriggerJob.exe in quotes in the SQL Agent jobs gave us a workaround, removing the C:\Program folder is obviously the proper fix.

    Tuesday, September 1, 2015 8:41 AM

All replies

  • Hi,

    DPM jobs are initiated by the SQL Server Agent, so make sure the agent is running. Check the Application event log for events from SQLServerAgent on the system hosting the DPM database.


    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.

    Wednesday, August 26, 2015 5:58 PM
    Moderator
  • I found the SQL Server Agent jobs consistently failing on step 1.

    The SQL Agent log has a lot of entries like the following:
    Message
    [136] Job 90870dc4-532f-4baf-b584-14d218d96ea2 reported: The process could not be created for step 1 of job 0x8E8FEACF379F1C4BA9F61E1C02F4A1C9 (reason: %1 is not a valid Win32 application)

    The agent is running the command:
    c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe <GUID here> <FQDN of backup server>

    I can't see any NTFS errors in the System event log. What might be causing this?

    Edit: I restored the TriggerJob.exe from backup from a couple days before the problem started and tried a failing job again but got the exact same issue.

    • Edited by SystemInsider Thursday, August 27, 2015 12:41 PM Edit
    Thursday, August 27, 2015 8:55 AM
  • Hi,

    If you open an administrative command prompt and run c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe <GUID here> <FQDN of backup server> manually, does it succeed?


    Thursday, August 27, 2015 6:25 PM
    Moderator
  • Don't know why I didn't think of trying that yesterday. Oddly, pasting the command into an admin command prompt just returned me to the prompt after a second without any output. I'm not sure what it was supposed to do or how to tell whether it worked, but I can't see any errors from that time. However, it only did this after I wrapped the full path to TriggerJob.exe in quotes (because of the spaces in the path); the job step did not have these quotes. The job worked after I made the same change in the job step.

    It appears, then, that all the SQL Agent jobs are failing because of this. Why have they changed? Is there any way to update them all, or am I going to have to make the same change to each one manually (there are dozens!)?

    Edit: For anyone with the same issue I updated the job steps by running the following:

    begin transaction
    update msdb.dbo.sysjobsteps -- SQL Agent job steps are stored in msdb
    set command = replace(command, 'c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe', '"c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe"')
    where command like '%TriggerJob%'
      and command not like '"%' -- skip steps whose path is already quoted
    commit transaction
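
A plain string replace wraps the path a second time if it is applied to a command that is already quoted, so it is worth checking before re-running an update like this. A minimal Python sketch of the same edit made idempotent; the helper name is hypothetical, and it assumes each job step command begins with the executable path, as in the command shown earlier in this thread:

```python
EXE = r"c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe"


def quote_executable(command: str, exe_path: str = EXE) -> str:
    """Wrap a leading, unquoted exe_path in double quotes; leave other commands alone."""
    if command.startswith('"'):
        return command  # already quoted, nothing to do
    if command.lower().startswith(exe_path.lower()):
        # Keep the original casing of the path and everything after it (the arguments).
        return f'"{command[:len(exe_path)]}"{command[len(exe_path):]}'
    return command  # some other command; do not touch it


if __name__ == "__main__":
    step = EXE + " <GUID here> <FQDN of backup server>"
    once = quote_executable(step)
    twice = quote_executable(once)
    print(once)
    print(once == twice)
```

Applying the helper twice gives the same result as applying it once, whereas repeating a bare replace would leave the path wrapped in two pairs of quotes.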

    Friday, August 28, 2015 8:36 AM
  • Hi,

    SQL Agent does not require quotes around the path to TriggerJob.exe the way a command prompt does. You should simply be able to run a SQL job by right-clicking it and selecting "Start Job at Step...", and then look at the running job in DPM.

    To associate a SQL job number with DPM, you can run the query below in SQL Server Management Studio.

    use DPMDB --change db name accordingly
    select 
          sche.ScheduleId as 'SQL agent Schedule Job Name', 
          sche.JobDefinitionId,
          prot.FriendlyName as 'Protection Group' ,
          case
                when jobd.Type = 'C9B259D2-6402-486D-8E36-C6C1ADAE0912' then 'Maintenance job that runs @ midnight'
                when jobd.Type = '3D859D8C-D0BB-4142-8696-C0D215203E0D' then 'Synchronization (file/volume) / Express Full (application)'
                when jobd.Type = '84021B5E-B4DC-9B27-2B7E-3B99BB1225FF' then 'Volume/Share/System State Recovery Point'
                when jobd.Type = '913afd2d-ed74-47bd-b7ea-d42055e5c2f1' then 'Backup to tape (D-T)'
                when jobd.Type = 'B5A3D25C-8EB2-4032-9428-C852DA5CE2C5' then 'Backup to tape (D-D-T)'
                when jobd.Type = 'C4CAE2F7-F068-4A37-914E-9F02991868DA' then 'Consistency Check'
                when jobd.Type = '5ECC82D0-3475-4E81-8ADD-55B1C1D23DB1' then 'SharePoint catalog generation'
                when jobd.Type = '6E7C76F4-A832-4418-A772-8E58FD7466CB' then 'Azure Online backup'
          end
            as Operation,
          jobd.Type as VerbID
    from tbl_SCH_ScheduleDefinition sche
    left join dbo.tbl_JM_JobDefinition jobd
    join tbl_IM_ProtectedGroup prot
    on jobd.ProtectedGroupId = prot.ProtectedGroupId
    on sche.JobDefinitionId = jobd.JobDefinitionId
    where sche.IsDeleted = '0' and jobd.ProtectedGroupId is not null
    order by prot.FriendlyName

     


    Friday, August 28, 2015 6:41 PM
    Moderator