DPM 2016 - Tape backups fail while synchronisations are happening MBS

  • Question

  • DPM 2016 with MBS regularly fails tape backup jobs due to clashes with synchronisation jobs. This seems to be new behaviour in Modern Backup Storage, as we never had this issue with VSS.

    Is anyone else seeing this? Does anyone (Mike) have any solutions or workarounds?

    Monday, February 6, 2017 10:23 AM

All replies

  • Hello

    We have reinstalled our backup-server (DPM 2012 R2, Windows Server 2012 R2) with Windows Server 2016 and DPM 2016 (5.0.247.0).

    All protection groups are working fine when backing up to disk. When we add tape backup to the protection groups, we often get the error "Job failed on replica of <Servername> because of ongoing synchronisation". We get it on a simple, physical file server where we protect the D:\ drive.

    Before upgrading to DPM 2016 everything worked great.

    Synchronisation frequency: 15 minutes

    Recovery points files: 07:00 / 12:30 / 18:00 Everyday

    Express Full Backup: 01:00 Everyday

    Tape:

    Every 1 Days: 03:00, except on Sun, Mon, Sat

    Every 1 Weeks: Sat 03:00

    Every 1 Months: Last Sat 03:00

    Any ideas? Thanks for your help!

    Greetings from Switzerland

    Andreas

    Tuesday, February 7, 2017 7:52 AM
  • Hi,

    I have not heard of this issue before, but two different instances have now been reported, so it's worth looking into. Can you tell me how often the failure occurs, for example 1 out of 10 scheduled tape backups?


    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.

    Tuesday, February 7, 2017 7:10 PM
    Moderator
  • We use a Virtual Tape Library (Firestreamer) to do offsite backups once a month. The failure usually happens on one of our file server data sources, in 1 out of 10 protection groups.

    This is also the same data source that regularly fails to back up because of an error that the VHD cannot be mounted (ID 40002). So I'm not sure if this is a separate issue or just an extension of that one.

    Tuesday, February 7, 2017 11:26 PM
  • Hi,

    DPM 2016 Update Rollup 2 (UR2) was released today and should include the fix for the "VHD cannot be mounted" (ID 40002) issue.

    Download the update from the Download Center and run it manually (customers have reported issues installing it from Windows Update).

    Update Rollup 2 for System Center 2016 Data Protection Manager

    See if that also helps with the synchronization error.


    Regards, Mike J. [MSFT]

    Wednesday, February 8, 2017 12:28 AM
    Moderator
  • Thanks for your replies!

    We are getting this error on 9 out of 10 scheduled tape backups. It rarely completes without the error.

    We will install the update rollup and see if it fixes the error.

    We will post our result.

    Best regards

    Andreas

    Wednesday, February 8, 2017 8:43 AM
  • Mike, thanks for your replies.

    It's been about 2-3 tape jobs failing per day, out of around 68 jobs. As Christian says, it's mainly been file server data sources; however, I have seen it happen to SQL and Exchange data sources as well.

    I suspect you're right re the VHD mounting issue.  We've just installed UR2 and synchronisation times have instantly dropped from 20-30 mins on one of our 6TB file data sources to 2 min!

    Will update in 24 hours. 


    • Edited by DJL Wednesday, February 8, 2017 9:12 AM
    Wednesday, February 8, 2017 9:09 AM
  • Hi Mike,

    We've just had the same problem again.

    We sync this data source every 30 minutes, so it's not possible to schedule the tape backup so that it never starts at the same time as a sync.

    Waiting for the sync to finish and restarting the tape backup works.
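    The manual "wait for the sync, then restart the tape job" routine can be sketched as a polling loop. This is only an illustration: `is_sync_running` and `start_tape_job` are hypothetical callables supplied by the caller, not DPM APIs, and the poll interval and retry limit are arbitrary assumptions.

```python
import time


def run_when_idle(is_sync_running, start_tape_job,
                  poll_seconds=60, max_polls=30, sleep=time.sleep):
    """Start the tape job once no sync is reported as running.

    is_sync_running and start_tape_job are hypothetical callables provided
    by the caller; nothing here talks to DPM itself.
    Returns True if the job was started, False if we gave up waiting.
    """
    for _ in range(max_polls):
        if not is_sync_running():
            start_tape_job()
            return True
        # A sync is still in progress: wait and check again.
        sleep(poll_seconds)
    return False
```

    A sync could still begin between the check and the job start, so a loop like this narrows the window rather than closing it.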

    Should we open a support case?

    Wednesday, February 8, 2017 10:19 AM
    Hi,

    Let's try offsetting the synchronization start time by 10 minutes. Click the Optimize button for that protection group and set the minutes to 10. Synchronization jobs will then start 10 minutes later, which allows the tape job to start at the top of the hour. This will hopefully be a good workaround.
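    A rough way to see when the offset helps is to check whether a sync would be running at the minute the tape job starts. The sketch below assumes syncs start on a fixed interval that divides the day evenly and run for a known time; the function name and the sync durations are illustrative assumptions, not values DPM reports.

```python
def tape_start_clashes(tape_start_min, interval_min, offset_min, sync_duration_min):
    """Return True if a sync is assumed to be running when the tape job starts.

    All times are minutes since midnight. Syncs are assumed to start at
    offset_min, offset_min + interval_min, ... and to run for
    sync_duration_min each (an assumption, real durations vary).
    """
    # Minutes elapsed since the most recent sync start.
    elapsed = (tape_start_min - offset_min) % interval_min
    return elapsed < sync_duration_min


tape_at_0300 = 3 * 60

# 15-minute syncs with no offset: a sync starts exactly at 03:00.
print(tape_start_clashes(tape_at_0300, 15, 0, 2))   # True
# Same schedule offset by 10 minutes: last sync began at 02:55 and is done.
print(tape_start_clashes(tape_at_0300, 15, 10, 2))  # False
# If that 02:55 sync needs 6 minutes, the offset no longer helps.
print(tape_start_clashes(tape_at_0300, 15, 10, 6))  # True
```

    Under these assumptions the offset only helps while a sync finishes before the next tape job start.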


    Regards, Mike J. [MSFT]

    Wednesday, February 8, 2017 3:25 PM
    Moderator
  • Hi Mike,

    Thanks for your replies.  Offsetting the syncs does help considerably, however it doesn't solve the problem completely.  

    We have for example a protection group consisting of 40 SQL databases.  All the tape backup jobs for those databases get queued at the same time.  The sync offset avoids the first job failing, but as DPM works through the jobs, we are still getting collisions between tape backups starting and synchronisations in progress - we then have to manually restart the jobs once the sync has finished.
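    The queueing effect described above can be illustrated with a small simulation. It is a simplification under stated assumptions: tape jobs run one after another, every data source syncs on the same schedule, and each job occupies its slot whether or not it clashes. The function name and all durations are hypothetical.

```python
def queued_job_clashes(first_start_min, job_durations_min,
                       interval_min, offset_min, sync_duration_min):
    """Return the indexes of queued tape jobs that start during a sync.

    Times are minutes since midnight. Jobs are assumed to run serially,
    and syncs are assumed to start every interval_min minutes (shifted by
    offset_min) and to last sync_duration_min minutes.
    """
    clashes = []
    start = first_start_min
    for index, duration in enumerate(job_durations_min):
        # Minutes elapsed since the most recent sync start.
        elapsed = (start - offset_min) % interval_min
        if elapsed < sync_duration_min:
            clashes.append(index)
        # Assumed: the next job starts when this slot ends.
        start += duration
    return clashes


# Six 5-minute tape jobs queued at 03:00, 15-minute syncs lasting 3 minutes.
print(queued_job_clashes(180, [5] * 6, 15, 0, 3))   # [0, 3]
print(queued_job_clashes(180, [5] * 6, 15, 10, 3))  # [2, 5]
```

    With the 10-minute offset the first job no longer clashes, but later jobs in the queue still start inside a sync window, which matches the behaviour reported here.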

    Is it worth us opening a case, or is this issue noted/being worked on?

    Thanks!

    Wednesday, February 22, 2017 6:36 PM
  • Hi,

    OK, thanks for the update. I understand. I'm not aware of any customers calling support about this issue, so I don't think a fix is being worked on. Let me ping the product group and see if they have any other ideas to unblock you.


    Regards, Mike J. [MSFT]


    Wednesday, February 22, 2017 9:34 PM
    Moderator
  • Hi,

    Same problem here.

    We use Windows Server 2016 and DPM 2016 with Update Rollup 2 (UR2).

    All protection groups are working fine when backing up to disk. With the tape backup, we get the error "Job failed on replica of <Servername> because of ongoing synchronisation" at least once a day, mostly on a different server each time.

    Before we changed to DPM 2016 everything worked fine.

    Synchronisation frequency: 15 minutes
    Recovery points files: 07:00 / 12:30 / 18:00 Everyday
    Offsetting the starting time of the syncs did not help  

    Express Full Backup:
    Protection group 1: 23:00 Everyday
    Protection group 2: 00:00 Everyday
    Protection group 3: 01:00 Everyday

    Tape backup:
    Every 1 Days: 03:00, except on Sun, Mon, Sat
    Every 1 Weeks: Sat 03:00
    Every 1 Months: Last Sat 03:00

    Hanspeter

    Thursday, February 23, 2017 7:34 AM
  • Hi,

    We have a private fix that addresses this issue if anybody is interested in helping us validate it.   You can open a support case and ask for the fix.  Since it's a code defect in the product you will not be charged for it.  Once we validate the fix we can include it in an upcoming rollup.


    Regards, Mike J. [MSFT]

    Tuesday, September 5, 2017 9:47 PM
    Moderator
  • Hello Mike,

    We are experiencing the same issue. Would it be possible for us to test this fix as well? Is there an ETA for the next DPM rollup?

    We migrated some of our DPM servers to MBS and, although the disk space saving is pretty much as promised, there is a downside in terms of performance and stability. I guess that will be fixed over time, so I'm really looking forward to the next rollup.

    When I run perfmon on the disk containing the MBS VHDX files, I see a lot of split I/O and a very high disk queue length, resulting in longer recovery point jobs and slow tape backup jobs.

    Thanks,

    Marc

    Tuesday, October 10, 2017 1:40 PM
  • Hi,

    We have validated the fix and it will be included in the upcoming DPM 2016 Update Rollup 4 (UR4) due out later this month.


    Regards, Mike J. [MSFT]

    Wednesday, October 11, 2017 3:12 PM
    Moderator