Parallel Jobs

  • Question

  • Hi,

    I have a problem with DPM: as the data grows, recovery point creation and tape backup are taking longer and longer.

    I changed Max Parallel Job to 3 so that jobs would run in parallel and finish sooner, but they do not run in parallel on my 2-node cluster. When I change it back to 1, DPM throws the error "DPM could not find a valid recovery point on disk." I cannot enforce a job sequence either: DPM does not reliably follow the order defined in the DataSourceGroups.xml file. Is there a way to make this better and faster? Where am I going wrong with parallel jobs?

    I have already gone through a few sites, including this one:

    https://social.technet.microsoft.com/Forums/en-US/c40fa9f4-0162-410b-bb4b-ac4e3c576746/clarification-please-on-what-dpm-could-not-find-a-valid-recovery-point-means?forum=dpmtapebackuprecovery

    • Edited by Mo Yusuf Thursday, October 16, 2014 10:32 AM
    Thursday, October 16, 2014 9:30 AM

All replies

  • Hi,

    It really sounds like you can benefit by upgrading to DPM 2012 R2 + UR3 where you can get true three parallel backups on both nodes simultaneously with very little impact to the cluster.  This would require the Hyper-V cluster to also be running on Windows 2012 R2.

    If you cannot upgrade, the best you can do to get disk-based recovery points created in time between tape backups is to reduce the frequency of the tape backups.


    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.


    Thursday, October 16, 2014 6:35 PM
    Moderator
  • Hi Mike, that is not a solution for the existing product. Going to R2 would require reinstalling everything from scratch and restoring the backup. Is there no other option?
    Thursday, October 16, 2014 7:54 PM
  • Hi,

    Let's get some more background information. Please answer the following questions.

    Windows versions of Hyper-V nodes and DPM Server.
    DPM Version to include last Update rollup installed.
    SQL Server version for DPMDB - local or remote ?
    Storage used for Hyper-V CSV disks. (SAN ?  if so vendor)
    Are you using hardware VSS Provider ?

     


    Regards, Mike J. [MSFT]

    Thursday, October 16, 2014 8:11 PM
    Moderator
  • As requested.

    Windows versions of Hyper-V nodes and DPM Server: Server 2008 R2 & DPM 2012 SP1
    DPM version, including last Update Rollup installed: Update Rollup 7, 4.1.3453.0
    SQL Server version for DPMDB (local or remote?): SQL 2008 R2 SP2, 10.50.4033
    Storage used for Hyper-V CSV disks (SAN? if so, vendor): HP P2000
    Are you using a hardware VSS provider? On the DPM server only the software VSS provider is present; on the cluster nodes the following providers are listed:

    Provider name: 'Microsoft Software Shadow Copy provider 1.0'
       Provider type: System
       Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
       Version: 1.0.0.7

    Provider name: 'HP StorageWorks P2000/MSA2000 VSS Provider'
       Provider type: Hardware
       Provider Id: {bd04cbf9-212c-4553-9ea5-c5bfb05ccc8f}
       Version: 2.8.0.19

    DPM Server VSS

    Provider name: 'Microsoft Software Shadow Copy provider 1.0'
       Provider type: System
       Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
       Version: 1.0.0.7

    • Edited by Mo Yusuf Friday, October 17, 2014 8:46 AM
    Friday, October 17, 2014 8:29 AM
  • Hi,

    OK - since you have a hardware snapshot provider installed on your cluster nodes, then you do not need to use a DataSourceGroups.xml  to serialize the backups.

    To ensure parallel backups can be performed, please double-check the following.

    1) Make sure your hardware provider is installed by running: Vssadmin list providers – it should be listed as a provider.  You already confirmed this.
        Also confirm it's licensed properly so hardware snapshots can be taken and mounted.

    2) Turn off using the software provider by deleting this registry key on each node in the cluster: HKLM\Software\Microsoft\Microsoft Data Protection Manager\Agent\UseSystemSoftwareProvider
    3) Change the MaxAllowedParallelBackups from 1 back to 3 under the following key on the DPM server:  HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\2.0\Configuration\MaxAllowedParallelBackups\Microsoft Hyper-V
    4) Delete the DataSourceGroups.xml under the %programfiles%\Microsoft DPM\DPM\Config directory on the DPM server.
    5) Go through the Modify Protection Group wizard for every protection group containing Hyper-V guest backups. You don't need to change anything; just click through and complete the wizard.
    6) All future backups should then use the VSS hardware provider and up to three backups / node can take place in parallel.
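
    Step 1 can also be checked from a script. The following is a minimal sketch, not part of DPM: it parses text in the format printed by `vssadmin list providers` (the sample is the listing posted earlier in this thread) and reports which providers are of type Hardware. The function names `parse_vss_providers` and `hardware_providers` are hypothetical helpers introduced here for illustration.

```python
import re


def parse_vss_providers(text):
    """Parse `vssadmin list providers` style output into a list of dicts."""
    providers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Provider name: '(.*)'$", line)
        if match:
            # A new provider block starts at each "Provider name" line.
            current = {"name": match.group(1)}
            providers.append(current)
            continue
        if current is None or ":" not in line:
            continue
        # Remaining lines look like "Provider type: Hardware".
        key, _, value = line.partition(":")
        current[key.strip().lower()] = value.strip()
    return providers


def hardware_providers(providers):
    """Return the names of providers whose type is reported as Hardware."""
    return [p["name"] for p in providers if p.get("provider type") == "Hardware"]


# Sample taken from the cluster node listing earlier in the thread.
SAMPLE = """\
Provider name: 'Microsoft Software Shadow Copy provider 1.0'
   Provider type: System
   Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
   Version: 1.0.0.7

Provider name: 'HP StorageWorks P2000/MSA2000 VSS Provider'
   Provider type: Hardware
   Provider Id: {bd04cbf9-212c-4553-9ea5-c5bfb05ccc8f}
   Version: 2.8.0.19
"""

providers = parse_vss_providers(SAMPLE)
print(hardware_providers(providers))
```

    An empty result on a node would suggest that no hardware provider is registered there. The sketch only confirms registration; it says nothing about licensing or about which provider DPM actually uses for a given snapshot.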


    If you have not done so already, also add these registry settings as per http://blogs.technet.com/b/dpm/archive/2011/12/06/troubleshooting-hyper-v-protection-failures-due-to-networking-problems.aspx

    Add the following registry keys to increase the number of retries and decrease the time between retries for CSV ownership.

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent\CSV]
    "CsvMaxRetryAttempt"=dword:000000C8
    "CsvAttemptWaitTime"=dword:0002bf20

    CsvMaxRetryAttempt: Adjusts the maximum number of times (Default is 1) the DPM agent will attempt to claim the CSV volume. The value 0xC8 = 200 times.
    CsvAttemptWaitTime: Adjusts the amount of time in milliseconds to wait between retry attempts. The value 0x2bf20 = 3 minutes.
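
    As a quick sanity check of the hex values above, here is a small sketch that decodes them and renders the same two values as .reg file text. The helper names `reg_dword` and `render_reg` are made up for this example; the key path and value names are the ones quoted in this post.

```python
CSV_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft"
    r"\Microsoft Data Protection Manager\Agent\CSV"
)


def reg_dword(value):
    """Format an integer the way .reg files write a DWORD (8 hex digits)."""
    return f"dword:{value:08x}"


def render_reg(key, values):
    """Render a registry key and its DWORD values as .reg file text."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    lines += [f'"{name}"={reg_dword(number)}' for name, number in values.items()]
    return "\n".join(lines) + "\n"


retry_attempts = 0xC8   # CsvMaxRetryAttempt
wait_ms = 0x2BF20       # CsvAttemptWaitTime, in milliseconds

print(retry_attempts)       # 200
print(wait_ms / 60_000)     # 3.0 (minutes)
print(render_reg(CSV_KEY, {
    "CsvMaxRetryAttempt": retry_attempts,
    "CsvAttemptWaitTime": wait_ms,
}))
```

    The rendered text uses lowercase hex digits, which the registry treats the same as the uppercase form shown above.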

     

    If you are now getting three parallel backups per node, and they still don't finish in time for daily tape backups, then you need to add another node.


    Regards, Mike J. [MSFT]


    Saturday, October 18, 2014 5:14 PM
    Moderator
  • Where should I add these registry keys: on the DPM server or on both nodes? Thanks



    • Edited by Mo Yusuf Saturday, October 18, 2014 6:25 PM
    Saturday, October 18, 2014 6:21 PM
  • Hi,

    The DPMRA reads those and retries accordingly, so they are added on each node.


    Regards, Mike J. [MSFT]

    Saturday, October 18, 2014 10:49 PM
    Moderator
  • So to be clear: the retry registry keys need to be added on the nodes, not on the DPM server itself?
    Sunday, October 19, 2014 7:00 AM
  • Hi,

    Yes, these two registry values are added on the Hyper-V cluster nodes.

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent\CSV]
    "CsvMaxRetryAttempt"=dword:000000C8
    "CsvAttemptWaitTime"=dword:0002bf20


    Regards, Mike J. [MSFT]

    Sunday, October 19, 2014 7:35 PM
    Moderator
  • Hi Mike,

    I followed your instructions:

    1) Added the registry values:
    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent\CSV]
    "CsvMaxRetryAttempt"=dword:000000C8
    "CsvAttemptWaitTime"=dword:0002bf20
    2) Turned off the software provider by deleting this registry key on each node in the cluster: HKLM\Software\Microsoft\Microsoft Data Protection Manager\Agent\UseSystemSoftwareProvider
    3) Changed MaxAllowedParallelBackups from 1 back to 3 under the following key on the DPM server: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\2.0\Configuration\MaxAllowedParallelBackups\Microsoft Hyper-V
    4) Deleted the DataSourceGroups.xml under the %programfiles%\Microsoft DPM\DPM\Config directory on the DPM server.

    I have not fully tested on CSV yet, but file-based backup is running one job at a time.

    Monday, October 20, 2014 11:56 AM
  • Hi Mike,

    Here is the screenshot of the CSVs. It is weird that the tape backup finished without a recovery point backup.

    The recovery point backup starts at 18:30 and the tape backup at 04:00, but either I am confused or DPM is.

    All recovery point jobs started at the same time, yet one of the CSV recovery point jobs started at 06:34 and nothing was backed up for 2 hours. Please see the screenshot.
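
    For reference, the timing described above can be worked out as follows. This is a minimal sketch that only does clock arithmetic on the three times mentioned in this post; the calendar date is illustrative and `next_occurrence` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def next_occurrence(start, hhmm):
    """Return the first datetime at clock time hhmm strictly after start."""
    hour, minute = map(int, hhmm.split(":"))
    candidate = start.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= start:
        candidate += timedelta(days=1)
    return candidate


# Illustrative date; only the clock times come from the post.
recovery_start = datetime(2014, 10, 20, 18, 30)
tape_start = next_occurrence(recovery_start, "04:00")
late_job_start = next_occurrence(recovery_start, "06:34")

window = tape_start - recovery_start       # time available before tape backup
overrun = late_job_start - tape_start      # how far past the tape start the late job began

print(window)   # 9:30:00
print(overrun)  # 2:34:00
```

    In other words, the recovery point jobs have nine and a half hours before the tape backup begins, and the late CSV job started more than two and a half hours after that window had closed.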

    Tuesday, October 21, 2014 7:52 AM
  • Hi,

    It appears that the recovery point jobs are still being serialized - are you certain that:

    1) Hardware provider is being utilized for snapshots ?
    2) The protection group was modified after removing the DataSourceGroups.xml file ?
    3) MaxAllowedParallelBackups was set back to 3 on the DPM server ?


    Regards, Mike J. [MSFT]

    Tuesday, October 21, 2014 3:42 PM
    Moderator
  • I followed all your instructions, but after 16 hours no data had been transferred to disk, although the tape backup from the day before yesterday completed. I have now put everything back as it was, but I still have no luck getting the recovery point job to start. I had to create a new protection group and add the CSVs to replicate. It is late and I hope it backs up as expected. I will check tomorrow morning and update you.
    • Edited by Mo Yusuf Tuesday, October 21, 2014 7:50 PM
    Tuesday, October 21, 2014 7:44 PM