DPM2012 H-V host backup using HW VSS writer

  • Question

  • Hi. I'm trying to get clarification on exactly how DPM2012 works when backing up VMs in a Hyper-V (2008 R2) cluster while using a SAN hardware VSS provider.

    In one of the other threads Mike Jacquet mentioned: "DPM will use its filter driver to track block level changes to .VHDs and only transfer changed blocks without having to read the entire VHD for a VM. However, Hyper-V clusters using CSV will still use the method used in DPM 2010 today."

    That's what we're currently observing with a SAN without a hardware VSS provider: DPM reads the WHOLE VHD of the VM and then transfers only the changed blocks. It's quite disappointing that this hasn't changed in DPM2012.

    Anyway, how does DPM proceed when a hardware VSS provider is used? DPM requests that the SAN create a snapshot of the CSV LUN where the VM to be backed up resides, and then what? Does it read the whole CSV LUN or the whole VHD of the VM from the SAN snapshot? Or does it actually read only the changed blocks in this case?

    Wednesday, March 14, 2012 12:27 PM

All replies

  • It works the same way: a snapshot is taken, mounted read-only, and CBT (Changed Block Tracking) is run against the previous and current LUN states. The only difference is that a hardware VSS snapshot is taken by the storage appliance, so it costs Windows very little to handle, whereas with a software snapshot the whole volume being snapshotted runs in a degraded-performance mode, because writes are tracked and Windows uses COW (Copy-on-Write) to protect data that is about to be overwritten. No, hardware VSS does not return a VHD or anything like it, since everything works at the block (LUN) layer rather than the file (NTFS) layer. All Hyper-V backup software works this way (some may offload CBT to the storage appliance, but those are details). So you really need a SAN with hardware VSS.
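
    As a rough illustration of comparing the previous and current states at block level, here is a toy model. It is not DPM's actual implementation; the block size, function names and sample data are all made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical tiny block size so the example is easy to follow


def block_digests(image: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return one SHA-256 digest per fixed-size block of the image."""
    return [
        hashlib.sha256(image[offset:offset + block_size]).digest()
        for offset in range(0, len(image), block_size)
    ]


def changed_blocks(previous: bytes, current: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indices of blocks whose content differs between the two images.

    Every block of `current` has to be read to build its digest,
    even if only one block ends up being transferred.
    """
    old = block_digests(previous, block_size)
    new = block_digests(current, block_size)
    changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
    # Blocks that exist only in the longer image count as changed too.
    changed.extend(range(min(len(old), len(new)), max(len(old), len(new))))
    return changed


previous_vhd = b"AAAABBBBCCCCDDDD"      # stand-in for the last replica
current_vhd = b"AAAAXXXXCCCCDDDDEEEE"   # stand-in for the mounted snapshot
print(changed_blocks(previous_vhd, current_vhd))
```

    The point of the sketch is that a compare-based approach only transfers the differing blocks, but still has to read the whole current image to find them.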

    Hope this clarification helped :)

    -nismo

    Thursday, March 15, 2012 11:32 AM
  • Thanks, that clarifies it a bit for me.

    Just to make sure I understand correctly: the SAN takes a snapshot of the LUN (the actual CSV is in Redirected I/O mode only for the duration of the snapshot creation, unlike with a SW VSS provider), which is then mounted and made available to DPM. DPM's CBT then reads the whole LUN and transfers the changed blocks?

    So say I have a 16TB LUN and the SAN makes a snapshot: will DPM then read the full 16TB of data, or only the changed blocks? Maybe I'm confused here because different SAN vendors implement different snapshot methods...

    Thursday, March 15, 2012 11:59 AM
  • DPM will read VHD content only. It does not care about unknown data stored on the CSV and its snapshot, or about unprotected VMs, so the LUN size does not matter.

    -nismo

    Thursday, March 15, 2012 9:41 PM
  • If the above suggested answer did not help in resolving the thread please re-open it.
     --------------------------------------------------------------------------------
     Regards, Michael V [MSFT] - This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, April 2, 2012 10:12 PM
    Moderator
  • I can't let this go ..

    Imagine a situation where there are, say, 30 or 50TB of VHDs (VMs) on CSV volumes, all of which we want to protect. How can DPM (probably 2012, with the increased limits) back up this amount of data (or more) if it really has to READ ALL of that data before it can write the changes (however small they might be)?

    30TB of data is nothing huge these days, and you would need a SUPER fast SAN to even read this amount of data in a sensible timeframe. Say the SAN can handle 500MB/s for sequential reads (which is nothing shabby): it would still take roughly 17 hours just to read the data...
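
    The read time for a given data size and throughput is easy to check. A quick sketch, assuming decimal units (1 TB = 1,000,000 MB) and a sustained sequential rate with no other overhead; the function name is just for illustration:

```python
def read_time_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to read data_tb terabytes at throughput_mb_s megabytes per second."""
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return data_mb / throughput_mb_s / 3600  # seconds -> hours


for size_tb in (30, 50, 100):
    print(f"{size_tb} TB at 500 MB/s: {read_time_hours(size_tb, 500):.1f} h")
```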

    What if there are 100TB or more of data? Is this a DPM-specific problem, or does it affect every backup product that can back up CSV (in its current incarnation)?


    • Edited by MarkosP Tuesday, April 17, 2012 10:38 AM
    Tuesday, April 17, 2012 10:30 AM
  • My understanding is that the filter driver can't be used to track changes in a CSV scenario because the host that owns and writes to a .VHD can change between backups (e.g. a server goes down, or you move a guest to a different machine in the cluster). The new host would then start tracking changes with its own filter driver, but it would not have the complete list of changes.

    To resolve this, the tracked changes would then have to be coalesced for all the hosts in the cluster in order to get the complete list of changed blocks.  Additionally, the filter drivers would have to store the change block list somewhere that could survive a complete host shutdown and unavailability during the next backup.
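
    The coalescing step described above could look roughly like this. It is only a sketch of the idea, not how DPM or any other product does it; the host names, VHD names and data layout are invented for the example.

```python
from collections import defaultdict

# Hypothetical per-host tracking data: host -> {vhd name -> set of changed block indices}.
HostChanges = dict[str, dict[str, set[int]]]


def coalesce(per_host: HostChanges) -> dict[str, list[int]]:
    """Merge the changed-block sets recorded by every host into one sorted list per VHD."""
    merged: dict[str, set[int]] = defaultdict(set)
    for vhd_changes in per_host.values():
        for vhd, blocks in vhd_changes.items():
            merged[vhd] |= blocks  # union, so blocks seen by several hosts appear once
    return {vhd: sorted(blocks) for vhd, blocks in merged.items()}


# vm1.vhd was written by hostA first and by hostB after a move.
tracked = {
    "hostA": {"vm1.vhd": {3, 7}, "vm2.vhd": {1}},
    "hostB": {"vm1.vhd": {7, 12}},
}
print(coalesce(tracked))
```

    This only works if each host's list is stored somewhere that is still readable when that host is down, which is the persistence requirement mentioned above.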

    If you read the technical information on the new Veeam product, it does exactly that. It tracks the changes for all .VHDs on each host and, I think, writes the changed-block list to the CSV or somewhere similar, so that even if a server goes down it can coalesce all the changed blocks since the last backup. Don't quote me on this though, I'm not 100% sure. Since Veeam does not support the Disk-to-Disk-to-Offsite model that DPM uses, I stopped researching it for now.

    It would be great if the DPM team could figure something like this out though.

    I'm also curious how they will handle Windows Server "8" with the replication feature as it seems it could have the same problem.  If a .VHD is hosted on server A and then is moved to server B but the replicated copy is used, how will DPM track changes made both before and after the move, if for example server A goes down entirely?

    Feel free to correct me if I'm wrong on any of the above, this is just my impression of this issue.

    Tuesday, April 17, 2012 2:27 PM