DPM 2010 Backup causes Virtual Cluster to Fail

  • Question

  • Right, you may have to bear with me a moment whilst I explain our configuration!

    We have two physical Windows 2008 R2 Hyper-V servers that are in a cluster, providing 4 highly available virtual machines, all also running Server 2008 R2. Two of these VMs are domain controllers, and two are member file servers, which have clustered file services and DHCP!

    Very roughly, it can be seen as follows:

     

    HYPER-V01 (Parent) <----Cluster----> HYPER-V02 (Parent)

        DC01 (Child)                                   DC02 (Child)

         FS01(Child) <--------Cluster--------> FS02 (Child)           

     

    We have a separate physical Server 2008 R2 which runs DPM. All of the VMs are located in a CSV on the Hyper-V parents, provided by iSCSI.

    Within DPM, we have 4 protection groups as follows:

    1 - Local DPM Server

    2 - Hyper-V Parent machines - Bare Metal

    3 - Hyper-V Child VMs - Child Partition Snapshots

    4 - File Server Cluster Shares (Cluster nodes located on VMs, FS01 and FS02)

     

    Now, every night the cluster located on the virtual nodes falls over during the DPM backup window. There are several errors in the Cluster Event logs, such as the following:

     - Cluster Network failed...

     - IP Address failed...

     - Cluster node FS01 removed from cluster membership...

     - Cluster node FS02 removed from cluster membership...

     - Quorum was lost...etc etc.

    Basically the whole cluster breaks twice nightly: once at around 6pm, and then again about an hour later. Each time the resources are lost and the cluster breaks, everything is re-established and comes back up within 2-3 minutes.
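
    One way to check that the outages really coincide with backup activity is to compare the cluster event timestamps against the DPM job start and end times. A minimal sketch in Python, assuming both lists have been copied out by hand from the Cluster Event log and the DPM console; the timestamps, the job names, and the helper `events_in_windows` are illustrative placeholders, not real log data:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"


def events_in_windows(events, windows):
    """Return (event_time, job_name) pairs for events inside any job window."""
    hits = []
    for event_time in events:
        for job_name, start, end in windows:
            if start <= event_time <= end:
                hits.append((event_time, job_name))
    return hits


# Hypothetical sample data: replace with times taken from your own logs.
cluster_events = [
    datetime.strptime(s, FMT)
    for s in ("2011-06-14 18:02", "2011-06-14 19:05", "2011-06-14 23:30")
]
backup_windows = [
    ("job A", datetime.strptime("2011-06-14 18:00", FMT),
     datetime.strptime("2011-06-14 18:20", FMT)),
    ("job B", datetime.strptime("2011-06-14 19:00", FMT),
     datetime.strptime("2011-06-14 19:15", FMT)),
]

for when, job in events_in_windows(cluster_events, backup_windows):
    print(when.strftime(FMT), "falls inside", job)
```

    If every cluster failure lands inside one particular job's window, that job is the first place to look.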

    As an additional note, our Dell MD3200i does not seem to be compatible with parallel hardware VSS backups, so we are using software VSS across the board, and only allow one to run at a time - I think that this is set up correctly.
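
    Whether only one backup really runs at a time can be sanity-checked from the job history by looking for jobs whose run times overlap. A rough sketch in Python, assuming the start and end times have been noted down manually; the job names, times, and the helper `overlapping_jobs` are hypothetical and do not come from DPM itself:

```python
from datetime import datetime
from itertools import combinations

FMT = "%Y-%m-%d %H:%M"


def overlapping_jobs(windows):
    """Return pairs of job names whose (start, end) windows overlap."""
    overlaps = []
    for (name_a, start_a, end_a), (name_b, start_b, end_b) in combinations(windows, 2):
        # Two intervals overlap when each one starts before the other ends.
        if start_a < end_b and start_b < end_a:
            overlaps.append((name_a, name_b))
    return overlaps


def window(name, start, end):
    """Build a (name, start, end) tuple from timestamp strings."""
    return (name, datetime.strptime(start, FMT), datetime.strptime(end, FMT))


# Hypothetical job history: replace with the times recorded for your own jobs.
job_windows = [
    window("job 2", "2011-06-14 19:00", "2011-06-14 19:30"),
    window("job 3", "2011-06-14 18:00", "2011-06-14 18:40"),
    window("job 4", "2011-06-14 18:30", "2011-06-14 19:00"),
]

for first, second in overlapping_jobs(job_windows):
    print(first, "overlaps", second)
```

    An empty result would suggest the jobs are running one after another as intended; any pair printed is worth checking against the failure times.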

    I am sure that there is something obvious that I have missed or configured incorrectly, but any help diagnosing this would be gratefully received! If any more information is needed (I'm sure it will be), please ask.

     

    Thanks,

    Charlie.

    Wednesday, June 15, 2011 2:42 PM

All replies

  • Do you also see the Cluster service stopping when the nodes fail over?  If so, this could be related to 2494162 - "The Cluster service stops unexpectedly on a Windows Server 2008 R2 failover cluster node when you perform multiple backup operations in parallel on a cluster shared volume" at http://support.microsoft.com/default.aspx?scid=kb;en-US;2494162.  There is also an article that lists all of the hotfixes recommended for failover clusters: 2545685 - "Recommended hotfixes and updates for Windows Server 2008 R2 SP1 Failover Clusters" at http://support.microsoft.com/default.aspx?scid=kb;en-US;2545685.

    Wednesday, June 15, 2011 9:21 PM
    Moderator
  • Thanks for the reply - the Hyper-V parents that back up the Child VMs located in the CSV seem to be behaving and I am not seeing any errors. It is the file server cluster nodes that I am having issues with - both nodes in the file server cluster are highly available virtual machines located in the CSV provided by the Hyper-V parents. I hope that makes sense, I know it is somewhat complicated!

    I can't seem to find much documentation at all on creating clusters using virtual machines, let alone when the VMs are themselves hosted on a Hyper-V cluster!

     

    Thursday, June 16, 2011 4:53 PM
  • Hi, does anyone have any suggestions, or any experience with using DPM 2010 to back up clustered resources where the cluster nodes are VMs that are themselves running on a Hyper-V cluster?

    Thanks,

    Charlie.

    Monday, June 20, 2011 1:07 PM
  • I am also having the same issue with DPM 2010 and a 2-host Hyper-V Cluster with a Dell MD3200i iSCSI SAN using software VSS (set up to only allow one backup at a time).

    Any advice would be appreciated.

    Monday, June 20, 2011 6:26 PM
  • Please make sure the cluster network configuration is optimized for redirected mode communications.


    2473194 - DPM 2010 Hyper-V protection: Configuring cluster networks for CSV redirected access
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;2473194

    Or see this blog post - both contain the same information.

    http://blogs.technet.com/b/dpm/archive/2010/12/09/system-center-data-protection-manager-2010-hyper-v-protection-configuring-cluster-networks-for-csv-redirected-access.aspx

     


    Also - as a best practice, these hotfixes may be useful if you are experiencing agent connectivity issues with Hyper-V guests:

    1. For Windows Server 2008 guests 

    The network connection of a running Hyper-V virtual machine may be lost under heavy outgoing network traffic on a computer that is running Windows Server 2008 R2 SP1
    http://support.microsoft.com/kb/2263829

    Consider the following scenario:
    • You install the Hyper-V role on a computer that is running Windows Server 2008 R2 Service Pack 1 (SP1).
    • You run a virtual machine on the computer.
    • You use a network adapter on the virtual machine to access a network.
    • You establish many concurrent network connections. Or, there is heavy outgoing network traffic.

    In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter may be disabled.

    2. For Windows Server 2003 guests:

    Network connection is lost on a Windows Server 2003-based Hyper-V VM
    http://support.microsoft.com/kb/981836

    Description: This problem only affects Windows Server 2003 virtual machines. After you apply the hotfix to the management operating system – you then need to upgrade the integration services for each of the Windows Server 2003 virtual machines.

    For more details see this blog:
    http://blogs.msdn.com/b/virtual_pc_guy/archive/2011/06/13/vm-network-connectivity-dropped-under-high-load.aspx


    Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, June 21, 2011 12:47 AM
    Moderator