Synchronization on CSV causes issues.

  • Question

  • So I had some issues that caused all my backups to get into a bad state. To fix it I have to run a synchronization. However, when I do that, a node in my failover cluster just starts eating RAM. The first time I did it, it ate almost all of it, to the point of the node being nearly unresponsive. Cancelling the job frees up the RAM... These nodes have over 60GB of free RAM each and it will take every bit of it....

    *EDIT* this is only running one synchronization job at a time.

    One bit of side info: the first time I did it, it was the node the guest was on... I migrated the guest to another box and backed up again, but I believe it was still the first node whose RAM got used up.

    My backup box is currently off..

    Please help!

    *EDIT2*

    more info.

    DPM 2012 SP1.

    Server 2012 Hyper-V failover cluster. 3 nodes, 100GB RAM each.



    • Edited by JustusIV Wednesday, October 16, 2013 12:38 PM
    Wednesday, October 16, 2013 12:21 PM

Answers

  • Ah... I know your pain. There is a problem with Windows Server 2012 and Hyper-V backups; it's not DPM's fault.

    If you go here:

    http://blogs.technet.com/b/cedward/archive/2013/05/24/validating-hyper-v-2012-and-failover-clustering-2012-hotfixes-and-updates-with-powershell.aspx

    and run the script against the cluster, it will tell you what hotfixes are missing.
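
To illustrate the kind of check such a script performs, here is a minimal Python sketch. It is not the linked PowerShell script; it only shows the idea of comparing a list of required hotfixes against what each node reports as installed. The KB IDs, node names, and the `missing_hotfixes` helper are hypothetical placeholders.

```python
def missing_hotfixes(required, installed):
    """Return the required KB IDs that are not installed, in their original order."""
    # Normalize case and whitespace so "kb0000003" matches "KB0000003".
    installed_normalized = {kb.strip().upper() for kb in installed}
    return [kb for kb in required if kb.strip().upper() not in installed_normalized]


# Hypothetical placeholder data: real KB IDs would come from the hotfix list
# you are validating against, and the installed lists from each cluster node.
required = ["KB0000001", "KB0000002", "KB0000003"]
installed_per_node = {
    "node1": ["KB0000001", "kb0000003"],
    "node2": ["KB0000001", "KB0000002", "KB0000003"],
}

report = {
    node: missing_hotfixes(required, kbs)
    for node, kbs in installed_per_node.items()
}

for node, missing in report.items():
    print(node, "missing:", ", ".join(missing) if missing else "none")
```

Running the comparison per node matters here because every node in the cluster needs the same hotfix level, not just the one currently hosting the guest.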

    When I ran into this problem before, I rebuilt the cluster, one node at a time, in this order:

    1. Create a VM to be the gold image
    2. Download all the hotfixes needed from the MS site (takes a while)
    3. Install Windows Server 2012 RTM (whichever edition you need, I presume DC) with no updates
    4. Use PkgMgr to force the hotfixes into the OS (http://technet.microsoft.com/en-us/library/cc749465(v=WS.10).aspx); the command line could be pkgmgr.exe /ip /m:<path to .cab>
    5. Once all the hotfixes (not Windows Updates) are in, make a copy of the machine (VHDX)
    6. Install all available Windows Updates (with the exception of the latest Update Rollup unless it is known to be OK)
    7. Sysprep the machine
    8. Use VMM to deploy the VHDX file to each host machine (booting the hosts from local VHDX), one by one, slowly rebuilding the cluster.
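
Step 4 means one pkgmgr call per hotfix, which gets tedious with a long list. The following Python sketch only builds the command lines, using the `pkgmgr.exe /ip /m:<path to .cab>` form quoted in the step; it does not run anything. The folder and file names are hypothetical, and the sketch assumes paths without spaces, since quoting rules for pkgmgr are not covered here.

```python
from pathlib import PureWindowsPath


def pkgmgr_commands(paths):
    """Build one pkgmgr command line per .cab file, sorted by path."""
    commands = []
    for path in sorted(paths, key=str.lower):
        # Skip anything that is not a .cab package.
        if PureWindowsPath(path).suffix.lower() != ".cab":
            continue
        # Command form taken from step 4; assumes the path has no spaces.
        commands.append(f"pkgmgr.exe /ip /m:{path}")
    return commands


# Hypothetical file names, for illustration only.
downloaded = [
    r"C:\Hotfixes\b.cab",
    r"C:\Hotfixes\a.cab",
    r"C:\Hotfixes\readme.txt",
]

commands = pkgmgr_commands(downloaded)
for command in commands:
    print(command)
```

The printed lines could be reviewed and pasted into a batch file on the gold image, which keeps the install order visible before anything is applied.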

    It worked for me doing it in this order but it takes time.

    • Marked as answer by JustusIV Tuesday, October 22, 2013 1:03 PM
    Thursday, October 17, 2013 10:05 AM

All replies

  • Ok so a bit more info after digging a bit more......

    I used Sysinternals RAMMap and figured out that it is a "Mapped File" taking all the RAM: the VHD that the consistency check is running against. So the RAM usage is the size of the VHD. The problem is that it starves the Hyper-V cluster node of RAM.

    Wednesday, October 16, 2013 3:28 PM
  • I wanted to say thank you... We had a major failover cluster failure on Tuesday due to this patch

    http://support.microsoft.com/kb/2869923

    We now know why and can fix it so it doesn't happen again.

    I haven't gotten to my other issue yet... the crazy RAM usage of DPM in the form of a "mapped file", but this explains the last part of a bad chain of events that happened to me.

    Friday, October 18, 2013 3:50 PM
  • Is the RAM issue on the DPM or the Hyper-V node? It's usually the Hyper-V node with this.

    Don't forget to mark as answer or something...

    Thanks

    Friday, October 18, 2013 3:56 PM
  • It's on the Hyper-V node.

    I will leave this unanswered for the time being as I have not actually got a solution for my original issue.

    One of the hotfixes was for another major failure we had. I am actively working on it and will keep everyone posted.

    Monday, October 21, 2013 5:11 PM
  • Hi, the RAM eating is a known problem. Microsoft released a hotfix to fix the issue.
    Monday, October 21, 2013 7:31 PM
  • Just updating as promised.... One of the patches from this location must have fixed it.

    http://support.microsoft.com/kb/2784261

    Thank you for pointing me in the right direction.

    Tuesday, October 22, 2013 1:03 PM
  • No problem
    Thursday, October 24, 2013 11:23 AM