DPM2007 SP1 synchronization fails

  • Question

  • Hello everyone,

    our DPM server has been failing to synchronize a single volume on one of the file servers for about 3 weeks. The error has ID 3115 with Details ID 104: Insufficient Resources [...] (0x800705AA).

    I am able to execute a consistency check and afterwards create a manual restore point (without a synchronization), but the next synchronization fails again.

    The DPM Server is a DPM 2007 SP1 x64, build 2.0.8664.0 with the latest hotfix package (KB979970) installed on a Server 2003 Standard R2 x64 SP2 (with all available updates installed). This DPM protects a total of 6 servers and everything else works fine.

    The machine with the failing volume is a Server 2003 SBS SP2 x86 (with all available updates installed). In addition to the failing volume, there are two other protected volumes on that server, several protected SQL databases and system state protection (all of which are protected flawlessly). This machine has a dual-core CPU and 3 GB RAM, and shows no high CPU, disk or network load and no high RAM usage (usually about 1 GB free).

    The failing volume is by far the largest on that server (2.72 TB total, 1.05 TB used). DPM has allocated 1500 GB for the replica volume (about 1070 GB used) and 300 GB for the snapshot volume (about 45 GB used). The size of the journal on the protected server is 15 GB.

    I already checked for the VSS updates that are mentioned and recommended around the web, but the latest one I know of (KB940349) is already installed. Strangely enough, there are no error entries in the event log on the server when the sync failure occurs (just the VSS service start and stop entries).

    I also remember that this error already occurred over a year ago, and I fixed it at the time with the following registry entries:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management
    PagedPoolSize: Hex value 41ffffff
    PoolUsageMaximum: Decimal Value 60

    Any other ideas?

    Kind regards, Arne

    • Edited by Arne B_ Thursday, September 9, 2010 3:27 PM fixed typos
    Thursday, September 9, 2010 3:21 PM

Answers

  • Hi,

    The problem can occur on protected systems when there is not enough non-paged memory available to the kernel components, mainly the Dpm VolumeFilter, due to the limitations of the 32-bit architecture.

    On a 32 bit system, the maximum available non-paged pool memory is 256 MB irrespective of physical RAM size.
    This memory is shared by various kernel components including VolSnap and Dpm VolumeFilter.

    Dpm VolumeFilter requires 8 MB plus an additional contiguous 8 MB for every 1 terabyte of volume size.
    [For a 2 TB volume this becomes 16 MB + 8 MB, for a total of 24 MB of contiguous memory]
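
    As a rough illustration of the rule above, the following Python sketch computes the contiguous memory estimate for a given volume size. The function name and the round_up option are hypothetical; whether a partial terabyte counts as a full one is an assumption, since the rule is only stated per whole terabyte.

```python
import math

BASE_MB = 8      # fixed part quoted above
PER_TB_MB = 8    # additional contiguous MB per terabyte of volume size


def volume_filter_contiguous_mb(volume_tb, round_up=True):
    """Estimate the contiguous non-paged memory (in MB) for one volume.

    round_up=True treats a partial terabyte as a full one (an assumption);
    round_up=False applies the rule linearly to the fractional size.
    """
    tb = math.ceil(volume_tb) if round_up else volume_tb
    return BASE_MB + PER_TB_MB * tb


# The 2 TB example from the text, then the 2.72 TB volume from the question.
print(volume_filter_contiguous_mb(2))     # 24
print(volume_filter_contiguous_mb(2.72))  # 32 when partial terabytes are rounded up
```

    Under either reading, the 2.72 TB volume in the question needs a contiguous block of roughly 30 MB out of the 256 MB non-paged pool.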

    When the memory becomes fragmented (which easily happens after the system has been running for some time),
    the contiguous allocation of the required X MB fails, causing this error (0x800705aa).

    To confirm this, run poolmon (http://support.microsoft.com/kb/177415) at the start of the backup and see which components are consuming the non-paged memory.

    The workarounds/solutions include:


    1. Go with a 64-bit system. - This will solve the problem, but changing the system is a very hard requirement to meet.
    2. Go with smaller volumes. - This relaxes the contiguous memory requirement and reduces the frequency of failures.
    3. Check whether temporarily stopping SQL (to free up non-paged memory) allows a consistency check (CC) to succeed. If so, we could configure DPM to synchronize only right before the recovery point (RP), schedule that after working hours, and then use pre-backup and post-backup scripts to stop and restart SQL.
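
    For option 3, the stop/start logic of such a pair of scripts could look roughly like the Python sketch below. This is only an illustration: it assumes a Python interpreter is available on the protected server (a plain batch file calling net stop / net start would be more typical there), and the service name MSSQLSERVER is an assumption for a default SQL instance; named instances use a different service name. The function names are hypothetical.

```python
import subprocess

SQL_SERVICE = "MSSQLSERVER"  # assumption: default instance; adjust for named instances


def service_command(action, service=SQL_SERVICE):
    """Build the 'net' command line that stops or starts a Windows service."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    cmd = ["net", action, service]
    if action == "stop":
        cmd.append("/y")  # answer the prompt about stopping dependent services
    return cmd


def run_service_action(action, service=SQL_SERVICE, dry_run=True):
    """Return the command; execute it only when dry_run is False."""
    cmd = service_command(action, service)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises if 'net' reports a failure
    return cmd


# Dry run: show what the pre-backup and post-backup steps would execute.
print(run_service_action("stop"))
print(run_service_action("start"))
```

    The pre-backup script would call the stop action and the post-backup script the start action, with dry_run=False once the service name has been verified.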

    See "Using Pre-Backup and Post-Backup Scripts":
    http://technet.microsoft.com/en-us/library/bb808870.aspx

    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Friday, September 10, 2010 3:26 PM
    Moderator