DPM 2012 R2 Roll Up 2: Agent Terminating or Checksum mismatch during backups

  • Question

  • Hi There,

    I have a dual DPM server deployment. The Primary DPM server backs up to disk in my production site, and a Secondary DPM server creates replicas to disk and tape in my DR site. Both DPM servers are running DPM 2012 R2. I recently installed DPM 2012 R2 Roll Up 2 (I had been running Roll Up 1 for weeks without issue), taking care not to install the bugged version of the update. Everything ran normally, and then about 4 days later some of my resources suddenly started failing and became inconsistent on my Secondary DPM server.

    Either:

    1. The DPMRA service on my Secondary DPM server terminates with the following Application Log error:

    Faulting application name: DPMRA.exe, version: 4.2.1235.0, time stamp: 0x536a4419
    Faulting module name: KERNELBASE.dll, version: 6.2.9200.16864, time stamp: 0x531d34d8
    Exception code: 0x80004005
    Fault offset: 0x0000000000047b8c
    Faulting process id: 0x1f18
    Faulting application start time: 0x01cf8a0713e8c930
    Faulting application path: C:\Program Files\Microsoft System Center 2012 R2\DPM\DPM\bin\DPMRA.exe
    Faulting module path: C:\Windows\system32\KERNELBASE.dll
    Report Id: 19d99bee-f5fd-11e3-93fe-782bcb479b76
    Faulting package full name: 
    Faulting package-relative application ID: 

    OR

    2. DPM has detected a checksum mismatch during data transfer over the network (ID: 30087)

    I have already removed the failing resources on the Secondary DPM server, deleted them from disk, and re-added them to the protection groups, but again either the agent terminates or the job fails with the checksum error during replica creation or consistency check. Note that the Primary DPM server is happy: all the resources are green and backing up successfully numerous times a day.

    I suspect that the update installed 4 days before the issues started might have caused this. Has anyone seen these issues, or does anyone know of potential ways to fix them?

    Your help would be greatly appreciated as I really don't want to rebuild the Secondary DPM server.

    Thanks

    Jacques

    Tuesday, June 17, 2014 2:16 PM

Answers

  • Turned out to be the ISP.

    One of the fibres between their POPs was faulty and was introducing CRC errors on the connection. DPM does not like a dirty link. It took me 3 weeks to convince the ISP; no issues since they replaced the fibre.

    • Marked as answer by Jacques v Zyl Tuesday, June 2, 2015 6:40 AM
    Tuesday, June 2, 2015 6:40 AM
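
One plausible reason a dirty link can surface as DPM error 30087 instead of being quietly repaired by TCP retransmission: the 16-bit checksum that TCP uses is weak, so some corrupted segments can still pass it and only a stronger, higher-level check notices the damage. The Python sketch below illustrates that general point only. The payload and the two flipped bits are made up for illustration, and CRC32 is just a stand-in for a stronger check; it is not a claim about which algorithm DPM actually uses.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum (RFC 1071), the kind TCP/UDP/IP use."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical payload, not real DPM data.
original = b"DPM replica block 0001"

# Hypothetical corruption: the same bit position flips in two different
# 16-bit words, once from 0 to 1 and once from 1 to 0, so the word sum
# is unchanged.
damaged = bytearray(original)
damaged[0] ^= 0x01  # 'D' (0x44) -> 'E' (0x45)
damaged[2] ^= 0x01  # 'M' (0x4D) -> 'L' (0x4C)
corrupted = bytes(damaged)

print("payload changed:       ", original != corrupted)
print("TCP-style checksum same:", internet_checksum(original) == internet_checksum(corrupted))
print("CRC32 same:             ", zlib.crc32(original) == zlib.crc32(corrupted))
```

The weak checksum treats both payloads as identical, while CRC32 tells them apart. That would also be consistent with the IPsec workaround reported in the replies, since ESP with authentication carries its own integrity check and a damaged packet is dropped and retransmitted before the application sees it.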

All replies

  • Hi,

    Here is an update. I have managed to get a workaround.

    If I enforce IPsec with ESP between my primary and secondary DPM servers, the agent stops terminating and no longer reports checksum errors.

    I am still waiting on someone from the DPM team to give a suggestion here, as it seems to be a bug that started after I installed Roll Up 2 for DPM 2012 R2.

    Thanks

    Jacques

    Monday, June 30, 2014 2:17 PM
  • Hi Jacques

    Did the DPM team come back to you?

    Thanks

    Friday, July 25, 2014 12:21 PM