Primary DPM 2012 R2 server crashes (reboots) after adding a secondary DPM server

  • Question

  • Hi there,

    My primary DPM server started crashing (rebooting) after I added a secondary DPM server. It happened from the very beginning: after attaching the agent of the primary server to the secondary and trying to browse the protected servers on the primary in order to create new (actually identical) protection groups on the secondary. After many crashes of the primary I eventually managed to create these protection groups on the secondary DPM server, but the crashes continue, at least once per day. Both servers run the same DPM version, 4.2.1603.0.

    Wednesday, April 18, 2018 2:24 PM

Answers

  • Hi, 

    Thanks for the update. Based on the error I see in the DPMRA logs, could you please check a few things:

    This could be a problem with DCOM permissions on the primary server, so please add the secondary DPM server's computer account to the following groups on the primary DPM server:

    DPMRADmTrustedMachines

    DPMRADCOMTrustedMachines

    Distributed COM Users

    After that, also make sure that no data source that was first protected by the primary server is now protected directly by the secondary DPM server, as this could cause inconsistencies in the DPM database.

    Most DPM crashes are caused by inconsistencies in the DPM database. If the problem persists, please open a ticket with Microsoft Support.
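
The DPMRA log posted in this thread reports HRESULT values such as 0x80070002 and 0x80070005. As a reading aid, the following is a minimal sketch of how such a value splits into severity, facility and code. The helper names `decode_hresult` and `describe` are hypothetical and are not part of DPM; the small name table only covers the two Win32 codes that appear in the log. Whether either code is related to the crash is not established by the thread.

```python
# Win32 error codes seen in the posted log; the names come from winerror.h.
WIN32_NAMES = {
    2: "ERROR_FILE_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
}
FACILITY_WIN32 = 7  # facility used by HRESULT_FROM_WIN32


def decode_hresult(value):
    """Split a 32-bit HRESULT into (is_failure, facility, code)."""
    value &= 0xFFFFFFFF
    is_failure = bool(value >> 31)      # top bit: severity
    facility = (value >> 16) & 0x7FF    # 11-bit facility field
    code = value & 0xFFFF               # low 16 bits: error code
    return is_failure, facility, code


def describe(value):
    """Return a short human-readable description (hypothetical helper)."""
    is_failure, facility, code = decode_hresult(value)
    if facility == FACILITY_WIN32:
        name = WIN32_NAMES.get(code, f"Win32 error {code}")
    else:
        name = f"facility {facility}, code {code}"
    status = "failure" if is_failure else "success"
    return f"0x{value & 0xFFFFFFFF:08X}: {status}, {name}"


for hr in (0x80070002, 0x80070005):
    print(describe(hr))
```

In the posted log, 0x80070005 accompanies the "Could not delete file" message, while 0x80070002 mostly accompanies lines that report configuration values and one failed registry lookup.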

    -----------------------------------------------------------------------

    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. This posting is provided "AS IS" with no warranties, and confers no rights.


    Friday, April 20, 2018 4:49 AM

All replies

  • Hi, could you please tell me whether it is the DPM console or DPMRA that is crashing? Also, please post any error or event that is generated on the primary DPM server while you add protection on the secondary DPM server.

    Thursday, April 19, 2018 3:47 AM
  • Hi Rahul,

    Thanks for the reply.

    Just to make this clear: I installed a secondary DPM server, opened the DPM console on the secondary, attached the agent of the primary (this went fine) and started creating protection groups on the secondary. Right after browsing these groups on the primary, the primary crashes, writes a minidump file and reboots. There is no error message in Event Viewer. I tried to analyze the minidump file with WinDbg and got this:

    Probably caused by : ntkrnlmp.exe ( nt!PspCatchCriticalBreak+a4 )

    I strongly suspect that DPMRA.exe is involved in this. 

    Regards,


    Thursday, April 19, 2018 8:08 AM
  • Hi Alex,

    Please analyse the DPMRA.exe logs on the primary server while reproducing the issue.

    Post the logs here and I'll have a look, so we can see whether DPMRA.exe is crashing or not. I have seen a few cases of crashes when the primary DPM server is enumerated from the secondary DPM server. I would also like to know whether this is affecting your backups on the primary DPM server.

    Known issues with Secondary DPM:

    https://blogs.technet.microsoft.com/dpm/2008/07/11/known-issues-and-recommended-solutions-on-secondary-protection-disaster-recovery/

    Thursday, April 19, 2018 11:06 AM
  • Hi Rahul,

    This is the log. I had to cut a lot because of the length limit, but you can see the gap: the crash happened at 13:37.

    0F18 1154 04/19 13:37:34.363 22 genericthreadpool.cpp(824) [0000000000B5C6D0] NORMAL Hr: = [0x80070002] CGenericThreadPool::m_dwMaximumNumberOfThreads[20]
    0F18 1154 04/19 13:37:34.363 20 commengine.cpp(345) [0000000000F37130] ACTIVITY Hr: = [0x80070002] ConnectionNoActivityTimeout: 300
    0F18 1154 04/19 13:37:34.363 20 commengine.cpp(359) [0000000000F37130] ACTIVITY Hr: = [0x80070002] m_dwConnectionNoActivityTimeoutForSlowReceiver: 3600
    0F18 133C 04/19 13:37:34.379 03 timer.cpp(490) ACTIVITY CTimerMgr::TimerThread
    0F18 133C 04/19 13:37:34.379 03 timer.cpp(503) [0000000000F91358] ACTIVITY CTimerMgr::TimerThreadInternal
    0F18 1154 04/19 13:37:34.379 29 dpmra.cpp(182) [0000000000B5A360] NORMAL CDPMRA::Initialize [0000000000B5A360]
    0F18 1154 04/19 13:37:34.379 31 vaagent.cpp(227) NORMAL CVAAgent: Initialize globals
    0F18 1154 04/19 13:37:34.395 29 mta.cpp(126) ACTIVITY CMTA::AddSchemasToSchemaCache => Initialize loaded schemas
    0F18 1154 04/19 13:37:34.410 29 dpmra.cpp(200) [0000000000B5A360] NORMAL CDPMRA::Initialize loaded schemas 
    0F18 1154 04/19 13:37:34.410 22 genericthreadpool.cpp(75) [0000000000F99930] NORMAL CGenericThreadPool: constructor
    0F18 1154 04/19 13:37:34.410 22 genericthreadpool.cpp(824) [0000000000F99930] NORMAL Hr: = [0x80070002] CGenericThreadPool::m_dwMaximumNumberOfThreads[256]
    0F18 1154 04/19 13:37:34.410 29 dpmra.cpp(287) [0000000000B5A360] NORMAL CreateDefaultSubTask [0000000000B5A360]
    0F18 1154 04/19 13:37:34.410 29 dpmra.cpp(210) [0000000000B5A360] NORMAL Temporary storage path reg key found
    0F18 1154 04/19 13:37:34.410 29 dpmra.cpp(226) [0000000000B5A360] NORMAL MTA Temp location Path = [C:\Program Files\Microsoft System Center 2012 R2\DPM\DPM\temp\MTA\]
    0F18 1154 04/19 13:37:34.410 29 mta.cpp(209) NORMAL Hr: = [0x80070005] CMTA::Initialize => Could not delete file = [C:\Program Files\Microsoft System Center 2012 R2\DPM\DPM\temp\MTA\PSInfo]
    0F18 1154 04/19 13:37:34.410 29 mtamethrottlehelper.cpp(113) [0000000000F9CC90] NORMAL Hr: = [0x80070002] Number of encryption threads = [2]
    0F18 1154 04/19 13:37:34.410 03 datasourceresourcelimit.cpp(141) [0000000000B5CCD0] NORMAL Resource Limit Config File: C:\Program Files\Microsoft System Center 2012 R2\DPM\DPM\bin\DsResourceLimits.xml
    0F18 1154 04/19 13:37:34.410 03 datasourceresourcelimit.cpp(147) [0000000000B5CCD0] NORMAL Loaded DS Resource Limits
    0F18 1154 04/19 13:37:34.410 03 service.cp
    15B0 1364 04/19 13:45:25.097 03 timer.cpp(490) ACTIVITY CTimerMgr::TimerThread
    15B0 1364 04/19 13:45:25.097 03 timer.cpp(503) [0000000001245020] ACTIVITY CTimerMgr::TimerThreadInternal
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(786) [0000000000DAC6D0] ACTIVITY InitializeSecurity found 3 authorized entities
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(845) [0000000000DAC6D0] ACTIVITY InitializeSecurity added COM perm for DPMRADCOMTrustedMachines
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(845) [0000000000DAC6D0] ACTIVITY InitializeSecurity added COM perm for DPMRATrustedDPMRAs
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(852) [0000000000DAC6D0] ACTIVITY InitializeSecurity added DM perm for DPMRADmTrustedMachines
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(97) NORMAL Found 2 writer account(s) under VSS access control.
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(165) NORMAL Added writer account [0] 'NT Authority\NetworkService' to the DACL.
    15B0 13A8 04/19 13:45:25.113 03 runtime.cpp(165) NORMAL Added writer account [1] 'NT SERVICE\SQLWriter' to the DACL.
    15B0 13A8 04/19 13:45:25.128 03 runtime.cpp(1438) WARNING Failed: Hr: = [0x80070002] : Encountered Failure: : lVal : HRESULT_FROM_WIN32( ::RegGetValueW((( HKEY ) (ULONG_PTR)((LONG)0x80000002) ), L"Software\\Microsoft\\Microsoft Data Protection Manager\\Configuration", L"CredentialPassingAllowed", 0x00000010, 0, &dwRegValue, &dwBuffer) )
    15B0 13A8 04/19 13:45:25.128 03 runtime.cpp(1447) NORMAL CredentialPassingAllowed: Checked reg value CredentialPassingAllowed and returning 0
    15B0 13A8 04/19 13:45:25.128 03 runtime.cpp(979) NORMAL Setting DCOM Authn service to RPC_C_AUTHN_GSS_NEGOTIATE
    15B0 13A8 04/19 13:45:25.128 22 genericthreadpool.cpp(824) [0000000000DAC6D0] NORMAL Hr: = [0x80070002] CGenericThreadPool::m_dwMaximumNumberOfThreads[20]
    15B0 13A8 04/19 13:45:25.159 20 commengine.cpp(345) [0000000001247130] ACTIVITY Hr: = [0x80070002] ConnectionNoActivityTimeout: 300
    15B0 13A8 04/19 13:45:25.159 20 commengine.cpp(359) [0000000001247130] ACTIVITY Hr: = [0x80070002] m_dwConnectionNoActivityTimeoutForSlowReceiver: 3600
    15B0 028C 04/19 13:45:25.159 03 timer.cpp(490) ACTIVITY CTimerMgr::TimerThread
    15B0 028C 04/19 13:45:25.159 03 timer.cpp(503) [00000000012A1358] ACTIVITY CTimerMgr::TimerThreadInternal
    15B0 13A8 04/19 13:45:25.159 29 dpmra.cpp(182) [0000000000DAA360] NORMAL CDPMRA::Initialize [0000000000DAA360]
    15B0 13A8 04/19 13:45:25.159 31 vaagent.cpp(227) NORMAL CVAAgent: Initialize globals
    15B0 13A8 04/19 13:45:25.175 29 mta.cpp(126) ACTIVITY CMTA::AddSchemasToSchemaCache => Initialize loaded schemas
    15B0 13A8 04/19 13:45:25.206 29 dpmra.cpp(200) [0000000000DAA360] NORMAL CDPMRA::Initialize loaded schemas 
    15B0 13A8 04/19 13:45:25.206 22 genericthreadpool.cpp(75) [00000000012A9930] NORMAL CGenericThreadPool: constructor
    15B0 13A8 04/19 13:45:25.206 22 genericthreadpool.cpp(824) [00000000012A9930] NORMAL Hr: = [0x80070002] CGenericThreadPool::m_dwMaximumNumberOfThreads[256]
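
The gap mentioned above can also be located mechanically. The following is a minimal sketch, assuming each trace line starts with a process ID, a thread ID, a date and a time, as in the excerpt. The function names `parse_line` and `find_gaps` and the one-minute threshold are hypothetical choices, and the year is an assumption taken from the post date because the log itself does not contain one.

```python
import re
from datetime import datetime, timedelta

# Prefix of a trace line as it appears in the post:
# <process id> <thread id> <MM/DD> <HH:MM:SS.mmm> ...
LINE_RE = re.compile(
    r"^\s*([0-9A-F]{4}) ([0-9A-F]{4}) (\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}\.\d{3}) "
)


def parse_line(line, year=2018):
    """Return (process_id, timestamp), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pid, _tid, date, clock = match.groups()
    # The year is assumed; the log only records month and day.
    stamp = datetime.strptime(f"{year}/{date} {clock}", "%Y/%m/%d %H:%M:%S.%f")
    return pid, stamp


def find_gaps(lines, threshold=timedelta(minutes=1)):
    """Yield (previous, current) entries whose timestamps differ by more than threshold."""
    previous = None
    for line in lines:
        current = parse_line(line)
        if current is None:
            continue  # skip truncated or non-trace lines
        if previous is not None and current[1] - previous[1] > threshold:
            yield previous, current
        previous = current


# Two lines copied from the excerpt, on either side of the gap.
sample = [
    "0F18 1154 04/19 13:37:34.410 03 datasourceresourcelimit.cpp(147) [0000000000B5CCD0] NORMAL Loaded DS Resource Limits",
    "15B0 1364 04/19 13:45:25.097 03 timer.cpp(490) ACTIVITY CTimerMgr::TimerThread",
]

for (old_pid, old_ts), (new_pid, new_ts) in find_gaps(sample):
    print(f"gap of {new_ts - old_ts}: process {old_pid} -> {new_pid}")
```

A change of process ID across a gap, as between 0F18 and 15B0 here, is consistent with DPMRA having been started again after the reboot, although the log alone does not show what caused the reboot.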

    Thursday, April 19, 2018 2:54 PM
  • Hi Rahul,

    All the group memberships were OK. SetDPMServer was also OK. But the problem persisted. I tried completely reinstalling DPM on the primary and secondary servers (including removing the DPM databases and data stores), with no luck. Finally, I ended up fully reinstalling the primary server: Windows OS, SQL, DPM. Sometimes spending a day on a reinstallation saves weeks of fighting with MS bugs.

    By the way, the DPM agent is installed by default on a DPM server, and there is no way to uninstall it from "Programs and Features".

    Thanks !

    Sunday, April 22, 2018 8:13 AM