DPM 2019 Service crashing repeatedly on secondary server

  • Question

  • Multiple crashes of DPM Server today with the following event detail raised:


    Unable to connect to the DPM database because the database is in an inconsistent state.

    Problem Details:
    <FatalServiceError><__System><ID>19</ID><Seq>291</Seq><TimeCreated>11/18/2019 5:15:35 PM</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>DBCorruptionException</ExceptionType><ExceptionMessage>Couldn't access DataSourceInfo '00000000-0000-0000-0000-000000000000' in the database</ExceptionMessage><ExceptionDetails>Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException: Couldn't access DataSourceInfo '00000000-0000-0000-0000-000000000000' in the database
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.DataSourceInfo.GetDataSourceInfo(SqlContext ctx, Guid datasourceIdLocal)
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog.RecatalogDataset(Guid dependencyGroupId, DbContext dbContext, DataSourceVolumeMappingXmlType datasetXml, DataSourceInfo dsConfigInfo, Guid datasetId, Guid datasourceId, Replica replica, DatasetContextType datasetContext, BackupTypeType backupType, Guid verbId, DatasetLocations location, Guid referentialDatasetId)
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog.RecatalogDataset(Guid dependencyGroupId, DbContext dbContext, DataSourceVolumeMappingXmlType datasetXml, DataSourceInfo dsConfigInfo, Guid datasetId, Guid datasourceId, Replica replica, DatasetContextType datasetContext, BackupTypeType backupType, Guid verbId, DatasetLocations location)
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog.RecatalogReplicaDatasets(Guid taskId, Guid datasourceId, Guid verbId, Guid dependencyGroupId, String replicaVolumeGuidName)
       at Microsoft.Internal.EnterpriseStorage.Dls.Prm.RAPostBackupBlock.RecatalogReplicaDatasets(Message msg)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.Fsm.ConnectionPoint.Execute(Message msg)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.Fsm.Engine.ChangeState(Message msg)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.TaskInstance.Process(Object dummy)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.FsmThreadFunction.Function(Object taskThreadContextObj)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
       at System.Threading.ThreadPoolWorkQueue.Dispatch()</ExceptionDetails></FatalServiceError>
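
    When several of these events pile up, it can help to reduce each one to its exception type and top stack frame. The following is a minimal sketch, assuming the "Problem Details" text has been copied out of the event as a string; the function name `summarize_fatal_error` and the shortened sample are illustrative only and not part of DPM.

```python
import xml.etree.ElementTree as ET


def summarize_fatal_error(xml_text: str) -> dict:
    """Pull the key fields out of a <FatalServiceError> event payload."""
    root = ET.fromstring(xml_text)
    system = root.find("__System")
    details = root.findtext("ExceptionDetails", default="")
    # Stack frames are the lines that start with "at " once indentation is removed.
    frames = [ln.strip()[3:] for ln in details.splitlines() if ln.strip().startswith("at ")]
    return {
        "time": system.findtext("TimeCreated"),
        "seq": int(system.findtext("Seq")),
        "exception_type": root.findtext("ExceptionType"),
        "message": root.findtext("ExceptionMessage"),
        "top_frame": frames[0] if frames else None,
    }


# Shortened copy of the event above (only two stack frames kept).
sample = (
    "<FatalServiceError><__System><ID>19</ID><Seq>291</Seq>"
    "<TimeCreated>11/18/2019 5:15:35 PM</TimeCreated>"
    "<Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System>"
    "<ExceptionType>DBCorruptionException</ExceptionType>"
    "<ExceptionMessage>Couldn't access DataSourceInfo "
    "'00000000-0000-0000-0000-000000000000' in the database</ExceptionMessage>"
    "<ExceptionDetails>Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException: "
    "Couldn't access DataSourceInfo\n"
    "   at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.DataSourceInfo."
    "GetDataSourceInfo(SqlContext ctx, Guid datasourceIdLocal)\n"
    "   at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog."
    "RecatalogReplicaDatasets(Guid taskId, Guid datasourceId, Guid verbId, "
    "Guid dependencyGroupId, String replicaVolumeGuidName)"
    "</ExceptionDetails></FatalServiceError>"
)

summary = summarize_fatal_error(sample)
print(summary["exception_type"], "@", summary["top_frame"])
```

    Comparing the `exception_type` and `top_frame` of successive crashes is a quick way to tell whether they share a cause.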


    .NET Architect

    Monday, November 18, 2019 5:19 PM

All replies

  • Hi Joel,

    Could you tell us in more detail what your DPM environment looks like?

    What are you doing when you receive this error / console crash?

    What workloads are you backing up?

    Also check the DPM logs for more clues:

    • C:\Program Files\Microsoft System Center\DPM\DPM\Temp\MSDPMCurr.errlog

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Monday, November 18, 2019 8:31 PM
  • Actually I think I found the error.  Working on the test/verification.

    I just set up a secondary DPM server to protect my primary server.  Both running DPM 2019/Windows Server 2019.

    Just found a DCOM error in my system log.  The potential fix was to launch DCOMCNFG on my secondary server, and grant Remote Launch permissions to the primary server computer for the MSDPM Task Executor Application.

    Here is the error message

    The application-specific permission settings do not grant Remote Launch permission for the COM Server application with CLSID 
    {CC7D8A18-C386-48D5-9906-4FFC77A39788}
     and APPID 
    {A3B9D3F4-2477-4F95-B2D1-F75B0FDF2A0E}
     to the user <domain>\<primaryDPMserver> SID (xxxxxx) from address XXX.XXX.XXX.XXX running in the application container Unavailable SID (Unavailable). This security permission can be modified using the Component Services administrative tool.

    I did some digging under HKEY_CLASSES_ROOT\CLSID in regedit to find the COM server.
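
    For anyone repeating the lookup, here is a small sketch that pulls the CLSID and APPID out of the DCOM event text and builds the registry paths to inspect. It only does string handling and does not touch the registry. The function name `dcom_registry_keys` is made up for illustration, and the `HKEY_CLASSES_ROOT\AppID` path is the standard location for DCOM application IDs rather than something taken from this event.

```python
import re

# A GUID in registry format, braces included.
GUID = r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"


def dcom_registry_keys(event_text: str) -> dict:
    """Return the registry keys to inspect for a DCOM permission event."""
    clsid = re.search(r"CLSID\s*(" + GUID + ")", event_text)
    appid = re.search(r"APPID\s*(" + GUID + ")", event_text)
    if not clsid or not appid:
        raise ValueError("no CLSID/APPID pair found in event text")
    return {
        "clsid_key": "HKEY_CLASSES_ROOT\\CLSID\\" + clsid.group(1),
        "appid_key": "HKEY_CLASSES_ROOT\\AppID\\" + appid.group(1),
    }


# Event text as it appears in the System log (GUIDs on their own lines).
event_text = """The application-specific permission settings do not grant Remote Launch
permission for the COM Server application with CLSID
{CC7D8A18-C386-48D5-9906-4FFC77A39788}
 and APPID
{A3B9D3F4-2477-4F95-B2D1-F75B0FDF2A0E}
 to the user ..."""

keys = dcom_registry_keys(event_text)
print(keys["clsid_key"])
print(keys["appid_key"])
```

    The APPID is the one that matters in DCOMCNFG, since launch permissions are set per application rather than per class.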

    I'll post if the problem disappears after this fix.


    .NET Architect

    Tuesday, November 19, 2019 8:13 PM
  • Great to hear you have progress on your issue, let us know how everything goes.

    You’ll also find the firewall and permissions required over here:

    https://docs.microsoft.com/en-us/system-center/dpm/plan-dpm-deployment?view=sc-dpm-2019#BKMK_Firewall

    If you have any questions, don’t hesitate to ask!


    Blog: https://thesystemcenterblog.com

    Tuesday, November 19, 2019 8:25 PM
  • There seem to be two issues.  My Admin Console crashed 10 minutes ago with the following logged message.

    An unexpected error caused a failure for process 'mmc'.  Restart the DPM process 'mmc'.

    Problem Details:
    <FatalServiceError><__System><ID>19</ID><Seq>0</Seq><TimeCreated>11/19/2019 9:24:34 PM</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>MissingMethodException</ExceptionType><ExceptionMessage>No parameterless constructor defined for this object.</ExceptionMessage><ExceptionDetails>System.MissingMethodException: No parameterless constructor defined for this object.
       at System.RuntimeTypeHandle.CreateInstance(RuntimeType type, Boolean publicOnly, Boolean noCheck, Boolean&amp; canBeCached, RuntimeMethodHandleInternal&amp; ctor, Boolean&amp; bNeedSecurityCheck)
       at System.RuntimeType.CreateInstanceSlow(Boolean publicOnly, Boolean skipCheckThis, Boolean fillCache, StackCrawlMark&amp; stackMark)
       at System.Activator.CreateInstance(Type type, Boolean nonPublic)
       at System.Activator.CreateInstance(Type type)
       at Microsoft.Internal.EnterpriseStorage.MmcContainer.PageController.GetSingletonObject(Type objectType)
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.MonitoringPage.MonitoringPage.get_DPMServer()
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.MonitoringPage.JobsView.DeRegisterForEvents()
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.MonitoringPage.JobsView.Dispose(Boolean disposing)
       at System.ComponentModel.Component.Dispose()
       at System.Windows.Forms.Control.Dispose(Boolean disposing)
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.MonitoringPage.MonitoringPage.Dispose(Boolean disposing)
       at System.ComponentModel.Component.Dispose()
       at Microsoft.Internal.EnterpriseStorage.MmcContainer.PageController.OnHandleDestroyed(EventArgs e)
       at System.Windows.Forms.Control.WmDestroy(Message&amp; m)
       at System.Windows.Forms.Control.WndProc(Message&amp; m)
       at System.Windows.Forms.UserControl.WndProc(Message&amp; m)
       at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)</ExceptionDetails></FatalServiceError>


    .NET Architect

    Tuesday, November 19, 2019 9:33 PM
  • And another Admin Console crash with this stack dump:

    An unexpected error caused a failure for process 'mmc'.  Restart the DPM process 'mmc'.

    Problem Details:
    <FatalServiceError><__System><ID>19</ID><Seq>0</Seq><TimeCreated>11/19/2019 9:34:03 PM</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>CmdletInvocationException</ExceptionType><ExceptionMessage>Cannot remove the specified item because it was not found in the specified Collection.</ExceptionMessage><ExceptionDetails>System.Management.Automation.CmdletInvocationException: Cannot remove the specified item because it was not found in the specified Collection. ---&gt; System.ArgumentException: Cannot remove the specified item because it was not found in the specified Collection.
       at System.Collections.CollectionBase.System.Collections.IList.Remove(Object value)
       at Microsoft.Internal.EnterpriseStorage.MmcContainer.Menu.MenuItemCollection.ClearItemsFromMMC()
       at Microsoft.Internal.EnterpriseStorage.MmcContainer.Menu.MenuItemCollection.OnClear()
       at System.Collections.CollectionBase.Clear()
       at Microsoft.Internal.EnterpriseStorage.MmcContainer.MenuController.Dispose()
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.Cmdlet.DisposeObject.CallBack(Object[] paramsList)
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.Cmdlet.DisposeObject.ProcessRecord()
       at System.Management.Automation.CommandProcessor.ProcessRecord()
       --- End of inner exception stack trace ---
       at System.Management.Automation.Runspaces.PipelineBase.Invoke(IEnumerable input)
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.UICommon.CmdLetWrapper.Invoke()
       at Microsoft.Internal.EnterpriseStorage.MmcContainer.PageController.OnHandleDestroyed(EventArgs e)
       at System.Windows.Forms.Control.WmDestroy(Message&amp; m)
       at System.Windows.Forms.Control.WndProc(Message&amp; m)
       at System.Windows.Forms.UserControl.WndProc(Message&amp; m)
       at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)</ExceptionDetails></FatalServiceError>


    .NET Architect

    Tuesday, November 19, 2019 9:36 PM
  • The service crashed again as well. 

    The last FATAL error in the log is this:

    3BDC 3084 11/19 21:35:11.133 01 TaskExecutor.cs(843) 91996BA6-2D35-4273-98EC-327E4344BC52 FATAL Task stopped (state=Failed, result=RmGenericError; -2147418113; WindowsHResult), search "Task Diagnostic Information" for details.

    This appears to be the Task Diagnostic Information preceding the fatal error.

    3BDC 3084 11/19 21:35:11.117 27 BackupMachine.cs(2545) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING BackupMachine : FAILURE - BACKUP, errorCode=RmGenericError
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING Task Diagnostic Information - <?xml version="1.0" encoding="utf-16"?>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING <TaskExecutionContext>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmWriterId>a65faa63-5ea8-4ebc-9dbd-a0c4db26912a</PrmWriterId>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmDatasourceId>9c39b45c-4ce5-4cc4-aead-4728f42a029d</PrmDatasourceId>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmActiveNodeName>CANDPMGR.co.island.wa.us</PrmActiveNodeName>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmLogicalReplicaId>16d228ee-b7fc-48ea-b635-eac9f8dc9f21</PrmLogicalReplicaId>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmDatasetId>ca972084-8b4e-4983-af77-6e079d3ee114</PrmDatasetId>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmPhysicalReplicaId>29f08d3a-4199-4c2b-a21d-3dc805bacce9</PrmPhysicalReplicaId>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmReplicaValidity>Valid</PrmReplicaValidity>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmReplicaStatus>Replicating</PrmReplicaStatus>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <PrmOwnerLockId>91996ba6-2d35-4273-98ec-327e4344bc52</PrmOwnerLockId>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <TEVerb>DeltaReplicateForDR</TEVerb>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <TEErrorState>Backup.OnAbort(1)RAForRead.RADeleteWorkItemPending</TEErrorState>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   <TEErrorDetails>&lt;?xml version="1.0" encoding="utf-16"?&gt;
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING &lt;q1:ErrorInfo ErrorCode="104" DetailedCode="-2147418113" DetailedSource="2" ExceptionDetails="" xmlns:q1="http://schemas.microsoft.com/2003/dls/GenericAgentStatus.xsd"&gt;
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   &lt;q1:Parameter Name="replicaid" Value="16d228ee-b7fc-48ea-b635-eac9f8dc9f21" /&gt;
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING   &lt;q1:Parameter Name="datasourceid" Value="9c39b45c-4ce5-4cc4-aead-4728f42a029d" /&gt;
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING &lt;/q1:ErrorInfo&gt;</TEErrorDetails>
    3BDC 3084 11/19 21:35:11.117 01 TaskInstance.cs(803) 91996BA6-2D35-4273-98EC-327E4344BC52 WARNING </TaskExecutionContext>
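
    The `DetailedCode` value above is an HRESULT printed as a signed 32-bit integer. A short sketch of converting it to the usual hexadecimal form and splitting out its fields follows; `decode_hresult` and the one-entry lookup table are illustrative helpers, not anything DPM ships.

```python
def decode_hresult(code: int) -> dict:
    """Split a signed 32-bit HRESULT into its standard fields."""
    value = code & 0xFFFFFFFF  # reinterpret the signed value as unsigned
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),        # severity bit
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility code
        "code": value & 0xFFFF,              # low 16 bits
    }


# Deliberately tiny table; extend it with whatever codes show up in the logs.
KNOWN_HRESULTS = {0x8000FFFF: "E_UNEXPECTED"}

decoded = decode_hresult(-2147418113)
name = KNOWN_HRESULTS.get(int(decoded["hex"], 16), "unknown")
print(decoded["hex"], name)
```

    Here -2147418113 comes out as 0x8000FFFF (E_UNEXPECTED), a generic failure code, so on its own it does not identify the root cause.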


    .NET Architect

    Tuesday, November 19, 2019 9:42 PM
  • Workloads I'm backing up include SharePoint, SQL Server, File Shares, Hyper-V Host and Guest level backups, plus Exchange.

    .NET Architect

    Tuesday, November 19, 2019 9:46 PM
  • I'm currently trying to isolate whether a particular data source (one with a critical error) is causing this, by process of elimination.


    .NET Architect

    Tuesday, November 19, 2019 9:49 PM
  • Tuesday, November 19, 2019 10:07 PM
  • Not on my secondary DPM Server, where the error is happening.  My primary server, which is being protected, has been upgraded from DPM 2016.

    I just got another service crash error, different from the rest.

    Unable to connect to the DPM database because the database is in an inconsistent state.

    Problem Details:
    <FatalServiceError><__System><ID>19</ID><Seq>1199</Seq><TimeCreated>11/19/2019 10:23:30 PM</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>DBCorruptionException</ExceptionType><ExceptionMessage>Couldn't access DataSourceInfo '00000000-0000-0000-0000-000000000000' in the database</ExceptionMessage><ExceptionDetails>Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException: Couldn't access DataSourceInfo '00000000-0000-0000-0000-000000000000' in the database
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.DataSourceInfo.GetDataSourceInfo(SqlContext ctx, Guid datasourceIdLocal)
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog.RecatalogDataset(Guid dependencyGroupId, DbContext dbContext, DataSourceVolumeMappingXmlType datasetXml, DataSourceInfo dsConfigInfo, Guid datasetId, Guid datasourceId, Replica replica, DatasetContextType datasetContext, BackupTypeType backupType, Guid verbId, DatasetLocations location, Guid referentialDatasetId)
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog.RecatalogDataset(Guid dependencyGroupId, DbContext dbContext, DataSourceVolumeMappingXmlType datasetXml, DataSourceInfo dsConfigInfo, Guid datasetId, Guid datasourceId, Replica replica, DatasetContextType datasetContext, BackupTypeType backupType, Guid verbId, DatasetLocations location)
       at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.PrmCatalog.RecatalogReplicaDatasets(Guid taskId, Guid datasourceId, Guid verbId, Guid dependencyGroupId, String replicaVolumeGuidName)
       at Microsoft.Internal.EnterpriseStorage.Dls.Prm.RAPostBackupBlock.RecatalogReplicaDatasets(Message msg)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.Fsm.ConnectionPoint.Execute(Message msg)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.Fsm.Engine.ChangeState(Message msg)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.TaskInstance.Process(Object dummy)
       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.FsmThreadFunction.Function(Object taskThreadContextObj)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
       at System.Threading.ThreadPoolWorkQueue.Dispatch()</ExceptionDetails></FatalServiceError>


    .NET Architect

    Tuesday, November 19, 2019 10:32 PM
  • You might want to disable the backups (one by one if possible) on the secondary DPM server for now, and start eliminating things that may be causing the crashes.

    Then keep monitoring the situation and check the MSDPMCurr.errlog.
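
    To make that monitoring less tedious, something like the sketch below could pick the FATAL entries out of the log. The field layout is inferred from the lines posted earlier in this thread (two hex IDs, date, time, a two-digit code, source file with line number, task GUID, level, message); the field names are assumptions, not a documented format, and the script works on lines already read into memory.

```python
import re

# Field layout inferred from the posted log lines; names are assumptions.
LINE = re.compile(
    r"^(?P<pid>[0-9A-Fa-f]+)\s+(?P<tid>[0-9A-Fa-f]+)\s+"
    r"(?P<date>\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<code>\d{2})\s+(?P<source>\S+\(\d+\))\s+"
    r"(?P<task>[0-9A-Fa-f-]{36})?\s*"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def fatal_entries(lines):
    """Return the parsed fields of every line logged at FATAL level."""
    hits = []
    for raw in lines:
        match = LINE.match(raw.strip())
        if match and match.group("level") == "FATAL":
            hits.append(match.groupdict())
    return hits


# Two lines copied from the log excerpt earlier in the thread.
log_lines = [
    "3BDC 3084 11/19 21:35:11.117 27 BackupMachine.cs(2545) "
    "91996BA6-2D35-4273-98EC-327E4344BC52 WARNING BackupMachine : "
    "FAILURE - BACKUP, errorCode=RmGenericError",
    "3BDC 3084 11/19 21:35:11.133 01 TaskExecutor.cs(843) "
    "91996BA6-2D35-4273-98EC-327E4344BC52 FATAL Task stopped "
    "(state=Failed, result=RmGenericError; -2147418113; WindowsHResult), "
    'search "Task Diagnostic Information" for details.',
]

hits = fatal_entries(log_lines)
for hit in hits:
    print(hit["time"], hit["source"], hit["task"])
```

    Grouping the hits by `task` and then searching for that GUID shows the diagnostic lines that belong to each failed task, which helps when disabling data sources one at a time.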

    When you set up the secondary DPM server, did you follow this documentation thoroughly?
    https://docs.microsoft.com/en-us/system-center/dpm/back-up-the-dpm-server?view=sc-dpm-2019


    Blog: https://thesystemcenterblog.com

    Tuesday, November 19, 2019 10:47 PM
  • Hi,

    Just checking whether you have any update on your issue.

    If your issue was resolved, may I ask you to mark all the answers that helped you? This way it will also help others in the future who face the same challenge. Many thanks in advance!

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Wednesday, December 11, 2019 9:33 PM