DPM 2019 error. Event ID 940 from source MSDPM cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted

  • Question

  • Hello all,

    I have a problem keeping DPM running for any length of time. I am running DPM 2019 with SQL 2017, but every 3 days or so I log into the DPM server and find an error message telling me that the mmc has crashed. That in itself isn't my issue, as relaunching it corrects the problem. My issue is that at some point, usually hours before I log in, all my backups fail. The message I get in DPM tells me that the backup failed because the service wasn't available. I'll post that message the next time I get it, as I've already forced it to run another recovery point.

    When I take a look in the event viewer, I get the following:

    The description for Event ID 940 from source MSDPM cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

    If the event originated on another computer, the display information had to be saved with the event.

    The following information was included with the event:

    Unable to connect to the database because of a fatal database error. It is unlikely that the database itself has been damaged.  Review the event log and take appropriate action. Make sure that SQL Server is running.

    Problem Details:

    <FatalServiceError><__System><ID>19</ID><Seq>42582</Seq><TimeCreated>12/30/2019 11:30:37 AM</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>SqlException</ExceptionType><ExceptionMessage>A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)</ExceptionMessage><ExceptionDetails>System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) ---&gt; System.ComponentModel.Win32Exception (0x80004005): The semaphore timeout period has expired

       at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

       at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)

       at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)

       at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()

       at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()

       at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()

       at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte&amp; value)

       at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean&amp; dataReady)

       at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)

       at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task&amp; task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)

       at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task&amp; task, Boolean&amp; usedCache, Boolean asyncWrite, Boolean inRetry)

       at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean&amp; usedCache, Boolean asyncWrite, Boolean inRetry)

       at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()

       at Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlRetryCommand.ExecuteNonQuery()

       at Microsoft.Internal.EnterpriseStorage.Dls.JobManager.JobDB.UpdateJobInfoHelper(SqlContext sqlContext, Job job)

       at Microsoft.Internal.EnterpriseStorage.Dls.JobManager.JobDB.UpdateJobInfo(Job job)

       at Microsoft.Internal.EnterpriseStorage.Dls.JobManager.ActiveJob.GenerateWorkplanCallback(Guid jobID, Guid taskID, Boolean isSerial, Boolean hasErrors)

       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.TaskExecutor.GenerateWorkplan(Guid jobID, Guid taskID, String jobDef, String taskDef, WorkPlanFlags flags, GenerateWorkplanCallback callback)

       at Microsoft.Internal.EnterpriseStorage.Dls.JobManager.ActiveJob.DispatchRequestCallback(Object state)

       at Microsoft.Internal.EnterpriseStorage.Dls.EngineUICommon.DpmThreadPool.Function(Object state)

       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)

       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)

       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()

       at System.Threading.ThreadPoolWorkQueue.Dispatch()

    ClientConnectionId:5e55e26a-9f23-4d64-bda1-fcf926940284

    Error Number:121,State:0,Class:20</ExceptionDetails></FatalServiceError>

    The message resource is present but the message was not found in the message table
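
    A minimal Python sketch of how a payload like the one above could be summarized when triaging these events. It assumes the FatalServiceError block has been copied out of the event as a string. The function name summarize_fatal_error and the abbreviated SAMPLE are illustrative only and are not part of DPM. The error number 121 it extracts matches Win32 ERROR_SEM_TIMEOUT, the same "semaphore timeout period has expired" text shown in the exception message.

```python
import re
import xml.etree.ElementTree as ET

# Abbreviated copy of the payload from the event above.
# The stack trace is elided ("...") to keep the example short.
SAMPLE = (
    "<FatalServiceError><__System><ID>19</ID><Seq>42582</Seq>"
    "<TimeCreated>12/30/2019 11:30:37 AM</TimeCreated>"
    "<Source>DpmThreadPool.cs</Source><Line>163</Line>"
    "<HasError>True</HasError></__System>"
    "<ExceptionType>SqlException</ExceptionType>"
    "<ExceptionMessage>A transport-level error has occurred when receiving "
    "results from the server. (provider: TCP Provider, error: 0 - The "
    "semaphore timeout period has expired.)</ExceptionMessage>"
    "<ExceptionDetails>System.Data.SqlClient.SqlException (0x80131904): ... "
    "Error Number:121,State:0,Class:20</ExceptionDetails>"
    "</FatalServiceError>"
)


def summarize_fatal_error(payload: str) -> dict:
    """Pull the fields that matter for triage out of a FatalServiceError block."""
    root = ET.fromstring(payload)
    details = root.findtext("ExceptionDetails", default="")
    # SqlException details end with "Error Number:N,State:N,Class:N".
    match = re.search(r"Error Number:(\d+),State:(\d+),Class:(\d+)", details)
    return {
        "time": root.findtext("__System/TimeCreated"),
        "type": root.findtext("ExceptionType"),
        "message": root.findtext("ExceptionMessage"),
        # (number, state, class) as integers, or None if not present
        "sql_error": tuple(int(g) for g in match.groups()) if match else None,
    }


if __name__ == "__main__":
    print(summarize_fatal_error(SAMPLE))
```

    Comparing the summaries of several such events makes it easier to see whether every crash is the same transport-level SqlException or whether different exception types are involved.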

    Unfortunately I haven’t had a lot of time to look into this, but what I can tell you is that this message has survived 3 re-installs and, though I can’t say for sure, I believe it has something to do with why my server suddenly stops backing up servers. As a side note, when I restart the mmc I can resume backups, but while the mmc is failing no scheduled backup takes place, at least as far as I can tell.

    Sorry for the lack of information; I got carried away and started forcing the backups before I decided to post here. I will get this error again, and when I do I’ll post what I get here.

    Meanwhile, I doubt this is a DB issue as I have other applications running off the SQL server with no issues. Any pointers?

    Monday, December 30, 2019 9:40 PM

Answers

  • Hi,

    It looks like you are getting a timeout from SQL, likely because SQL Server is not responsive or a query is taking a very long time to return. Is this a remote SQL Server?

    Try increasing the timeout from 90 to 180.

    Open Regedit and navigate to:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\DB

    -Here you will see two string values called ConnectionString and GlobalDbConnectionString.

    If you right-click and modify these strings, you will see that both of them have the same timeout value of 90.

    ConnectionString

    Integrated Security=SSPI;Initial Catalog=DPMDB;Application Name=MSDPM;server=DPMServername\SQLINstanceNAme;Connect Timeout=90


    GlobalDbConnectionString

    Integrated Security=SSPI;Initial Catalog=DPMDB;Application Name=MSDPM;server=DPMServername\SQLINstanceNAme;Connect Timeout=90

    -Change the Timeout value in both strings to 180

    Integrated Security=SSPI;Initial Catalog=DPMDB;Application Name=MSDPM;server=DPMServername\SQLINstanceNAme;Connect Timeout=180
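
    The edit above can be sketched in Python as a plain string rewrite, which may help if the same change has to be checked or applied on more than one server. This is only an illustration: the helper name with_connect_timeout is hypothetical, and reading or writing the registry values themselves is deliberately left out.

```python
import re

# Matches the "Connect Timeout=<number>" keyword, ignoring case and
# optional spaces around the equals sign.
_TIMEOUT = re.compile(r"(Connect Timeout\s*=\s*)\d+", re.IGNORECASE)


def with_connect_timeout(conn_str: str, seconds: int) -> str:
    """Return conn_str with its Connect Timeout set to `seconds`.

    If the keyword is missing, it is appended instead.
    """
    if _TIMEOUT.search(conn_str):
        # Keep the original keyword text and replace only the number.
        return _TIMEOUT.sub(lambda m: f"{m.group(1)}{seconds}", conn_str, count=1)
    separator = "" if (not conn_str or conn_str.endswith(";")) else ";"
    return f"{conn_str}{separator}Connect Timeout={seconds}"


if __name__ == "__main__":
    old = (
        "Integrated Security=SSPI;Initial Catalog=DPMDB;"
        r"Application Name=MSDPM;server=DPMServername\SQLINstanceNAme;"
        "Connect Timeout=90"
    )
    print(with_connect_timeout(old, 180))
```

    Only the number after Connect Timeout changes; the rest of the string, including the server and instance name, is passed through untouched.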


    When no DPM jobs are running - Restart MSDPM and "DPM AccessManager Service"

    C:\WINDOWS\system32>net stop "DPM AccessManager Service"
    The DPM AccessManager Service service is stopping......
    The DPM AccessManager Service service was stopped successfully.


    C:\WINDOWS\system32>net start "DPM AccessManager Service"
    The DPM AccessManager Service service is starting..
    The DPM AccessManager Service service was started successfully.


    C:\WINDOWS\system32>net stop msdpm
    The DPM service is stopping..
    The DPM service was stopped successfully.


    C:\WINDOWS\system32>net start msdpm
    The DPM service is starting...
    The DPM service was started successfully.
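
    If the restart has to be repeated, the same sequence could be scripted. The Python sketch below is an assumption-laden illustration, not a supported tool: it simply issues the four net commands shown above in the same order, must be run from an elevated prompt on the DPM server, and does not check whether DPM jobs are running. The function names are hypothetical.

```python
import subprocess

# Service names exactly as used in the transcript above, in the same order.
SERVICES = ["DPM AccessManager Service", "msdpm"]


def restart_commands(services=SERVICES):
    """Build the stop/start command pairs for each service, in order."""
    commands = []
    for name in services:
        commands.append(["net", "stop", name])
        commands.append(["net", "start", name])
    return commands


def restart_services(runner=subprocess.run):
    """Run each command in turn; stop at the first one that fails."""
    for command in restart_commands():
        # check=True raises CalledProcessError on a non-zero exit code.
        runner(command, check=True)
```

    Calling restart_services() performs the restart. The runner parameter exists only so the sequence can be dry-run, for example by passing a function that prints each command instead of executing it.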

    Monitor for additional timeout crashes.

    Regards
    Mike Jacquet



    Tuesday, December 31, 2019 3:41 PM
    Moderator

All replies

  • I connected to the server this morning and it appears the console crashed. Here is that message:

    Connection to the DPM service has been lost.

          Review the application event log for information about a possible service shutdown. Verify that the following services are not disabled:
                DPM
                DPM Replication Agent
                SQLAgent$MSSQLSERVER
                MSSQL$MSSQLSERVER
                Virtual Disk Service
                Volume Shadow Copy

    ID: 917

    I also get

    The description for Event ID 51 from source Disk cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

    If the event originated on another computer, the display information had to be saved with the event.

    The following information was included with the event:

    \Device\Harddisk8\DR26446

    The handle is invalid

    Tuesday, December 31, 2019 2:03 PM
  • This is the message that one of the failed backups provides

    Affected area: Computer\System Protection
    Occurred since: 12/31/2019 6:32:50 AM
    Description: Recovery point creation jobs for System Protection Computer\System Protection on MyServer have been failing. The number of failed recovery point creation jobs = 1.
     If the data source protected has some dependent data sources (like a SharePoint Farm), then click on the Error Details to view the list of dependent data sources for which recovery point creation failed. (ID 3114)
     The DPM service terminated unexpectedly during completion of the job. The termination may have been caused by a system reboot. (ID 910)
     More information
    Recommended action: Retry the operation.
     Create a recovery point...
    Resolution: To dismiss the alert, click below
     Inactivate

    Tuesday, December 31, 2019 2:21 PM
  • So I changed those values and restarted the services, and when I just checked the system I saw:

    Connection to the DPM service has been lost.


          Review the application event log for information about a possible service shutdown. Verify that the following services are not disabled:
                DPM
                DPM Replication Agent
                SQLAgent$MSSQLSERVER
                MSSQL$MSSQLSERVER
                Virtual Disk Service
                Volume Shadow Copy

    ID: 917

    And at 6:41 am all my backups failed. It seems that didn't help.
    Thursday, January 2, 2020 8:23 PM
  • I also found this about 6 minutes prior to the backup failing

    The DPM service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 60000 milliseconds: Restart the service.

    Thursday, January 2, 2020 8:49 PM
  • I've been digging deep. This is the only other message I found and this is on the DPM server as well

    The description for Event ID 999 from source MSDPM cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

    If the event originated on another computer, the display information had to be saved with the event.

    The following information was included with the event:

    An unexpected error caused a failure for process 'msdpm'.  Restart the DPM process 'msdpm'.

    Problem Details:

    <FatalServiceError><__System><ID>19</ID><Seq>1824</Seq><TimeCreated>1/2/2020 11:34:46 AM</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>NullReferenceException</ExceptionType><ExceptionMessage>Object reference not set to an instance of an object.</ExceptionMessage><ExceptionDetails>System.NullReferenceException: Object reference not set to an instance of an object.

       at Microsoft.Internal.EnterpriseStorage.Dls.Utils.ExceptionPolicy.LogExceptionInternal(TraceProvider traceSession, String exceptionContext, Guid taskId, Exception exception)

       at Microsoft.Internal.EnterpriseStorage.Dls.Utils.ExceptionPolicy.LogException(TraceProvider traceSession, Guid taskId, Exception exception)

       at Microsoft.Internal.EnterpriseStorage.Dls.Utils.ExceptionPolicy.LogException(TraceProvider traceSession, Exception exception)

       at Microsoft.Internal.EnterpriseStorage.Dls.Prm.E14CatalogDataLoader.GetMailBoxes(RecoverableObject mdbRO, String domainName)

       at Microsoft.Internal.EnterpriseStorage.Dls.Prm.E14CatalogDataLoader.GetMailBoxesByRetry(RecoverableObject mdbRO, String domainName)

       at Microsoft.Internal.EnterpriseStorage.Dls.Prm.E14CatalogDataLoader.AddMailboxes()

       at Microsoft.Internal.EnterpriseStorage.Dls.Prm.ExchangeCatalogDataLoader.CatalogLoadPendingAsync(Object msgContext)

       at Microsoft.Internal.EnterpriseStorage.Dls.EngineUICommon.DpmThreadPool.Function(Object state)

       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)

       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)

       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()

       at System.Threading.ThreadPoolWorkQueue.Dispatch()</ExceptionDetails></FatalServiceError>

    The message resource is present but the message was not found in the message table

    Thursday, January 2, 2020 9:50 PM
  • I've been watching the system over the past month and it has vastly improved. I still get quite a few errors that seem to hit random servers, but I haven't had this timeout in over two weeks. I can only assume your SQL timeout change fixed it, but that the effect was delayed for some reason.
    Thursday, January 23, 2020 7:49 PM