none
DPM crash ID 917 after update rollup 10 RRS feed

  • Question

  • Hello

    I have this week had outage on our DPM 2012 R2 server.  The problems started after i applied the latest Security updates to our DPM server MS16-082 till MS16-065 including DPM Update Rollup 10.

    At first the backup jobs to disk where running OK but the backups to VTL tape crashed with ID 917 after running for 2 -3 hours.

    ---
    Connection to the DPM service has been lost

    Review the application event log for information about a possible service shutdown.
    Verify that the following services are not disabled:
     DPM
     DPM Replication Agent
     SQLAgent$MSDPM2010
     MSSQL$MSDPM2010
     Virtual Disk Service
     Volume Shadow Copy

    ID: 917
    ---

    From the logs

    Problem Details:

    <DpmWriterEvent><__System><ID>30</ID><Seq>0</Seq><TimeCreated>7/20/2016 6:20:23 AM</TimeCreated><Source>d:\btvsts\1328\private\product\tapebackup\dpswriter\dpswriterwindowsservice.cpp</Source><Line>288</Line><HasError>True</HasError></__System><DetailedCode>-2146232060</DetailedCode></DpmWriterEvent>

    The first actions taken was to update the server with the latest drivers as some KB indicated this could be a bad driver issue

    After the driver update disk to disk backups where still ok but as soon as i ran backup to tape the same crach occured again

    I then attempted to resync the database

    DPMSYNC -syc

    After this the error ID 917 was gone but all protection groups where inconsistent.  After running consistency check after a couple of hours i again had the crash ID 917

    I checked the DPMDB and the Database consistency check - OK 

    The last thing i tried was to run this against the DPMDB

    UPDATE tbl_TE_TaskTrail     
    SET ExecutionState = 3,     
    LastStateName = ‘Failure’,     
    StoppedDateTime = GetUtcDate()     
    WHERE ExecutionState NOT IN (2,3)

    UPDATE tbl_JM_JobTrail     
    SET JobState= ‘Failed’,     
    EndDateTime = GetUtcDate()     
    WHERE jobstate= ‘Execute’ OR jobstate= ‘Retire

    Now the DPM crashes as soon as the DPM service tries to start

    Problem Details:

    <FatalServiceError><__System><ID>19</ID><Seq>0</Seq><TimeCreated>21.07.2016 09:00:20</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>DBCorruptionException</ExceptionType><ExceptionMessage>StoppedDateTime is NULL when in a stopped state</ExceptionMessage><ExceptionDetails>Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException: StoppedDateTime is NULL when in a stopped state

       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.DB.Task.GetInstance(Guid taskID)

       at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.TaskExecutor.ExecuteTask(Guid jobID, Guid taskID, WindowsIdentity runAsIdentity, ExecuteTaskFlags flags, ExecuteTaskCallback callback)

       at Microsoft.Internal.EnterpriseStorage.Dls.JobManager.ActiveJob.DispatchRequestCallback(Object state)

       at Microsoft.Internal.EnterpriseStorage.Dls.EngineUICommon.DpmThreadPool.Function(Object state)

       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)

       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)

       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()

       at System.Threading.ThreadPoolWorkQueue.Dispatch()</ExceptionDetails></FatalServiceError>

    I include the latest detailed logs from C:\Program Files\Microsoft System Center 2012 R2\DPM\DPM\Temp

    ProcessId

    ThreadId

    Date

    Time

    ComponentCode

    FileName(FileLine)

    This

    TaskId

    Level

    TraceMessage

    24A0

    3708

    Jul 21

    00:10:29.824

    9

    serviceutils.cpp(617)

    WARNING

    Caught Dls exception:   Microsoft.Internal.EnterpriseStorage.Dls.Utils.DlsException: Exception of   type 'Microsoft.Internal.EnterpriseStorage.Dls.Utils.DlsException' was   thrown.

    24A0

    3708

    Jul 21

    00:10:29.824

    9

    serviceutils.cpp(617)

    WARNING

         at   Microsoft.Internal.EnterpriseStorage.Dls.Utils.ApplicationSettings.GetCBPVersion(String&   version)

    24A0

    3708

    Jul 21

    00:10:29.824

    9

    serviceutils.cpp(617)

    WARNING

         at Microsoft.Internal.EnterpriseStorage.Dls.Engine.CIntentServices.GetCBPVersionOnDPMServer(UInt16**   cbpVersionBSTR)

    24A0

    3708

    Jul 21

    00:10:29.824

    9

    serviceutils.cpp(617)

    WARNING

         at Microsoft.Internal.EnterpriseStorage.Dls.Engine.CCoreServices.GetCBPVersionOnDPMServer(CCoreServices*   , UInt16** cbpVersionBSTR, tagSAFEARRAY** exceptionResult)

    24A0

    3708

    Jul 21

    00:10:29.824

    9

    serviceutils.cpp(617)

    WARNING

    *** Mojito error was:   CBPNotInstalled; 0; None

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

    Caught unhandled exception :   Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException: StoppedDateTime   is NULL when in a stopped state

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at   Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.DB.Task.GetInstance(Guid   taskID)

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at   Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.TaskExecutor.ExecuteTask(Guid   jobID, Guid taskID, WindowsIdentity runAsIdentity, ExecuteTaskFlags flags,   ExecuteTaskCallback callback)

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at   Microsoft.Internal.EnterpriseStorage.Dls.JobManager.ActiveJob.DispatchRequestCallback(Object   state)

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at   Microsoft.Internal.EnterpriseStorage.Dls.EngineUICommon.DpmThreadPool.Function(Object   state)

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at System.Threading.ExecutionContext.RunInternal(ExecutionContext   executionContext, ContextCallback callback, Object state, Boolean   preserveSyncCtx)

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at System.Threading.ExecutionContext.Run(ExecutionContext   executionContext, ContextCallback callback, Object state, Boolean   preserveSyncCtx)

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(121)

    WARNING

         at System.Threading.ThreadPoolWorkQueue.Dispatch()

    24A0

    25F8

    Jul 21

    00:10:30.215

    1

    TaskExecutor.cs(429)

    2042028D-40FD-4C62-9A95-00B656BBC555

    WARNING

    ExecuteTask(job=d9b7a928-f11a-4583-b140-fcaca19ca8cc)   - ExecuteTask received for finished task, returning

    24A0

    1C78

    Jul 21

    00:10:30.215

    68

    DpmThreadPool.cs(149)

    WARNING

    Attempting to write error to the   event log StoppedDateTime is NULL when in a stopped state

    24A0

    25F8

    Jul 21

    00:10:30.215

    16

    ActiveJob.cs(745)

    WARNING

     Fail: Task '2042028d-40fd-4c62-9a95-00b656bbc555'   failed with error during execution.

    24A0

    2AA0

    Jul 21

    00:10:30.230

    16

    ActiveJob.cs(745)

    WARNING

     Fail: Task   '103f74bc-b581-4d3f-886d-62bb959f3802' failed with error during execution.

    24A0

    1C78

    Jul 21

    00:10:30.230

    68

    DpmThreadPool.cs(170)

    WARNING

    Attempting to invoke Watson

    24A0

    1C78

    Jul 21

    00:10:30.246

    9

    everettexception.cpp(761)

    CRITICAL

    Exception Message = StoppedDateTime   is NULL when in a stopped state of type   Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException, process   will terminate after generating dump

    Thursday, July 21, 2016 9:58 AM

All replies

  • Just as an update on my progress i have restored the DPMDB to before i ran this script

    UPDATE tbl_TE_TaskTrail     
    SET ExecutionState = 3,     
    LastStateName = ‘Failure’,     
    StoppedDateTime = GetUtcDate()     
    WHERE ExecutionState NOT IN (2,3)

    UPDATE tbl_JM_JobTrail     
    SET JobState= ‘Failed’,     
    EndDateTime = GetUtcDate()     
    WHERE jobstate= ‘Execute’ OR jobstate= ‘Retire

    DPM service is now able to start, but i presume i will still face some issues

    Thursday, July 21, 2016 11:38 AM
  • Next progressive updateMy Backup enviroment looks like this

    DPM2012 R2 U10, DL380 with local disk pool for D2D

    StoreOnce VTL connected via iSCSI (Teamed 3x1GB interfaces) 

    MSL LTO5 tape library

    After rolling back my DPMDB to when my problems started I have only one error which I have isolated to the tape backups to the VTL

    Description: DPM encountered a critical error while performing an I/O operation on the tape VTL test-00000002 (HV001MVT) (Barcode - HV001MVT) in Drive Hewlett Packard LTO Ultrium-5 drive (CZ16209301). (ID 30101 Details: The request could not be performed because of an I/O device error (0x8007045D))

    Disk to disk backups are fine and Tape backups direct to the MSL LTO drive are also fine

    Note: Everything was working fine for months before Roll up 10


    • Edited by KTM.Scuba Friday, July 22, 2016 10:14 AM
    Friday, July 22, 2016 10:13 AM
  • Update,

    I was able to get backups working again for a day by increasing the tapes avaialble in the VTL library. 

    However the backup crashed again the following day with this error

    Log Name:      Application
    Source:        MSDPM
    Date:          7/28/2016 10:21:44 PM
    Event ID:      999
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:     
    Description:
    The description for Event ID 999 from source MSDPM cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

    If the event originated on another computer, the display information had to be saved with the event.

    The following information was included with the event:

    An unexpected error caused a failure for process 'msdpm'.  Restart the DPM process 'msdpm'.

    Problem Details:
    <FatalServiceError><__System><ID>19</ID><Seq>18318</Seq><TimeCreated>28.07.2016 20:21:44</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>IndexOutOfRangeException</ExceptionType><ExceptionMessage>Index was outside the bounds of the array.</ExceptionMessage><ExceptionDetails>System.IndexOutOfRangeException: Index was outside the bounds of the array.
       at Microsoft.Internal.EnterpriseStorage.Dls.MMInterface.MMProvideMediaToReadDataset.QueryResourcePreference()
       at Microsoft.Internal.EnterpriseStorage.Dls.MM.LibraryTE.MMCopyLoop.QueryResourcePreference()
       at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.TaskElement.ReserveResources(Dictionary`2 libraryResourcesCache)
       at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.JobElement.ReserveResourcesToOneTask(Dictionary`2 libraryResourcesCache)
       at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.PriorityQueue.ServeTasksInQueue(Message msg, Boolean waitForResources)
       at Microsoft.Internal.EnterpriseStorage.Dls.MMCommonBlocks.ReserveResourcesBlock.DoServeTasksInQueueAndPostCheckMessage(Object msgObject)
       at Microsoft.Internal.EnterpriseStorage.Dls.EngineUICommon.DpmThreadPool.Function(Object state)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
       at System.Threading.ThreadPoolWorkQueue.Dispatch()</ExceptionDetails></FatalServiceError>


    the message resource is present but the message is not found in the string/message table

    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="MSDPM" />
        <EventID Qualifiers="0">999</EventID>
        <Level>2</Level>
        <Task>0</Task>
        <Keywords>0x80000000000000</Keywords>
        <TimeCreated SystemTime="2016-07-28T20:21:44.000000000Z" />
        <EventRecordID>1780740</EventRecordID>
        <Channel>Application</Channel>
        <Computer>  </Computer>
        <Security />
      </System>
      <EventData>
        <Data>An unexpected error caused a failure for process 'msdpm'.  Restart the DPM process 'msdpm'.

    Problem Details:
    &lt;FatalServiceError&gt;&lt;__System&gt;&lt;ID&gt;19&lt;/ID&gt;&lt;Seq&gt;18318&lt;/Seq&gt;&lt;TimeCreated&gt;28.07.2016 20:21:44&lt;/TimeCreated&gt;&lt;Source&gt;DpmThreadPool.cs&lt;/Source&gt;&lt;Line&gt;163&lt;/Line&gt;&lt;HasError&gt;True&lt;/HasError&gt;&lt;/__System&gt;&lt;ExceptionType&gt;IndexOutOfRangeException&lt;/ExceptionType&gt;&lt;ExceptionMessage&gt;Index was outside the bounds of the array.&lt;/ExceptionMessage&gt;&lt;ExceptionDetails&gt;System.IndexOutOfRangeException: Index was outside the bounds of the array.
       at Microsoft.Internal.EnterpriseStorage.Dls.MMInterface.MMProvideMediaToReadDataset.QueryResourcePreference()
       at Microsoft.Internal.EnterpriseStorage.Dls.MM.LibraryTE.MMCopyLoop.QueryResourcePreference()
       at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.TaskElement.ReserveResources(Dictionary`2 libraryResourcesCache)
       at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.JobElement.ReserveResourcesToOneTask(Dictionary`2 libraryResourcesCache)
       at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.PriorityQueue.ServeTasksInQueue(Message msg, Boolean waitForResources)
       at Microsoft.Internal.EnterpriseStorage.Dls.MMCommonBlocks.ReserveResourcesBlock.DoServeTasksInQueueAndPostCheckMessage(Object msgObject)
       at Microsoft.Internal.EnterpriseStorage.Dls.EngineUICommon.DpmThreadPool.Function(Object state)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
       at System.Threading.ThreadPoolWorkQueue.Dispatch()&lt;/ExceptionDetails&gt;&lt;/FatalServiceError&gt;
    </Data>
        <Binary>3C0046006100740061006C0053006500720076006900630065004500720072006F0072003E003C005F005F00530079007300740065006D003E003C00490044003E00310039003C002F00490044003E003C005300650071003E00310038003300310038003C002F005300650071003E003C00540069006D00650043007200650061007400650064003E00320038002E00300037002E0032003000310036002000320030003A00320031003A00340034003C002F00540069006D00650043007200650061007400650064003E003C0053006F0075007200630065003E00440070006D0054006800720065006100640050006F006F006C002E00630073003C002F0053006F0075007200630065003E003C004C0069006E0065003E003100360033003C002F004C0069006E0065003E003C004800610073004500720072006F0072003E0054007200750065003C002F004800610073004500720072006F0072003E003C002F005F005F00530079007300740065006D003E003C0045007800630065007000740069006F006E0054007900700065003E0049006E006400650078004F00750074004F006600520061006E006700650045007800630065007000740069006F006E003C002F0045007800630065007000740069006F006E0054007900700065003E003C0045007800630065007000740069006F006E004D006500730073006100670065003E0049006E00640065007800200077006100730020006F007500740073006900640065002000740068006500200062006F0075006E006400730020006F00660020007400680065002000610072007200610079002E003C002F0045007800630065007000740069006F006E004D006500730073006100670065003E003C0045007800630065007000740069006F006E00440065007400610069006C0073003E00530079007300740065006D002E0049006E006400650078004F00750074004F006600520061006E006700650045007800630065007000740069006F006E003A00200049006E00640065007800200077006100730020006F007500740073006900640065002000740068006500200062006F0075006E006400730020006F00660020007400680065002000610072007200610079002E000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E004D004D0049006E0074006500720066006100630065002E004D004D00500072006F0076006900640065004D00650064006900610054006F00520065006100640044006100740061007300650074002E00510075006500720079005200650073006F00750072006300650050007200650066006500720065006E0063006500280029000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E004D004D002E004C00690062007200610072007900540045002E004D004D0043006F00700079004C006F006F0070002E00510075006500720079005200650073006F00750072006300650050007200650066006500720065006E0063006500280029000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E004D004D00440061007400610062006100730065002E005400610073006B0045006C0065006D0065006E0074002E0052006500730065007200760065005200650073006F00750072006300650073002800440069006300740069006F006E006100720079006000320020006C006900620072006100720079005200650073006F00750072006300650073004300610063006800650029000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E004D004D00440061007400610062006100730065002E004A006F00620045006C0065006D0065006E0074002E0052006500730065007200760065005200650073006F007500720063006500730054006F004F006E0065005400610073006B002800440069006300740069006F006E006100720079006000320020006C006900620072006100720079005200650073006F00750072006300650073004300610063006800650029000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E004D004D00440061007400610062006100730065002E005000720069006F007200690074007900510075006500750065002E00530065007200760065005400610073006B00730049006E005100750065007500650028004D0065007300730061006700650020006D00730067002C00200042006F006F006C00650061006E002000770061006900740046006F0072005200650073006F007500720063006500730029000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E004D004D0043006F006D006D006F006E0042006C006F0063006B0073002E0052006500730065007200760065005200650073006F007500720063006500730042006C006F0063006B002E0044006F00530065007200760065005400610073006B00730049006E005100750065007500650041006E00640050006F007300740043006800650063006B004D0065007300730061006700650028004F0062006A0065006300740020006D00730067004F0062006A0065006300740029000D000A002000200020006100740020004D006900630072006F0073006F00660074002E0049006E007400650072006E0061006C002E0045006E0074006500720070007200690073006500530074006F0072006100670065002E0044006C0073002E0045006E00670069006E0065005500490043006F006D006D006F006E002E00440070006D0054006800720065006100640050006F006F006C002E00460075006E006300740069006F006E0028004F0062006A0065006300740020007300740061007400650029000D000A00200020002000610074002000530079007300740065006D002E0054006800720065006100640069006E0067002E0045007800650063007500740069006F006E0043006F006E0074006500780074002E00520075006E0049006E007400650072006E0061006C00280045007800650063007500740069006F006E0043006F006E007400650078007400200065007800650063007500740069006F006E0043006F006E0074006500780074002C00200043006F006E007400650078007400430061006C006C006200610063006B002000630061006C006C006200610063006B002C0020004F0062006A006500630074002000730074006100740065002C00200042006F006F006C00650061006E00200070007200650073006500720076006500530079006E00630043007400780029000D000A00200020002000610074002000530079007300740065006D002E0054006800720065006100640069006E0067002E0045007800650063007500740069006F006E0043006F006E0074006500780074002E00520075006E00280045007800650063007500740069006F006E0043006F006E007400650078007400200065007800650063007500740069006F006E0043006F006E0074006500780074002C00200043006F006E007400650078007400430061006C006C006200610063006B002000630061006C006C006200610063006B002C0020004F0062006A006500630074002000730074006100740065002C00200042006F006F006C00650061006E00200070007200650073006500720076006500530079006E00630043007400780029000D000A00200020002000610074002000530079007300740065006D002E0054006800720065006100640069006E0067002E0051007500650075006500550073006500720057006F0072006B004900740065006D00430061006C006C006200610063006B002E00530079007300740065006D002E0054006800720065006100640069006E0067002E00490054006800720065006100640050006F006F006C0057006F0072006B004900740065006D002E00450078006500630075007400650057006F0072006B004900740065006D00280029000D000A00200020002000610074002000530079007300740065006D002E0054006800720065006100640069006E0067002E0054006800720065006100640050006F006F006C0057006F0072006B00510075006500750065002E0044006900730070006100740063006800280029003C002F0045007800630065007000740069006F006E00440065007400610069006C0073003E003C002F0046006100740061006C0053006500720076006900630065004500720072006F0072003E00</Binary>
      </EventData>
    </Event>

    Monday, August 1, 2016 9:00 AM