DPM 2012 R2 crashing when trying to catalog tape RRS feed

  • Question

  • Hi,

    We are trying to restore data from tapes taken in Jan 2016. For some reason DPM consistently crashes when trying to either catalog a tape or mark it as free. The issue seems very specific to individual tapes. A verbose log of the crash is below, seems to be something to do with the MediaPool value being all zeros and an expectation that 'dbo.prc_MM_MediaPool_GetOnMediaPoolId' will return a value (it returns no rows when checked manually with the params shown in the log). DPM version is 4.2.1553.0 Rollup Update 11.

    Has anyone come across a fix for this?

    1CCC 14EC 12/22 07:04:15.787 28 MediaPool.cs(350)   NORMAL Found PGSetId 0aab762d-dd8e-494c-bbf5-a3a995080154 for MediaPoolId 6af0907a-cea7-4d0d-8b4c-f562e03a41fc
    1CCC 14EC 12/22 07:04:15.787 28 MediaFactory.cs(794)   VERBOSE 3 Media in List of media for Dataset 349bf1c7-b156-4a59-8fe0-f982c7991099 read from DB
    1CCC 14EC 12/22 07:04:15.787 28 MediaFactory.cs(1330)   VERBOSE Executing SQL script: dbo.prc_MM_Media_GetOnMediaId
    1CCC 14EC 12/22 07:04:15.787 28 MediaFactory.cs(1455)   VERBOSE ArchiveMedia 96bb5440-ecf6-4c37-ba6d-e624d750810a read from DB
    1CCC 14EC 12/22 07:04:15.787 68 LibraryManagementUtils.cs(141)   VERBOSE TapeWritePeriodRatio = 0.15
    1CCC 14EC 12/22 07:04:15.787 28 Dataset.cs(1114)   VERBOSE 2 Datasets in List of Datasets for media 96bb5440-ecf6-4c37-ba6d-e624d750810a read from DB
    1CCC 14EC 12/22 07:04:15.787 28 MediaPool.cs(329)   WARNING MediaPool 00000000-0000-0000-0000-000000000000 does not exist.
    1CCC 14EC 12/22 07:04:15.803 28 MMCatalog.cs(11198)   VERBOSE released GlobalDB Lock
    1CCC 14EC 12/22 07:04:15.803 09 Catalog.cs(1433)   VERBOSE ValidateReturn for dbo.prc_MM_MediaPool_GetOnMediaPoolId is 50001
    1CCC 14EC 12/22 07:04:15.803 09 Catalog.cs(1435)   WARNING SqlRetryCommand diag details - SqlCommandText  => Name=dbo.prc_MM_MediaPool_GetOnMediaPoolId, CommandType=StoredProcedure
    1CCC 14EC 12/22 07:04:15.803 09 Catalog.cs(1435)   WARNING CommandDiagInfo => CanRetry=True, CommandTimeout=3600
    1CCC 14EC 12/22 07:04:15.803 09 Catalog.cs(1435)   WARNING CommandParams   => Count=2, InTx=True
    1CCC 14EC 12/22 07:04:15.803 09 Catalog.cs(1435)   WARNING      Param[0]   => ParameterName=@MediaPoolId | Value=00000000-0000-0000-0000-000000000000 | Size=0 | DbType=UniqueIdentifier | Direction=Input | IsNullable=False
    1CCC 14EC 12/22 07:04:15.803 09 Catalog.cs(1435)   WARNING      Param[1]   => ParameterName=@RETURN_VALUE | Value=50001 | Size=0 | DbType=Int | Direction=ReturnValue | IsNullable=False
    1CCC 14EC 12/22 07:04:15.803 28 MMCatalog.cs(11198)   VERBOSE released GlobalDB Lock
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING Caught unhandled exception : Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException: DLS error:  SP='dbo.prc_MM_MediaPool_GetOnMediaPoolId' Error=50001
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlRetryCommand.ValidateReturnValue(String errorTitle)
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlContext.Close(Boolean disallowReuseConnection)
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlContext.Close()
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.PRMCatalog.Replica.DbContext.Close()
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.MMDatabase.MMCatalog.MoveMediaToNonAdminPool(Guid mediaId, MediaPoolType mediaPoolType)
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.MMInterface.MediaManager.MoveToNonAdminPool(Guid mediaId, UInt32 mediaPoolType)
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.Engine.CMediaManagerServices.MoveToNonAdminPool(_GUID mediaId, UInt32 mediaPoolType)
    1CCC 14EC 12/22 07:04:15.803 68 DpmThreadPool.cs(121)   WARNING    at Microsoft.Internal.EnterpriseStorage.Dls.Engine.CCoreServices.MoveToNonAdminPool(CCoreServices* , _GUID mediaId, UInt32 mediaPoolType, tagSAFEARRAY** exceptionResult)
    1CCC 14EC 12/22 07:04:15.818 68 DpmThreadPool.cs(149)   WARNING Attempting to write error to the event log DLS error:  SP='dbo.prc_MM_MediaPool_GetOnMediaPoolId' Error=50001
    1CCC 14EC 12/22 07:04:15.818 02 EventManager.cs(143)   NORMAL Logging event to NT log from DpmThreadPool.cs(163): FatalServiceError
    1CCC 14EC 12/22 07:04:15.834 68 DpmThreadPool.cs(170)   WARNING Attempting to invoke Watson
    1CCC 14EC 12/22 07:04:15.849 09 everettexception.cpp(761)   CRITICAL Exception Message = DLS error:  SP='dbo.prc_MM_MediaPool_GetOnMediaPoolId' Error=50001 of type Microsoft.Internal.EnterpriseStorage.Dls.DB.DBCorruptionException, process will terminate after generating dump

    Thursday, December 22, 2016 9:15 AM

All replies

  • Hi Mrphil45

    Did you find a solution to this? I have a similar issue but mine crashes the agent when it tries to do a tape backup for a specific protection group. Same debug error though... 



    Monday, January 9, 2017 7:23 AM
  • Me too. Any solution found?

    Thursday, February 2, 2017 3:24 PM
  • Me too. Any solution found?

    Hi MYTokyo

    Are you using co-location on your tape backups? After running all my protection group tape backups on separate days (they run long) to try an isolate the issue, I was able to determine that it was 1 out of the 5 causing my agent to terminate.

    After trying multiple things I took only that suspect protection group out of my co-location group, agent then stopped terminating. Not ideal, as I have a 4 drive tape library so co-location saves me tapes but at least the backups are working again without the damned agent terminating.

    Hopefully MS will be able to provide a fix for this some day. We are switching our clients from DPM to Veeam as it seems a lot more stable and less buggy.

    Friday, February 3, 2017 5:31 AM
  • Mine is the failed backup due to the virtual tapes shortage. The tapes became "Expired" but DPM crashes with the same error message SP='dbo.prc_MM_MediaPool_GetOnMediaPoolId' Error=50001 when trying to make a tape as FREE. I have tried ForceFree.ps1 too but the same result. I don't want to take the failed protection from the group as it contains other old backup data. Wondered If I could just delete the virtual tape files...
    Friday, February 3, 2017 6:55 AM
  • No solution found yet, seems to be linked to specific tapes as no issues with cataloging other tapes. Also, the same tapes work fine on another install of DPM.
    Saturday, February 4, 2017 11:03 AM
  • @Mrphil45 have you tried a different bar code label on the suspect tapes? If so, might be worth manually removing the suspect tape from the SQL DB...

    @MYTokyo, sounds like a DB corruption... Would suggest you run a SQL trace on the DPM Instance and then perform the same action in the DPM console. This will allow you to see the transact statements performed by the DPM stored procedures that gave the "Error" result. Using this information, you might be able to run an UPDATE statement to "re-associate" the tape with the correct MediaPool GUID.

    I have been running dual DPM 2012 R2 deployment for 4 years now. Have had multiple issues with it, especially the Secondary DPM server in my DR site aka also my tape backups server. Annoying issues like tapes going "suspect" for no reason and then having to hack the database for DPM to use them again... DPMRA and DPMLA agents crashing endlessly for the most obscure reasons...

    Monday, February 6, 2017 6:07 AM
  • Great. I will give a go. Many Thanks
    Monday, February 6, 2017 11:55 AM