DPM 2012 SP1 - DPM Encountered a retryable VSS error (0x800423F3)

  • Question

  • Since upgrading to DPM 2012 SP1, we have been experiencing a high rate of recovery point failures when backing up our Exchange 2010 SP2 RU5-v2 DAG to disk. These backups run against a replica DB, not the active DB. The error description from DPM is:

    Description: Recovery point creation jobs for Exchange Mailbox Database <DB1> on <DAG node> have been failing. The number of failed recovery point creation jobs = 3. If the data source protected has some dependent data sources (like a SharePoint Farm), then click on the Error Details to view the list of dependent data sources for which recovery point creation failed. (ID 3114)

    DPM encountered a retryable VSS error. (ID 30112 Details: VssError:The writer experienced a transient error. If the backup process is retried, the error may not reoccur. (0x800423F3))

    Recommended action: Check the Application Event Log on <DAG node> for the cause of the failure. Fix the cause and retry the operation. For more information on this error, go to http://go.microsoft.com/fwlink/?LinkId=132612.

    Create a recovery point...

    On the DAG node, the corresponding event is:

    The Microsoft Exchange Replication service VSS Writer (Instance c019906b-b192-4f9e-8fa2-85376f166961) failed with error FFFFFFFC when processing the backup completion event.

    Overall, the DPM server also seems very sluggish since the upgrade. We periodically receive SCOM heartbeat failure alerts, and when we try to log in while those alerts are firing, the server is clearly bogged down. System resources look normal, though: CPU is low and memory is OK. I am looking for any help identifying the possible cause, or steps for troubleshooting this at a low level.
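
For readers trying to interpret the two codes quoted above, here is a minimal Python sketch that splits an HRESULT such as 0x800423F3 into its standard fields and shows the Exchange writer's FFFFFFFC as a signed 32-bit value. The function names are hypothetical. The symbolic names in the lookup table are the VSS writer error constants as I recall them from the Windows SDK headers, so treat them as an assumption and check them against the SDK. The sketch does not try to say what the Exchange error means.

```python
# Illustrative helpers (hypothetical names) for reading the codes in this thread.

# VSS writer error constants, assumed from the Windows SDK VSS headers.
VSS_WRITER_ERRORS = {
    0x800423F0: "VSS_E_WRITERERROR_INCONSISTENTSNAPSHOT",
    0x800423F1: "VSS_E_WRITERERROR_OUTOFRESOURCES",
    0x800423F2: "VSS_E_WRITERERROR_TIMEOUT",
    0x800423F3: "VSS_E_WRITERERROR_RETRYABLE",
    0x800423F4: "VSS_E_WRITERERROR_NONRETRYABLE",
}


def decode_hresult(value):
    """Split a 32-bit HRESULT into severity, facility, and code fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),        # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: facility
        "code": value & 0xFFFF,              # bits 0-15: error code
        "name": VSS_WRITER_ERRORS.get(value, "unknown"),
    }


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    print(decode_hresult(0x800423F3))
    print(to_signed32(0xFFFFFFFC))
```

The DPM code decodes as a failure in facility 4 (FACILITY_ITF), which is consistent with an interface-specific VSS writer error, and FFFFFFFC read as a signed value is -4.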

    Friday, January 18, 2013 3:18 PM

Answers

  • Hi,

    There have not been any changes in SP1 to the way DPM 2012 backs up Exchange DAGs, so I am not sure about the timing. The events on the Exchange server do show VSS writer issues, and DPM seems to be the victim of that. Are you certain there were NO other changes on the Exchange server since moving to DPM 2012 SP1? Please work with the Exchange folks in their forum.


    Regards, Mike J. [MSFT]

    Tuesday, January 22, 2013 1:29 AM
  • To close the loop on this, things have been running fine since my last post. MS support really didn't have much to add and agreed with Mike's point of view that there is nothing really in CU1 (at least that is published) that fixed anything related. In any event, I closed the ticket with Microsoft.
    • Marked as answer by finalround Monday, January 28, 2013 12:02 PM
    Monday, January 28, 2013 12:02 PM

All replies

  • Thanks for the reply Mike.

    Prior to upgrading DPM 2012 to SP1, we had applied Exchange 2010 SP2 RU5-V2, but we really didn't begin experiencing issues until the DPM update. After watching DPM from late last week into the weekend, I saw the retryable VSS errors occurring more frequently with other protection groups as well, including file system, SQL and full VM backups. So I still contend this was likely an issue with DPM 2012 SP1, possibly some instability introduced with the new agent. I decided to apply DPM 2012 SP1 CU1 yesterday, and so far things seem to be stable. Since applying CU1 and updating the agents, I have only seen three recovery points fail with VSS transient errors.

    I opened a case with MS support for this, and they had me run some diagnostics prior to the upgrade to CU1. If anything shakes loose as a result of working that case, I will post here. Otherwise I'm hopeful that CU1 was the magic bullet.

    Tuesday, January 22, 2013 12:23 PM