Consistency Check on-going for 177 hours, why?

  • Question

  • Currently, a protected file server in my company's infrastructure has been running a Consistency Check (CC) for 177 hours*.  During this time I am not able to create any Recovery Points (RPs).  This is unacceptable to my client base, who require twice-daily RPs.  I have two questions: (1) can I solve the problem of the long CC**, and (2) is there some alternative way to get RPs while the CC is running?  Some in my organization suggest switching to different backup software such as Veeam; however, our CTO would like to stay with a Microsoft-based product.

    Here are some details of my configuration:

    1. Protected Server.  The replica size is 1.7 TB, Windows Server 2008 R2 SP1, 8 GB memory, Intel Xeon 2.67 GHz

    2. DPM Server.  DPM 2012 SP1 on Windows Server 2008 R2 SP1, Intel Xeon 2 GHz - 8 core (16 CPUs), 64 GB memory, physical server, 74 TB total capacity on DPM server, network link speed 1 Gbps, no throttling enabled.  Endpoint Protection (EPP) is installed on the server, but DPMRA.exe, CSC.exe and SQLServr.exe are excluded processes.

    I'm confused :-(  Any help would be greatly appreciated.  Thank you.

    - Jenna

    * I read the other forum items regarding long CCs.  They indicated that this is a known issue with DPM.

    ** The following link indicated that a Replica Creation (RC) for 500 GB on a 1 Gbps link should take less than 1.5 hours and that a CC should have less network impact than an RC. http://technet.microsoft.com/en-us/library/ff399619.aspx

    Tuesday, December 3, 2013 9:03 PM

All replies

  • Currently running a Consistency Check for 177 hours. 

    That does seem a little excessive.

    You can check whether it is progressing or simply stalled.  The first place to look is in Monitoring, on the "All jobs in progress" tab: find the job and check its Transferred Data column to see whether any data at all (measured in MB) has been transferred as part of the CC.  If it still says zero after 177 hours, that should clearly raise suspicions, so read on.

    Provided the DPM server OS is 2008 R2 or newer, fire up Resource Monitor and watch the Disk tab.  You should see files from the replica volume being read at a reasonable speed.  (The read speed is expected to be the lesser of two limits: the fastest the protected server can read those files, and the fastest the DPM server can read them.)  On modern hardware you should be seeing over 100 MB/s (on average hardware about 50 MB/s, and on crummy hardware 20 MB/s).

    If no files are being read at all, then you've got a problem that needs investigating.

    Doing the maths on your figures: reading 1.7 TB in 177 hours, assuming it had actually finished, means one of the servers was only able to read the data at about 2.8 megabytes per second, which is pathologically slow.  I would certainly consider turning off the real-time protection feature of the antivirus product to see if this makes a difference.  Otherwise you've got some kind of storage performance problem.

    (Or, if no files are being read, some kind of DPM software problem.)

    At 20 MB/s, 1.7 TB should complete in roughly 24 hours.
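    For anyone who wants to redo this arithmetic with their own figures, here is a minimal Python sketch.  It assumes binary units (1 TB = 1024 x 1024 MB), which is what reproduces the 2.8 MB/s figure above; the function names are made up for illustration and have nothing to do with DPM itself.

```python
# Rough throughput arithmetic for a consistency check.
# Assumption: binary units, 1 TB = 1024 * 1024 MB.
MB_PER_TB = 1024 * 1024
SECONDS_PER_HOUR = 3600


def observed_rate_mb_s(size_tb, elapsed_hours):
    """Average read rate implied by reading size_tb in elapsed_hours."""
    return size_tb * MB_PER_TB / (elapsed_hours * SECONDS_PER_HOUR)


def hours_to_read(size_tb, rate_mb_s):
    """Hours needed to read size_tb at a sustained rate_mb_s."""
    return size_tb * MB_PER_TB / rate_mb_s / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Figures from this thread: 1.7 TB replica, 177 hours elapsed.
    print(f"implied rate: {observed_rate_mb_s(1.7, 177):.1f} MB/s")
    # The three hardware tiers mentioned above.
    for rate in (20, 50, 100):
        print(f"at {rate} MB/s: {hours_to_read(1.7, rate):.1f} hours")
```

    With decimal units instead, the 20 MB/s case comes out slightly under 24 hours rather than slightly over, so treat these as ballpark numbers.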
    • Edited by The Fellenator Tuesday, December 3, 2013 10:39 PM provided 24 hour estimate
    Tuesday, December 3, 2013 10:29 PM
  • Thank you for your reply and suggestions.

    I used Task Manager and Process Monitor (Sysinternals) to track the performance problem down to Endpoint Protection's (EPP) real-time scanning.  When I disabled the real-time scanning, the Consistency Check (CC) ran 5 times faster. 

    When I re-enabled real-time scanning but changed the EPP real-time protection setting "Monitor file and program activity on your computer" from "Monitor all files" to "Monitor only incoming files", the CC ran 3.6 times faster.
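    To put those speedups in terms of the twice-daily RP requirement, here is a rough Python sketch.  It assumes "N times faster" means N times the throughput, and it treats the 177 hours as the time for a full pass; since that CC had not finished, the projected figures are lower bounds at best.  The names are hypothetical and only for illustration.

```python
# Projected CC duration after a throughput speedup.
# Assumptions: baseline_hours is the time for one full pass, and
# "N times faster" means N times the throughput.


def projected_hours(baseline_hours, speedup):
    """Duration of the same job if throughput is multiplied by speedup."""
    return baseline_hours / speedup


def fits_interval(duration_hours, interval_hours):
    """True if the job would finish within one interval between RPs."""
    return duration_hours <= interval_hours


if __name__ == "__main__":
    baseline = 177      # hours elapsed on the original CC
    rp_interval = 12    # twice-daily recovery points
    for label, speedup in (("real-time scanning off", 5.0),
                           ("incoming files only", 3.6)):
        hours = projected_hours(baseline, speedup)
        print(f"{label}: ~{hours:.1f} h, "
              f"fits 12 h window: {fits_interval(hours, rp_interval)}")
```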

    Would anyone know if I would be opening my network up to any problems by making this change to EPP?

    Friday, December 13, 2013 7:04 PM