D2D2T backup fails on dedup volume due to corruption on replica (ID 2036 Details: The file or directory is corrupted and unreadable (0x80070570))

  • Question

  • Hi All,

    When I run a tape job for a deduplicated file server, the backup fails with the following error:

    Description: The Operation failed since DPM was unable to access the file/folder \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1505\197d1387-eb5a-4a87-a459-06d757b2d875\Full\DATA\General\Knowles Limited\95864 Primeplace Limited v Knowles Limited\Letters\Thumbs.db on UKDAR1BK011.hillintl.com. This could be due to the file/folder being corrupt or the file system on the volume being corrupt. (ID 2036 Details: The file or directory is corrupted and unreadable (0x80070570))

                  

    The actual file on the file server is accessible. What should the next step be?
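
    For readers unfamiliar with the code in the error, here is a minimal Python sketch of how an HRESULT such as 0x80070570 breaks down into its fields. The function name `decode_hresult` is just an illustrative helper, not part of DPM or Windows. Facility 7 is the Win32 facility, and code 1392 is the Win32 error ERROR_FILE_CORRUPT ("The file or directory is corrupted and unreadable").

```python
def decode_hresult(hresult):
    """Split a 32-bit HRESULT into (severity, facility, code).

    Layout used here: bit 31 = severity, bits 16-26 = facility,
    bits 0-15 = error code.
    """
    severity = (hresult >> 31) & 0x1      # 1 means failure
    facility = (hresult >> 16) & 0x7FF    # 7 is FACILITY_WIN32
    code = hresult & 0xFFFF               # the underlying Win32 error code
    return severity, facility, code


# The value reported in DPM error ID 2036:
print(decode_hresult(0x80070570))  # (1, 7, 1392)
```

    So the HRESULT simply wraps Win32 error 1392, raised when the file is read from the shadow copy of the replica.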

    Thursday, January 7, 2016 5:08 PM

All replies

  • Hello, I'm having the same issue.

    Protected server: two physical Windows 2012 R2 nodes, Dell Compellent SAN, 16 TB clustered deduplicated volume (8K NTFS cluster size), SCEP R2 (version 4.8.204.0), and DPM agent 2012 R2 UR8

    Physical primary DPM server: Windows 2012 R2 with DPM 2012 R2 UR8 and SCEP R2 (version 4.8.204.0)

    The DPMRA.exe process is excluded in SCEP.

    For a long time I backed up to tape without any issue. Since we moved to D2D backup, we have been encountering corruption on the recovery point volume.

    Symptoms: the initial replica creation and subsequent recovery point jobs succeed without error, but restoring a Thumbs.db file from the primary DPM server fails with error 0x80070570. Because of this error I cannot protect the volume on a secondary DPM server (the initial replication fails).

    On the source protected server there is no corruption, and I can access and copy the affected Thumbs.db file.

    Interestingly, from the primary DPM server I can restore the first version of the affected Thumbs.db file (backed up during the initial replication), but I get the error when I try to restore later versions. So it seems to me that the corruption occurred in the replica volume after a sync.

    This corruption occurs not only on some Thumbs.db files but also on other file types such as .xls.

    Our physical DPM server has been replaced by a brand-new standalone one with internal disks, and we still have this issue.

    Any help would be greatly appreciated.


    • Edited by Larry.B.IT Wednesday, January 13, 2016 11:48 PM
    Wednesday, January 13, 2016 9:50 AM
  • I also am having the same issue.

    This issue has existed since support for backing up 2012 R2 deduplicated volumes was introduced.

    It has persisted through every cumulative update of DPM 2012 R2 and is still present in v4.2.1373.

    In my experience the issue isn't specific to Thumbs.db files. It appears to be more generalized file system corruption, possibly caused by incorrect layering of the filter drivers involved or a similar defect.

    Specifically, DPM's own changed block tracking (CBT) filter driver, the dedup filter driver, and the Volume Shadow Copy Service (VSS) copy-on-write filter driver appear to be interacting incorrectly.

    During a consistency check or initial sync, the DPMRA agent reads the deduplicated volume in its deduplicated state and synchronises these files to the replica. At this point the replica is OK, matches the source, and backs up to tape just fine.

    During a subsequent synchronisation, the CBT filter driver enables DPMRA to send only changed blocks from the source server to the replica.

    However, due to a defect, likely in the VSS filter driver, the recovery point volume gets corrupted at this point.

    The subsequent tape backup then fails due to file system corruption.

    It has reached the point where I want to log this with Microsoft, but they'll just tell us to format all disks, reinstall Windows on each of our file servers and DPM servers, and then ask us to reproduce the issue (with no data to back up). We won't be able to do that, and we will have lost all our data by following MS instructions.

    Thursday, February 18, 2016 12:12 AM
    Hello, I agree with you; I have the same hypothesis. I use MS products exclusively on my file and backup servers to avoid conflicts with third-party drivers. In my view, the altitude of the filter drivers should not be the cause, since the altitudes were chosen by MS.

    My further hypothesis is that the issue involves dedup, the DPM filter, and files that users hold open with an exclusive lock.

    Thumbs.db files are database files, like .accdb files. That could explain why the issue hits them so often: they are exclusively locked for longer periods than ordinary files, so there is a greater chance that a volume snapshot is created while such a lock exists.

    One example I sometimes see: Thumbs.db files can be left exclusively locked by users for no apparent reason (a Windows 7 Explorer bug?) until the server detects and removes the lock.

    Thumbs.db and .accdb files are databases driven from user computers, not from the server. There is no VSS writer for them, so when DPM requests a volume snapshot they can only be crash-consistent. Their content may be corrupted, but as I understand it the file itself on the DPM replica should still be accessible and should not show file system corruption (0x80070570). That does seem to hold for the local shadow storage and previous versions of files (enabled locally on the server): I have never had a problem restoring from those.

    Before we started disk backups, we backed up the same protected server to tape only and never had this corruption issue. With tape backup, all data is transferred to DPM in non-optimized format, so changed block tracking is not used.

    To convince MS that there is a bug somewhere between DPM and dedup, it would be interesting to see whether I can reproduce the issue with a file on a very small deduplicated test volume, by holding an exclusive lock on the test file while DPM runs sync jobs.

    Friday, February 19, 2016 8:03 PM
    I wrote a script that simulates clients opening files with exclusive locks, but I did not manage to reproduce the corruption issue in DPM. However, my tests were not done on a clustered volume, so I suspect the bug is also related to the CSVFS layer on top of NTFS.

    Anyway, I opened a case with MS and they answered as follows:

    "Based on researches in our internal databases, it seems to be related to a known issue. We have an internal bug request for Azure backup fails due to corruption on file thumbs.db on a deduped volume."

    So this is rather good news: it seems they have acknowledged that there is a bug. I hope it will be fixed in the next DPM update rollup.

    Wednesday, February 24, 2016 9:25 PM
    After struggling for a long time, I finally found the very specific conditions under which these corruptions occur.

    I had to use a C++ program, which I have given to MS, and they are now able to reproduce the issue.

    They are actively investigating it, but it is quite complex and a resolution will take more time.

    Sunday, June 19, 2016 6:40 PM
  • Got final conclusion from MS:

    [SCR]

    Symptom :

    You protect a file server with DPM 2012 R2. Deduplication is active on the DPM server.

    You have corruption on some protected files such as thumbs.db on DPM server side

    Cause :

    Conflict between the dedup filter and the dpm filter because of the difference in granularity in monitoring the files/bytes changes

    This is a by-design issue. The product groups for both deduplication and DPM are still investigating a long-term solution to resolve this bug.

    Resolution :

    As a workaround, you disabled deduplication on the DPM server and added more storage to host all the required data

    You may also exclude the impacted file type from dedup and optimization using the Set-DedupVolume cmdlet with the -ExcludeFileType parameter

    TechNet – Set-DedupVolume

    https://technet.microsoft.com/en-us/library/hh848438(v=wps.630).aspx

    -ExcludeFileType<String[]>

    Specifies an array of extension types that the deduplication engine excludes from data deduplication and optimization. Specify comma-separated values that are not preceded with a period (.). When you change this setting, you override the existing values.
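
    Because changing -ExcludeFileType overrides the existing values, any new extensions have to be merged with the ones already excluded. Below is a rough Python sketch that normalizes a list of extensions (strips the leading period, removes duplicates) and builds the command line to run. The helper names, the "E:" volume, and the "tmp" existing exclusion are illustrative assumptions, and lowercasing the extensions is only a tidiness choice. Only Set-DedupVolume, -Volume, and -ExcludeFileType come from the cmdlet itself.

```python
def normalize_extensions(extensions):
    """Strip whitespace and leading periods, lowercase, drop duplicates (order kept)."""
    cleaned = []
    for ext in extensions:
        value = ext.strip().lstrip(".").lower()
        if value and value not in cleaned:
            cleaned.append(value)
    return cleaned


def build_exclude_command(volume, existing, additions):
    """Build a Set-DedupVolume command that keeps existing exclusions.

    -ExcludeFileType replaces the current list, so existing and new
    extensions are merged before the command is built.
    """
    merged = normalize_extensions(list(existing) + list(additions))
    return 'Set-DedupVolume -Volume "{}" -ExcludeFileType {}'.format(
        volume, ",".join(merged)
    )


# Hypothetical volume and existing exclusion list, for illustration only.
print(build_exclude_command("E:", ["tmp"], [".db", "ACCDB", "db"]))
# Set-DedupVolume -Volume "E:" -ExcludeFileType tmp,db,accdb
```

    Note that excluding the "db" extension affects every .db file on the volume, not only Thumbs.db.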

    • Proposed as answer by Larry.B.IT Wednesday, October 26, 2016 10:24 AM
    Wednesday, October 26, 2016 10:24 AM
  • Hello,

    Good news: the issue has finally been fixed in UR12!

    From my tests, I can confirm that I can no longer reproduce the issue after installing UR12.

    • Proposed as answer by Larry.B.IT Sunday, April 2, 2017 10:11 PM
    Sunday, April 2, 2017 10:11 PM
  • "From my tests"

    Can anyone else confirm this? Nothing deduplication-related is mentioned in the list of fixed issues for UR12.

    I ask because the issue sometimes appears to be gone for the first backup done after a consistency check, but the backups then start failing again a few days later.

    Monday, April 3, 2017 4:04 AM
  • Hello, 

    I got confirmation from MS that the item "Azure backups and restores fail when the data source contains files such as thumbs.db", mentioned in the list of issues fixed in UR12, should have fixed the issue from the case I opened.

    I think you still have issues because corrupted files already existed on your (deduplicated) DPM replica before you updated to UR12. Unfortunately, from my tests, it seems that the consistency check (CC) does not detect and fix this particular corruption.

    Two solutions: rebuild your DPM replica from scratch (you lose your recovery points)

    or (may not be supported by MS)

    delete only the corrupted files on the DPM replica and then perform a CC. In this situation the files will be rewritten by the CC, fixing the corruption.

    If you're interested, I can provide a PowerShell script to identify all the corrupted files on the DPM replica.
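
    The PowerShell script itself was not posted in this thread. As a stand-in, here is a minimal Python sketch of the same read-only idea, under the assumption that reading each file from start to end is enough to trigger the error on a corrupted file. It walks a folder and records every file whose read fails with Win32 error 1392 (the code behind 0x80070570). The function names and the example path are hypothetical, and the script never writes to the replica.

```python
import os

ERROR_FILE_CORRUPT = 1392  # Win32 error wrapped by HRESULT 0x80070570


def is_corruption_error(exc):
    """True if an OSError carries the 'file or directory is corrupted' code."""
    return getattr(exc, "winerror", None) == ERROR_FILE_CORRUPT


def read_fully(path, chunk_size=1024 * 1024):
    """Read a file from start to end in chunks, discarding the data."""
    with open(path, "rb") as handle:
        while handle.read(chunk_size):
            pass


def find_unreadable_files(root, reader=read_fully):
    """Walk root and return (corrupted, other_errors) as lists of paths.

    'reader' is injectable so the scan logic can be exercised without
    a real corrupted volume.
    """
    corrupted, other_errors = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                reader(path)
            except OSError as exc:
                if is_corruption_error(exc):
                    corrupted.append(path)
                else:
                    other_errors.append(path)
    return corrupted, other_errors


# Example call (hypothetical mount point of the replica volume):
# corrupted, other_errors = find_unreadable_files(r"X:\ReplicaMountPoint")
```

    Files that fail for other reasons (access denied, sharing violations) are kept in a separate list so they are not mistaken for corruption.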


    • Edited by Larry.B.IT Wednesday, April 5, 2017 9:45 PM typo
    Wednesday, April 5, 2017 9:43 PM
  • Or a third solution, to test without altering the DPM replica directly:

    - use the script to identify the corrupted files on the DPM replica (read-only operation)

    - on the protected source server, delete those files (make a copy first if needed), then run a sync or wait for the next one

    Wednesday, April 5, 2017 9:54 PM
  • I can confirm that in my production environment (15 TB of deduplicated data on a disk) I have had no more corruption on the primary DPM replica (optimized backup) since UR12. I am now also able to synchronize my secondary DPM server from the primary DPM replica and run long-term backup jobs to tape without encountering corruption errors.

    This has been fixed since UR12. It is mentioned in the list of UR12 fixes as "Azure backups and restores fail when the data source contains files such as thumbs.db."

    • Proposed as answer by Larry.B.IT Friday, September 15, 2017 6:47 PM
    Friday, September 15, 2017 6:47 PM