DPM 2016 UR4 Windows Backup encountered an error when accessing the remote shared folder

  • General discussion

  • Hi,

    I'm running into this error with around 50% of my 65 servers during a system state/bare metal recovery (BMR) backup. The volumes back up fine. The error happens roughly 10 minutes into a system state/BMR backup:

    Error code '0x8078015B' (Windows Backup encountered an error when accessing the remote shared folder. Please retry the operation after making sure that the remote shared folder is available and accessible.)

    I'm also running into semaphore timeout errors, which are probably related.

    We are running DPM 2016 UR4 with SQL 2016 SP1 on Windows Server 2016. It has a 52TB RAID60 data drive formatted with ReFS. We run all backups to the disk, no tape.

    I have the recovery points for the protection groups spread out during the day to reduce load on the server. The groups are set to sync every hour, with a 10 minute offset between them.

    We are backing up Server 2008/2012/2016, and I noticed that no 2008 servers are having errors.

    I'm simply at a loss at this point. I've tried every solution I could find on TechNet and Google, for both this error and the semaphore error.

    Has anyone else run into this issue and resolved it? On a side note, I set up an almost identical server for one of our clients, and they have no issues with their DPM server. But they are also backing up maybe 30 servers.


    Tuesday, December 12, 2017 5:36 PM

All replies

  • I have run into it, but have not resolved or escalated it. It looks like a bug with DPM 2016 UR and BMR backups.

    We only took a BMR backup from one physical domain controller, and I have removed this protection for now. We have backups from other domain controllers (VMs on Hyper-V), so I can do without BMR for this server.

    Wednesday, December 13, 2017 6:38 PM
  • I have had this problem since I switched to DPM 2016 at the UR1 release.  I have also spent much money on this issue, not only in my personal time, but in premier support case hours to get this solved (I had a premier support case open virtually a year on this).  I have had NO success from Microsoft on this problem.  They keep saying they have not seen this problem.  I firmly believe Microsoft believes there is no problem here.  I guess we are just imagining this issue.

    I found for me it got much better once I changed my DPM server from a VM with VHDX files for the data storage to a physical server with direct access to the volumes.  I did this on my own, not at Microsoft's suggestion.  However, over the last few weeks with UR4, it has seemed to start becoming a problem again.  Not as bad as before when I NEVER could get a BMR to work for any server.  Today I have to manually restart the BMR backups for some systems each day, and usually after several attempts I can get it to work.  To me this is still a major problem with DPM 2016 even at UR4.

    I would like DPM to be a system that I do not have to worry about on a daily basis.  Unfortunately, that is not the case any longer.

    This was not a problem in DPM 2012 R2 (or earlier versions).


    • Edited by simdoc Monday, February 5, 2018 5:43 PM
    Monday, February 5, 2018 5:42 PM
  • After fighting with this issue for the last few weeks, it would seem that a simple setting improves the situation quite significantly. It was suggested on one of the other threads that the issue is related to ReFS and some of the recent windows patches included tunable parameters.

    https://support.microsoft.com/en-us/help/4035951/refs-volume-using-dpm-becomes-unresponsive-on-windows-server-2016

    Changing the timeout value (option 4) from 60s to 180s has resolved the issue on our server. Remember to reboot the DPM server after changing the value.

    Friday, February 9, 2018 5:31 AM
  • gslavov222:

    Did you do anything else?  That is not working for me.  I assume you set this on the DPM server, not the protected system?  Did you make any of the other indicated changes or anything in this document:

    https://support.microsoft.com/en-us/help/4016173/fix-heavy-memory-usage-in-refs-on-windows-server-2016-and-windows-10

    Thanks.

    Friday, February 16, 2018 2:16 PM
  • This is also happening to us on our primary server. All of a sudden, numerous (not all) system state jobs fail. The secondary server is fine as of right now.

    It complains about numerous things: the remote share is inaccessible, there is not enough space for the replica. Both are false. So, like others here, I suspect it is ultimately a bug that something triggers, and only for certain VMs.


    Friday, February 16, 2018 9:17 PM
  • The secondary is a whole different concept.  It doesn't use Windows Server Backup.  Too much silence from Microsoft on this issue in my opinion.

    Friday, February 16, 2018 10:18 PM
  • No, I had a look at the rest of the settings from that link but none looked like they would make a difference. The only setting that I changed was the timeout. That said, I did add another 64 GB of RAM to that server, but it doesn't seem to matter, as its memory usage has remained roughly the same (and the memory was added prior to changing the timeout value and didn't seem to make a difference).

    Tuesday, February 20, 2018 5:55 AM
  • I have a solution that sometimes works.

    So this is something to try. It seemingly has nothing to do with the error, but try it and bear with me. On the protected server, right-click the C:\ drive and select Configure Shadow Copies. You'll notice you have a funky volume (the recovery volume) in the list.

    Select it and then hit the Settings button. Use the Schedule button to set the schedule you want. But the important part is to set which drive it stores the shadow copies on. I set it to C:.

    Set a size limit you want, but give yourself breathing room. The backup process needs to image the partition and then copy it to the DPM server, so I would recommend allocating at least 100% of the total partition size. The partition is small, so this shouldn't be a problem.

    Once you do that, manually create a shadow copy.

    Then retry the BMR backup job and see if it completes successfully. Rinse and repeat on the other servers giving you grief.




    • Edited by ZackInMass Monday, September 10, 2018 4:08 PM
    Monday, September 10, 2018 4:04 PM
  • I am having this issue and just found the UR6. A few things caught my eye. 

    https://support.microsoft.com/en-nz/help/4456327/update-rollup-6-for-system-center-2016-data-protection-manager

    • Data Protection Manager System State and Bare Metal Recovery (BMR) fails intermittently for multiple servers.
    • Indefinite looping occurs for consistency check jobs for Resilient Change Tracking (RCT) on a virtual machine (VM) because of I/O errors. 

    Installing now. Hopefully this fixes the issues. 


    • Edited by JustusIV Tuesday, October 23, 2018 8:21 PM
    Tuesday, October 23, 2018 8:20 PM
  • Hi JustusIV,

    Did UR6 seem to help?

    Monday, October 29, 2018 6:51 PM
  • @david

    It's a bit early for me to really know... however, UR6 wasn't a magic fix. I was still having issues until I deleted and re-created the protection of the servers. Since I did that, I have not had issues.

    Monday, October 29, 2018 7:21 PM
  • Hi, I have the same problem. UR6 didn't help :(
    Monday, November 26, 2018 11:01 AM
  • We also see the same issue. We haven't tried upgrading to UR6 yet, but the workaround in this thread seems to work.

    Temporarily disable WinRE when taking a system image backup

    Note WinRE will be disabled in the backup image. Therefore, you have to enable it after you do a system recovery from this image.
    1. Right-click the Start button, and then click Command Prompt (Admin) to open an elevated Command Prompt window.
    2. Disable WinRE by running the following command:
      reagentc /disable 
    3. Run the system image backup function.
    4. Enable WinRE by running the following command:
      reagentc /enable 


    Jan Egil Ring

    Blog: http://www.powershell.no
    Twitter: http://twitter.com/janegilring


    Wednesday, December 19, 2018 8:17 AM
  • this action did not help :(

    Friday, January 18, 2019 10:10 AM
  • Here is what I did, which worked for one server (I have 3 more to try).

    1. Did a local backup from command prompt:

    wbadmin.exe start backup -allcritical -backuptarget:e:\

    2. Created a share on the DPM disk and did a backup to the disk where all the DPM backups are stored (it failed initially and worked on the second try)

    wbadmin.exe start backup -allcritical -backuptarget:\\server\share

    3. Followed Jan's advice and disabled WinRE (from Command Prompt (Admin)):

    reagentc /disable

    Please tell me if that worked for you

    Tuesday, January 22, 2019 1:04 PM
  • Same problem here. DPM 2016 UR6.

    Server backed up: Windows Server 2016

    BMR fails intermittently.

    Thursday, February 14, 2019 11:44 AM

  • We have recently upgraded much of our Windows estate to Windows Server 2016 and as part of this work we also upgraded our DPM Server to DPM 2016 1801. Since doing so, the BMR/System State backups for Windows 2016 servers frequently fail. The BMR backups on our Windows 2008R2 servers continued to work without any problems. Looking through the backup/event logs we see the following logged when the backups fail:

    0x8078015B  Windows Backup encountered an error when accessing the remote shared folder
    0x80070079 The semaphore timeout period has expired

    I spent quite some time trying to find a solution to this problem, initially without a great deal of success. If I removed the server from the protection group in DPM, cleared the backup history, and then re-added the server, I would find that the BMR backups worked once or twice before failing consistently once again. If I ran Windows Backup on the protected computer, backing up the BMR to local disk, it would also work fine. The same BMR backup to a share on the DPM server also worked consistently. Further digging revealed that the DPM server presents a share to the protected computer for the duration of the backup. Testing revealed that the share was removed from the DPM server before the BMR backup completed, resulting in the backup failing. I attempted various configuration changes to try to resolve the problem, but none of these worked. For completeness, here is what I tried unsuccessfully:

    1. Creating/modifying the following registry entries on both the DPM server and the protected server:
       • HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\SPP\CreateTimeout, set to decimal 3600000 to increase the VSS timeout.
       • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\VSS\Settings\IdleTimeout, set to decimal 3600000 to increase the VSS timeout.
       • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue, changed from decimal 60 to 180.
       • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\TcpMaxDataRetransmissions, set to decimal 5.
    2. Running “wbadmin delete catalog” on the protected server to clear the backup catalog.
    3. Checking for shadow copies on the protected server with the command “vssadmin list shadows” and removing any that were found.
    4. Increasing the paging file on the DPM server from 2 GB to 16 GB.
    5. Attempting to throttle the bandwidth using the bandwidth throttling in DPM.
    6. Modifying the disk allocation for the backup within DPM to 100 GB.
    7. Running chkdsk /f on the protected server.
    8. Modifying the maximum size of storage allocated to volumes for shadow copies to have no limit at all.

    As I said, none of the above helped at all. What has finally led to us being consistently able to run BMR backups reliably on Windows 2016 servers was throttling the bandwidth between the protected server and the DPM server using a QoS profile on the protected server itself. I’ve read somewhere that the throttling built into DPM does not apply to BMR backups. I’m not sure whether that’s true, but trying that did not help us at all. It’s very simple to create a QoS profile in Windows 2016, and it’s not necessary to install any additional software, roles or features. To do so, simply carry out the following steps:

    1. Start “Local Group Policy Editor” by typing gpedit.msc on the protected computer.
    2. Browse to Local Computer Policy > Computer Configuration > Windows Settings > Policy-based QoS.
    3. Right click and select “Create New Policy”.
    4. Provide a name for the policy, uncheck “Specify DSCP Value” and set an outbound throttle rate (I used 100MBps successfully).
    5. Select “All applications” (though if your backup server provides other functions you may wish to restrict this just to the Windows Backup application).
    6. Select “Any source IP address” and enter the IP address of the backup server in the destination field.
    7. Select both the TCP and UDP protocols, and any source/destination port.
    8. The policy takes immediate effect without a requirement to reboot.

    Since applying this policy our BMR backups on Windows 2016 have worked reliably and consistently. Hopefully this is of help to others as well, as this can be quite a frustrating problem!

    Simon Edwins
    Senior IT Security Specialist
    Rothamsted Research

    Thursday, June 6, 2019 2:56 PM
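
One side effect of the throttle described in the reply above is a longer backup window. A rough Python estimate, assuming a hypothetical 40 GB BMR image (the size is an illustration, not a figure from the post) and the 100 MBps cap from step 4; it ignores protocol overhead and compression, and the function name is only illustrative:

```python
def transfer_minutes(image_gb: float, throttle_mbytes_per_s: float) -> float:
    """Estimate minutes needed to copy an image at a capped rate (binary units)."""
    image_mb = image_gb * 1024                   # GB -> MB
    return image_mb / throttle_mbytes_per_s / 60  # seconds -> minutes


# Hypothetical 40 GB image at the 100 MBps cap from the post
print(round(transfer_minutes(40, 100), 1))  # 6.8
# Same image at a much tighter, hypothetical 10 MBps cap
print(round(transfer_minutes(40, 10), 1))   # 68.3
```

If the estimate approaches the interval between scheduled recovery points, a looser cap or a less frequent schedule may be worth considering.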
  • I can't seem to get this throttle policy to actually throttle anything. I set everything to "any" on my Hyper-V host with a 102400 throttle rate but am still hitting gigabit speeds on everything.

    • Edited by JustusIV Friday, June 7, 2019 3:36 PM
    Friday, June 7, 2019 3:34 PM
  • Keep in mind that the throttle policy speed is defined in kilobytes/megabytes per second, not kilobits/megabits. A throttle rate of 102400 KBps is thus 800 Mbit/s (close to gigabit).
    Friday, July 19, 2019 2:10 PM
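
The unit mix-up in the reply above is easy to check with a little arithmetic. A minimal Python sketch, assuming the policy rate is entered in KBps and using binary multiples (1 Mbit = 1024 kbit), which is what the 800 Mbit/s figure implies; the function name is only illustrative:

```python
def kbytes_to_mbit(throttle_kbytes_per_s: float) -> float:
    """Convert a throttle rate in kilobytes/s to megabits/s (binary multiples)."""
    kbit_per_s = throttle_kbytes_per_s * 8  # 8 bits per byte
    return kbit_per_s / 1024                # 1024 kbit per Mbit


print(kbytes_to_mbit(102400))  # 800.0, close to gigabit line rate
print(kbytes_to_mbit(12800))   # 100.0, roughly a 100 Mbit/s cap
```

So a value meant to cap traffic at about 100 Mbit/s would be entered as 12800 KBps, not 102400.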
  • @ebootsma

    Not sure how I missed it, but I did have that set incorrectly and the throttling appears to be working now.

    Now to see if it actually fixes the backups that never work.

    Thanks

    Monday, July 22, 2019 2:56 PM
  • Did you have any traction with this?

    We are having the same issue in DPM2019. I was going to roll back to DPM2016; however, this thread suggests that that's not going to fix any problems for me, basically leaving us all in a scenario where DPM in its current state is not a fit-for-purpose backup tool for our Server 2016 workloads?

    From what I understand DPM1801 seemed to work reasonably well for us before we upgraded to 2019 as a result of 1801 going End of Support. 

    I had found an article suggesting that DPM2019 UR1 would fix these issues which is due out "Q3 2019" - basically any time now, but I can't find anything recent on this. 

    Cheers,

    Daniel. 

    Thursday, August 8, 2019 4:03 AM
  • This problem being fixed in the update?  I wouldn't count on it.  It's been giving me problems since the earliest releases of DPM 2016 with little acknowledgement that the problem actually exists by Microsoft.  Can you point to anything that indicates to the contrary?  This has been going on for years now.

    I'm using DPM 2019 as well with this issue still present.

    Thursday, August 8, 2019 2:03 PM
  • What I've discovered regarding this issue is that it's due to the in-memory write cache hitting its limit and then flushing to disk.  While it flushes, all network IO stops for that BMR, and if it takes longer than the standard SMB timeout of 45-60 seconds to flush it all to disk, Windows Backup (BMR) will fail, which results in this error.

    So yes, the "fix" referenced here is the current workaround:

    https://social.technet.microsoft.com/Forums/en-US/home?forum=dataprotectionmanager&announcementId=2f4956d6-bc6c-4618-b8cf-682b9c0cb350

    Add the registry key below on the protected server

    reg add HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters /v SessTimeout  /t REG_DWORD /d 1200

    Note:  The 1200 second setting is a 20 minute timeout.  This is not a rigid number but has been tested with success.  Depending on latency you can adjust as needed.  To gauge the suspected latency, review the ReFS event logs on the DPM\MABS server.

    Then stop and start the LanmanWorkstation service.

    Increasing that timeout on the servers being backed up to, say, 5 to 20 minutes is the only current option other than faster disks.  A third option would be disabling caching (it's already disabled on the ReFS volume, but not on the VHDX that DPM mounts), but as the VHDX files are unmounted/remounted by DPM, DPM would have to mount them without enabling cache... which may result in slower/faster backups or other issues.

    Tuesday, August 13, 2019 7:11 PM
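
The failure condition described in the reply above can be modeled in a few lines. A minimal Python sketch, assuming the simplified rule that the backup fails whenever a single write stall lasts at least as long as the SMB session timeout; the 150 second stall is a hypothetical value, not a measurement from the thread:

```python
DEFAULT_SMB_TIMEOUT_S = 60    # upper end of the 45-60 s range quoted above
RAISED_SESS_TIMEOUT_S = 1200  # SessTimeout value from the reg add command


def backup_survives_stall(stall_s: float, sess_timeout_s: float) -> bool:
    """True if a write stall is shorter than the SMB session timeout."""
    return stall_s < sess_timeout_s


stall = 150  # hypothetical cache flush lasting 2.5 minutes
print(backup_survives_stall(stall, DEFAULT_SMB_TIMEOUT_S))  # False
print(backup_survives_stall(stall, RAISED_SESS_TIMEOUT_S))  # True
print(RAISED_SESS_TIMEOUT_S / 60)                           # 20.0 (minutes)
```

Under this model the timeout only needs to exceed the longest flush seen on the DPM server, which is why the reply suggests checking the ReFS event logs before picking a value.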
  • These keys were added to the registry a few months ago, but they did not help much. The problem is not solved.
    • Edited by Aleks__ Wednesday, August 14, 2019 10:49 AM
    Wednesday, August 14, 2019 10:48 AM
  • This may be worth a look. Matches the symptoms EXACTLY!

    https://support.microsoft.com/en-us/help/4512842/bmr-backup-fails-in-data-protection-manager-or-azure-backup-server


    • Edited by ZackInMass Wednesday, August 21, 2019 4:19 PM
    Wednesday, August 21, 2019 4:18 PM
  • It does not help. These keys were added to the registry a few months ago, but they did not help much. The problem is not solved.
    Tuesday, August 27, 2019 7:45 AM