DPM Bare Metal/System state fails ideas

    Question

  • Hello,

    Currently we are using DPM version 5.1.378 for doing our backups.

    My problem is that all of my Bare metal/System state backups fail... sometimes they work but most of the time they just fail...

    Is there any configuration that must be done for VSS Writer or wbadmin? Some of these errors I have also seen in previous versions of DPM. I guess BMR on DPM is very unreliable or is it just me?

    If anyone has run into these problems or wants to try to help, I can share details on all of the errors on all of my servers...

    Thank you.


    Friday, March 22, 2019 9:48 AM

All replies

  • Hi Bogdan,

    Backups tend to fail every now and then for various reasons. That is very common, as there is a lot going on in an environment that can cause backups to fail.

    There have been a lot of issues reported with DPM 1807 (5.1.378) when backing up with BMR (Bare Metal Recovery), but no real fixes have been found or provided.

    I know that DPM 2016, which is part of the LTSC (Long-Term Servicing Channel), has had some fixes regarding BMR backups in its latest Update Rollup 6.

    Apart from that, as Microsoft announced that there will not be any more Semi-Annual Channel (SAC) releases for System Center, I would recommend starting to use the LTSC version, either DPM 2016 or preferably DPM 2019.

    In your current situation, you could start by analyzing the DPM log on both the protected server and the DPM server for any more clues.

    If the failures happen often, try to find out whether there is any pattern to when they occur; that could get you closer to finding the root cause.
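
    One way to look for such a pattern is to bucket the failure times by hour of day. The sketch below is only an illustration: the timestamps are made-up placeholders (in practice they would be copied from the DPM job history or the event log), and failures_by_hour is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime


def failures_by_hour(timestamps):
    """Count failures per hour of day from 'YYYY-MM-DD HH:MM' strings."""
    hours = Counter()
    for text in timestamps:
        hours[datetime.strptime(text, "%Y-%m-%d %H:%M").hour] += 1
    # Sort by hour so clusters are easy to spot when printed.
    return dict(sorted(hours.items()))


if __name__ == "__main__":
    # Placeholder failure times, NOT taken from a real environment.
    sample = ["2019-03-20 12:48", "2019-03-21 12:51", "2019-03-22 02:10"]
    for hour, count in failures_by_hour(sample).items():
        print(f"{hour:02d}:00  {count} failure(s)")
```

    If most failures land in the same hour, it is worth checking what else runs on the network or on the servers at that time.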


    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Friday, March 22, 2019 10:28 AM
  • Hello again Leon,

    As you know we have discussed some issues regarding DPM Bare Metal/System state in a previous thread.

    I had DPM 2016 and had the same problems, so I upgraded to 5.1.378 hoping that would help... unfortunately it didn't. I'm afraid that I will install 2019 and have the same problems, plus others on top of that...

    I can share a list of all the errors on both the DPM server and the backed-up servers; maybe you know some of them and have an idea. :)

    Would you recommend upgrading to DPM 2019? Are you using it? I haven't performed a downgrade before... should I try that?

    The failures don't happen often, they happen every day. I have to rerun the backup and pray that it will not fail...

    Friday, March 22, 2019 11:56 AM
  • Yes you can post the errors here and I can try to help you as much as I can!

    I never want to "force" anyone to upgrade a software/hardware unless really needed, but there are some good improvements in DPM 2019 which I think most users would probably want to have.

    You could also consider setting up a test DPM 2019 on the side, and then test the BMR backup/backups to see how it works.


    Friday, March 22, 2019 12:33 PM
  • I'll try installing 2019 in a test environment.

    Till then here is one of the errors. 

    DPM:

    Backed-up Server

    From event viewer:

    Event 517, Backup

    The backup operation that started at '2019-03-22T12:48:57.358472900Z' has failed with following error code '0x8078015B' (Windows Backup encountered an error when accessing the remote shared folder. Please retry the operation after making sure that the remote shared folder is available and accessible.). Please review the event details for a solution, and then rerun the backup operation once the issue is resolved.
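
    When comparing many of these events across servers, it can help to pull the start time and error code out of the message text. The following is a minimal sketch that assumes the message wording shown above; parse_event_517 is a hypothetical helper, not part of DPM or Windows Server Backup. It also strips invisible left-to-right marks, which can come along when event text is copied.

```python
import re
from datetime import datetime, timezone

# Matches the start time and error code in the event 517 wording quoted above.
EVENT_517 = re.compile(
    r"started at '(?P<start>[^']+)' has failed with following error code "
    r"'(?P<code>0x[0-9A-Fa-f]+)'"
)


def parse_event_517(message):
    """Return (start_time_utc, error_code) or None if the text does not match."""
    # Remove U+200E (left-to-right mark) characters before matching.
    cleaned = message.replace("\u200e", "")
    match = EVENT_517.search(cleaned)
    if match is None:
        return None
    stamp = match.group("start").rstrip("Z")
    seconds, _, fraction = stamp.partition(".")
    start = datetime.strptime(seconds, "%Y-%m-%dT%H:%M:%S")
    # The event carries more fractional digits than datetime stores; keep six.
    micro = int(fraction[:6].ljust(6, "0")) if fraction else 0
    start = start.replace(microsecond=micro, tzinfo=timezone.utc)
    return start, int(match.group("code"), 16)
```

    The returned start time is in UTC, so it needs converting before it is compared with local-time job schedules.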

    From wbadmin:

    Windows backup encountered an error when accessing the remote shared folder. Please retry the operation after making sure that the remote shared folder is available and accessible. Detailed error: The semaphore timeout period has expired

    Backup of volume C: has failed. Windows Backup encountered an error when accessing the remote shared folder. Please retry the operation after making sure that the remote shared folder is available and accessible.

    What I have done:

    1. Ran a manual bare metal backup to confirm that wbadmin is working

    2. Checked that the server has rights to the DPM share that is created

    3. Removed the backup and recreated it (the initial backup worked, and after that it started failing)

    4. Increased the allocation size to far more than what was needed (gave it 250 GB)

    5. Gave unlimited space for shadow copies on the drives that were being backed up during the bare metal backup
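
    For anyone wanting to repeat steps 1 and 5, they can be done from an elevated prompt with wbadmin and vssadmin. The sketch below only builds the command lines and does not run anything. The share path and volume are placeholders, the helper names are hypothetical, and these are not necessarily the arguments DPM itself passes to Windows Server Backup.

```python
def wbadmin_bmr_command(target_share):
    """Build a manual bare metal backup command for Windows Server Backup."""
    return [
        "wbadmin", "start", "backup",
        f"-backupTarget:{target_share}",  # network share or volume to write to
        "-allCritical",                   # include every volume needed for BMR
        "-quiet",                         # do not prompt for confirmation
    ]


def vssadmin_unbounded_command(volume):
    """Build a command that removes the shadow copy storage limit for a volume."""
    return [
        "vssadmin", "resize", "shadowstorage",
        f"/for={volume}", f"/on={volume}", "/maxsize=UNBOUNDED",
    ]


if __name__ == "__main__":
    # Placeholder share name; replace with a share the server can write to.
    print(" ".join(wbadmin_bmr_command(r"\\backupserver\bmrtest")))
    print(" ".join(vssadmin_unbounded_command("C:")))
```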

    What I noticed:

    The backups don't fail instantly... they run for a while, transfer some amount of data, and only then fail...


    Friday, March 22, 2019 1:00 PM
  • Any ideas Leon? :)

    Thank you,

    Bogdan

    Thursday, April 4, 2019 2:54 PM
    Sorry for the late answer. I don't have much to offer, I'm afraid.

    You mentioned that the failures don't happen often, which means they happen either randomly or at similar times (some kind of pattern). I would try to monitor the situation to find out what could be causing this.

    What else is going on during the times this specific backup or these backups are running?

    Since it seems WSB (Windows Server Backup) is having issues accessing the share somehow, you could try to manually create the share and check whether DPM can access it.


    Sunday, April 14, 2019 9:49 PM
  • Hello Leon,

    The backups fail every day :) the behavior changed.

    I keep getting the "semaphore timeout period has expired" error. I'm inclined to think that the connection is cut off after some time.

    Have you gotten this error? I checked whether the share exists (is created) when the backup starts, and it does, so I don't think that is the problem. Any ideas?
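
    "The semaphore timeout period has expired" is the message text of Win32 error 121 (ERROR_SEM_TIMEOUT), which usually surfaces when an I/O or network operation times out. One rough way to test the idea that the connection drops partway through is to poll the share from the protected server while a backup runs and note when it stops answering. This is only a sketch: probe_share and first_drop are hypothetical helpers, the share path is a placeholder, and a simple directory check may not catch every kind of stall.

```python
import os
import time
from datetime import datetime, timezone


def probe_share(path, checks, interval_seconds,
                is_reachable=os.path.isdir, sleep=time.sleep):
    """Check `path` repeatedly and return a list of (utc_time, reachable) samples."""
    samples = []
    for attempt in range(checks):
        samples.append((datetime.now(timezone.utc), bool(is_reachable(path))))
        if attempt < checks - 1:
            sleep(interval_seconds)
    return samples


def first_drop(samples):
    """Return the time of the first reachable-to-unreachable transition, if any."""
    for (_, before), (when, after) in zip(samples, samples[1:]):
        if before and not after:
            return when
    return None


if __name__ == "__main__":
    # Placeholder UNC path; use the share that is created for the backup job.
    results = probe_share(r"\\dpmserver\share", checks=5, interval_seconds=2)
    print("first drop:", first_drop(results))
```

    Comparing the first drop time with the time the backup job fails would show whether the two line up.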

    Monday, April 15, 2019 6:19 AM
  • Hi Bogdan,

    I have not received this error myself, but many others in the community have. I've seen many posts about this, although no proper solution has been found yet.

    I'm aware that you're using DPM 1807, which is a Semi-Annual Channel (SAC) release. I would strongly recommend starting to use the Long-Term Servicing Channel (LTSC) (i.e. DPM 2016/2019), as those versions will be able to receive upcoming fixes.

    Update Rollup 7 for DPM 2016 came out this week as well. It didn't have any mention of a fix for this issue, but one could also be "hidden" :-)


    • Edited by Leon Laude Friday, April 26, 2019 7:08 AM
    Friday, April 26, 2019 7:02 AM
  • This software is incredible... 

    How many years has it been out and it still has big issues with it...

    I ended up upgrading last time because of similar issues... It seems like the bugs aren't fixed... I will try moving to Update Rollup 7... if I can downgrade from my current version, that is.

    Thank you Leon.

    Monday, May 6, 2019 6:42 AM
  • Hi Bogdan,

    I would go for the newer version, DPM 2019, although it seems that Update Rollup 7 for DPM 2016 has some fixes that are not yet in DPM 2019.

    DPM 2019 is scheduled to receive its first Update Rollup (UR1) in Q3 2019.


    Tuesday, May 7, 2019 9:24 AM
  • Hello Leon,

    I'm upgrading DPM today and will provide feedback.

    I am following your step by step. 

    Link: https://thesystemcenterblog.com/2019/03/17/upgrading-to-dpm-2019-step-by-step/

    One thing you could add is:

    If the person upgrading also has Azure Backup, its services need to be stopped as well.

    Also, after stopping the specified services, the Microsoft Management Console should be closed or the upgrade won't go forward.

    Hope this helps.



    Wednesday, May 22, 2019 7:14 AM
  • Thanks for the feedback Bogdan, I will add it.

    Let us know how your BMR/System State goes with the new DPM.


    Wednesday, May 22, 2019 7:19 AM
  • Hello Leon,

    Unfortunately I have the same problem.

    Windows backup encountered an error when accessing the remote shared folder. Please retry the operation after making sure that the remote shared folder is available and accessible. The semaphore timeout period has expired.

    Any other ideas?

    Wednesday, May 22, 2019 7:39 AM
  • Do you have Windows Defender, or any other antivirus / firewall enabled?

    If so, could you try to completely disable all of them and then try backing up?

    Have you checked the permissions on the remote shared folder to confirm that DPM can access it?


    Wednesday, May 22, 2019 8:04 AM
  • Hello Leon,

    Windows Defender is disabled. We are using Bitdefender (which I have also disabled to check whether the backup works that way).

    Unfortunately that doesn't seem to make a difference.

    I have checked the folder permissions. When a backup is started from DPM, a share is automatically created and permissions are granted to the specified server... I guess there shouldn't be an issue with that.

    To top it off, SQL and file backups are working without any problems on the servers that fail Bare Metal/System state.

    I have run out of ideas...

    Wednesday, May 22, 2019 8:21 AM
    Disabling might not always be enough. I have both seen and heard that Bitdefender can be quite rough and has been known to cause problems.

    Are these physical servers or virtual machines that you're trying to perform a BMR/System State backup on?

    Does it happen to all servers or just some? What operating system are they running? Are there any more differences?


    Wednesday, May 22, 2019 8:27 AM
    Seems like one of the servers (that has antivirus disabled) has finished the backup successfully, but the others with the same setup keep failing... The inconsistency is really confusing.

    All the servers are physical.

    Yes it did happen to all servers (up until now).

    OS: Windows Server 2016 Datacenter.

    I will try uninstalling the antivirus on one of the servers to see if the issue persists. I will keep you updated.


    Wednesday, May 22, 2019 8:48 AM
  • Hello Leon,

    After many months I have come to some results.

    Here is what I have done/tried.

    1. BMR/System state with Bitdefender disabled --> Result: backup failed

    2. BMR/System state with shadow copies enabled on the volume where the OS is installed + the other partitions that Windows automatically creates --> Result: backup failed

    So I did both things together, and to my surprise it worked... Finally all my backups worked.

    I need to test whether they will continue working if I don't disable my antivirus, because having the antivirus disabled defeats its purpose, and I will contact Bitdefender if the issue does persist.

    I will keep monitoring and post my results if you are interested. Hope this helps other people that are facing this problem. Would you like me to mark this as answered now, or first post the results and mark it afterwards?

    Thank you for all the help!

    Thursday, May 23, 2019 7:49 AM
  • Hi Bogdan,

    Thanks for being patient with this. I'm glad to hear that you have made some progress on your issue!

    I'd be glad to hear about your results. You can mark it as answered after the results verify that it actually works.


    Thursday, May 23, 2019 8:10 AM