Hyper-V 2012: Replication fails to automatically resume after a server reboot

    Question

  • Hello,

    I have a VM lab with Windows 2012 and Hyper-V. Overall it has been awesome, but I have one small glitch:

    The VMs on server A replicate to server B. The new Hyper-V replication feature works perfectly, except for one problem. When I shut down and then restart the servers, replication does not always automatically resume for all VMs. Some VMs resume replication on their own and some require a manual resume, and it seems to be random. I can always restart replication by right-clicking the VM and choosing Replication --> Resume Replication, but I had hoped that this would happen automatically, every time.

    Has anyone else experienced this? Is there a cure?

    Thursday, March 14, 2013 3:56 PM

Answers

  • Hi,

    Resume Replication is available only if replication has been paused for a virtual machine.

    This action resumes replication for the selected virtual machine (the action must be executed in the same site where replication was paused). The Hyper-V Replica Network Services component re-establishes a connection to the Replica server (if needed) and replication resumes. If the virtual machine was in a Resynch Required state, Resume Replication performs a resynchronization. A resynchronization essentially compares blocks between the Primary and Replica VHDs and then sends the delta blocks to the Replica. Scenarios where this can happen include, but may not be limited to, a failure on the Primary server while changes were being made to the replication log or, if the Primary is a Failover Cluster, an unplanned cluster failover. The Replication Health column in the Hyper-V Manager interface (if selected for display) then reflects a status of Normal.
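
    The same state and health information can also be read from PowerShell instead of the Hyper-V Manager column. A minimal sketch, assuming the Hyper-V PowerShell module is installed; ServerA stands in for server A from the question and is a placeholder host name:

    ```powershell
    # List each replicated VM on the host with its replication state and health.
    # 'ServerA' is a placeholder host name, not a value confirmed in this thread.
    Get-VMReplication -ComputerName 'ServerA' |
        Select-Object VMName, State, Health, Mode |
        Format-Table -AutoSize
    ```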

    > when I shut down and then restart the servers, replication does not always automatically resume for all VMs.

    Did you shut down only primary server A, or both Hyper-V servers?

    Since delta replication runs every 5 minutes by default, each VM may be in a different state at shutdown, which may determine whether its replication is paused when the Hyper-V host shuts down. (This is only my view, though; I have not tested it.)

    For more information, please refer to the following Microsoft articles:

    Understand and Troubleshoot Hyper-V Replica
    http://download.microsoft.com/download/F/F/1/FF1FA6DE-E82A-48EF-BDCC-612C2D588BFE/Understand%20and%20Troubleshoot%20Guide%20Hyper-V%20Replica%20in%20Windows%20Server%208%20Beta%20.docx
    Interpreting Replication Health
    http://blogs.technet.com/b/virtualization/archive/2012/06/15/interpreting-replication-health-part-1.aspx


    Lawrence

    TechNet Community Support

    Friday, March 15, 2013 6:40 AM
    Moderator

All replies

  • I wish I could downvote this answer.
    Friday, July 19, 2013 7:45 AM
  • Did you ever find an answer to this? I've been having the same issue. It occurs when updates are applied to the Replica server or when there is a line outage like the one we just had. I have to go through all the replicating servers looking for a Replication State of "Replication error", then select Replication -> Resume Replication for each one that shows an error. I realize that they will be in different states when this happens, but why can't it be set up to automatically attempt a resume if there is an error? It should at least retry every so often, so that once the Replica server comes back up it starts replicating again without my having to check manually.

    Would there be a PowerShell script that could be run every so often to do this? I'm still not that familiar with PowerShell and Hyper-V.
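
    One possible shape for such a script, as a minimal sketch rather than a tested solution. It assumes the Hyper-V PowerShell module is available, that 'Error' and 'Suspended' are the states that need a manual resume, and that ServerA is a placeholder for the host being checked:

    ```powershell
    # Replication states assumed to need a manual "Resume Replication".
    $stalledStates = 'Error', 'Suspended'

    # 'ServerA' is a placeholder; list every host whose replication should be checked.
    foreach ($server in @('ServerA')) {
        Get-VMReplication -ComputerName $server |
            Where-Object { $stalledStates -contains $_.State } |
            ForEach-Object {
                Write-Output "Resuming replication for $($_.VMName) on $server"
                # Keep going with the remaining VMs if one resume attempt fails.
                Resume-VMReplication -VMReplication $_ -ErrorAction Continue
            }
    }
    ```

    VMs that need a resynchronization may additionally require the -Resynchronize switch of Resume-VMReplication, which this sketch does not cover.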

    Thanks,

    Richard

    Wednesday, August 7, 2013 5:43 PM
  • I'm also having the same issue. Replica status will stay normal for about a day and then changes to critical.

    When I look at the error it's always this - Hyper-V could not replicate changes for virtual machine 'J2TH2012': The device does not recognize the command. (0x80070016). 

    It's preceded by this - Hyper-V suspended replication for virtual machine 'J2THB2012' due to a non-recoverable failure.

    Searching the net turns up very few results.

    When I tested it in a non-clustered environment it worked well. In a failover cluster (FOC) environment it's pretty flaky, which defeats the purpose.

    The other thing I found was that all the how-to guides show the initial copy over the network method which always works. Yet I had zero success in seeding from a copy or via a USB drive on the endpoint.

    Saturday, August 17, 2013 8:26 PM
  • I'm also having the same issue. Server A is configured to replicate some VMs to Server B, and Server B is configured to replicate some other machines to Server A. Both servers restarted at the same time last night because of a Windows Update. After the reboot, replication was stopped, and replication health went critical after a few hours.

    I expect replication to be automatically paused before and automatically resumed after a graceful reboot. So the answer above is no answer for me either. Being forced to add triggered tasks to achieve this is not an option and shows how buggy this behaviour is...

    Could anybody please bring this issue forward to a senior Microsoft engineer?

    Tuesday, September 17, 2013 7:32 AM
  • I have experienced the same problems mentioned above. I feel like replication stops even without a server reboot (host or guest). It just seems to stop for no apparent reason. The VMs are being replicated from a failover cluster to a single host Hyper-V server. 

    Anyone find a fix?

    Thanks,

    Mike

    Friday, October 18, 2013 7:40 PM
  • I'm experiencing this as well. I have to reset statistics and resume replication.

    When it does fail, I usually have to shut down the affected VM, then reset statistics and resume replication. I usually have to reset statistics on both the primary and the replica server.

    Going by the long answer posted above, it seems that we have to pause replication before we do a shutdown or restart. I think this may be by design, because MSFT wants you to buy System Center. There are some nifty features in it, but for a small user base it just might be overkill.
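
    That manual sequence can be sketched in PowerShell as follows. This is only an illustration: the VM name VM01 and the host names ServerA and ServerB are placeholders, and restarting the VM at the end is an assumed final step that is not described above:

    ```powershell
    # Placeholder names, not taken from a real environment.
    $vmName  = 'VM01'
    $primary = 'ServerA'
    $replica = 'ServerB'

    # Shut down the affected VM on the primary host.
    Stop-VM -ComputerName $primary -Name $vmName

    # Reset replication statistics on both the primary and the replica server.
    Reset-VMReplicationStatistics -ComputerName $primary -VMName $vmName
    Reset-VMReplicationStatistics -ComputerName $replica -VMName $vmName

    # Resume replication from the primary side.
    Resume-VMReplication -ComputerName $primary -VMName $vmName

    # Assumed final step: bring the VM back up.
    Start-VM -ComputerName $primary -Name $vmName
    ```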

    Tuesday, April 8, 2014 4:07 PM
  • You're right, you have to pause replication. The tricky thing is: Server A replicates to Server B. Server B wants to restart. So Server B has to pause replication on (!) Server A for each VM replicating from A to B. Normally this could easily be done by a PowerShell script. Server B is able to remotely suspend replication from A to B using "Suspend-VMReplication -ComputerName ServerA".

    BUT: I never got it working as a PowerShell shutdown script. It seems that at the moment the shutdown script on Server B is executed, no access to Server A is possible. Possibly the network is already down, or something similar. Executing the script before a manual shutdown works, of course.
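
    For the manual variant that does work, a minimal sketch could look like this. It assumes it is run on Server B shortly before the restart, that the account has Hyper-V administration rights on Server A, and that the replica server name matches the name used when replication was configured (possibly the FQDN):

    ```powershell
    # Host names as used in this thread; adjust to the real names.
    $primary = 'ServerA'
    $replica = 'ServerB'

    # Find the relationships on the primary that target this replica and are
    # currently replicating, then pause each of them on the primary.
    Get-VMReplication -ComputerName $primary -ReplicaServerName $replica |
        Where-Object { $_.State -eq 'Replicating' } |
        Suspend-VMReplication

    # Restart Server B manually after this point.
    ```

    Because only relationships in the Replicating state are paused, VMs that were already suspended or in error are left untouched.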

    Any ideas how to manage this?


    Tuesday, April 8, 2014 9:49 PM
  • Has anyone found a real solution to this? It can't be that I have to manually resume replication every time a host reboots. Putting the VMs in a paused state is not a solution, since those are also Windows OSes that need patches and reboots too.

    I noticed that the troubleshooting document Lawrence linked to is for "Windows Server 8 Beta". Both my hosts are Server 2012 R2, so surely this issue has been addressed by now, right?

    Any help is greatly appreciated.


    Mattythefatty

    Thursday, December 11, 2014 3:31 PM
  • The only solution I have found is a PowerShell script that works well at resuming replication post reboot.

    http://community.spiceworks.com/scripts/show/2565-resume-vmreplication

    Use scripts at your own risk, etc.
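
    How such a script gets triggered after a reboot is not described above. One option is a scheduled task, sketched here with the built-in schtasks.exe. The script path C:\Scripts\Resume-Replication.ps1, the task names, the 5-minute startup delay, and the 15-minute interval are all assumptions for illustration:

    ```powershell
    # Hypothetical location of the resume script; adjust to wherever it is saved.
    $action = 'powershell.exe -NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Resume-Replication.ps1'

    # Run once at startup, delayed 5 minutes so networking and Hyper-V services are up.
    schtasks.exe /Create /TN 'Resume-Replication-OnStart' /TR $action /SC ONSTART /DELAY 0005:00 /RU SYSTEM /F

    # Run again every 15 minutes to catch replication that stalls later.
    schtasks.exe /Create /TN 'Resume-Replication-Periodic' /TR $action /SC MINUTE /MO 15 /RU SYSTEM /F
    ```

    Both commands need an elevated prompt. A task running as SYSTEM reaches other hosts with the computer account, so a script that resumes replication on a remote server may need different credentials.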


    • Proposed as answer by mooncalf Tuesday, December 16, 2014 10:54 PM
    Tuesday, December 16, 2014 8:06 PM
  • After dealing with this issue for years and searching Google, this is the exact type of answer I did not want to see at the top of the search results with no real resolution.  This is the problem I have with Microsoft.  They seem to think it's OK to say all of us are doing it wrong.

    Shutdown both servers?  That's the most ludicrous answer I have heard for this.  What if we are having issues on a reboot because we are rebooting, not shutting down?  Like all Microsoft operating systems, they need to be rebooted.  As such, we are rebooting the servers to keep them running at peak performance because the OS can't clean up after itself and keep itself running well without a reboot.  It's during that Microsoft-mandated reboot that this issue occurs.  

    We used to reboot our Hyper-V servers monthly, but were running into some weird issues and didn't want to take the time to diagnose each one when a reboot would fix the issue.  Now they reboot weekly, and every single Monday, without fail, I come in to find at least 2 VMs that have paused replication.

    After this long, I would have expected this bug to be fixed.  Yes, this is a bug.  If software is created that can't handle a LAN outage gracefully, that is a bug.  It's even worse if that software is what we use to run our companies.  I'm so glad the company I work for doesn't have much going on during the weekends.  I pity the companies that use Hyper-V for servers that are busy all the time.  I guess that's why they run VMware.

    Monday, August 29, 2016 12:56 PM
  • Well said.

    One more thing we have to check every week after reboots!!

    I will add that I find it fascinating that Essentials reporting didn't tell me replication hadn't taken place in 6 days!!

    I only noticed because of low disk space due to the replication "temp" file.

    Lawrence

    Wednesday, October 19, 2016 1:56 PM