none
Hyper-V 2012: Replication fails to automatically resume after a server reboot/line outage

    Question

  • I see where someone asked a similar question on this back in March but I don't really see where the question was really answered.

    I've been having the same issue. Occurs if we have updates on the Replica Server or if there is a line outage like we just had. I have to go through all the replicating servers looking for a Replication State of Replication error. Then select Replication -> Replication Resume for each one that is showing error. I realize that they will be in different states when this happens but why can't it be setup to automatically try to do a resume if there is an error. At least try every so often to resume so that once the Replica Server comes back up/available it would start up again without my having to manually do the check.

    Would there be a power shell script that could be run every so often to do this? That way if there is an outage after hours it would attempt to resume the ones with error.

    Thanks,

    Richard

    Wednesday, August 07, 2013 6:14 PM

Answers

  • The product does have inbuilt semantics to retry for network failures. We have an exponential back-off algorithm which starts retrying at a 1min interval and eventually plateaus at 30mins. If the product did not retry (in the worst case) 30mins after the network link was restored, then it sounds like we have a problem. The event viewer message under the Hyper-V -> Admin node will have a trace for the retry semantics which is being attempted.

     Praveen

    • Marked as answer by Richard13 Saturday, August 24, 2013 7:36 PM
    Thursday, August 08, 2013 5:43 AM
  • Sorry I've been out of the office. I did finally find the the logs under Hyper-V-VMMS -> Admin where it shows it was doing a retry on one of the VMs. So maybe I'm just not giving it enough time to sync back up. We will probably be rebooting the replica server later this week or first of next week if there are updates so I'll watch this log and see what happens and update then.

    Thanks,

    Richard

    Monday, August 12, 2013 5:40 PM

All replies

  • The product does have inbuilt semantics to retry for network failures. We have an exponential back-off algorithm which starts retrying at a 1min interval and eventually plateaus at 30mins. If the product did not retry (in the worst case) 30mins after the network link was restored, then it sounds like we have a problem. The event viewer message under the Hyper-V -> Admin node will have a trace for the retry semantics which is being attempted.

     Praveen

    • Marked as answer by Richard13 Saturday, August 24, 2013 7:36 PM
    Thursday, August 08, 2013 5:43 AM
  • Any update to the current situation?

    Best Regards
    Jeremy Wu

    Sunday, August 11, 2013 2:27 PM
    Moderator
  • Sorry I've been out of the office. I did finally find the the logs under Hyper-V-VMMS -> Admin where it shows it was doing a retry on one of the VMs. So maybe I'm just not giving it enough time to sync back up. We will probably be rebooting the replica server later this week or first of next week if there are updates so I'll watch this log and see what happens and update then.

    Thanks,

    Richard

    Monday, August 12, 2013 5:40 PM
  • I guess I just wasn't waiting long enough. About 20 minutes after everything came back up they started replicating again.

    Thanks,

    Richard

    Saturday, August 24, 2013 7:36 PM