Hyper-V Replication Issue 2012 R2 - Happens with large backlog of data


  • We have a 2012 R2 Hyper-V Failover Cluster with replicas going to another 2012 R2 Failover Cluster. Everything works as expected for the most part: we have over 100 VMs replicating, and initial replication starts and runs to completion without issues. But we had one, and now two, VMs showing some odd behavior. Their replication just stops working with a generic "One or more arguments are invalid (0x80070057)" error. It is always three events in succession: Hyper-V-VMMS errors 33680, 32086 and 32022.

    I've narrowed down the specific circumstances of the issue, just not why it happens. It occurs whenever a VM builds up a significant backlog of data because a lot of changes happen all at once. For one of the VMs (the first one, which has been doing this for quite a while and we've just lived with it), it always happened at the start of the week, right when the server started its deduplication job (on the VM, not the host), which caused a lot of changes in a short time. On the other server it just started happening every day, after a backup job writes about 600 GB of data to it. Usually when I resume the replication it kicks off again and works fine, though on occasion I've had to resume it twice. When this happens I look at the HRL files on the host: they are usually 100-200 GB in size, and sometimes there are multiple files just sitting there waiting (split into the 15-minute replication intervals). It doesn't look like a performance issue, since the system is able to finish the process once resumed.
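For anyone who wants to run the same HRL check on their own hosts, here is a minimal Python sketch. It walks a folder and lists the .hrl files it finds, largest first. The function names and the folder argument are placeholders, not anything Hyper-V provides; point it at wherever your replica logs actually live.

```python
from pathlib import Path

GIB = 1024 ** 3  # bytes per GiB


def list_hrl_files(root):
    """Return (path, size_in_GiB) for every .hrl file under root, largest first."""
    found = []
    for path in Path(root).rglob("*"):
        # Compare the suffix case-insensitively so .HRL and .hrl both match.
        if path.is_file() and path.suffix.lower() == ".hrl":
            found.append((path, path.stat().st_size / GIB))
    return sorted(found, key=lambda item: item[1], reverse=True)


def print_report(root):
    """Print one line per log file: size first, then the full path."""
    for path, size_gib in list_hrl_files(root):
        print(f"{size_gib:10.2f} GiB  {path}")
```

Calling `print_report(folder)` with the folder that holds the VM's virtual disks and logs prints one line per pending log, which makes it easy to see whether several intervals have piled up.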

    I'm going to try setting the replication interval to 5 minutes and see if that changes anything. Maybe it will make the HRL files smaller so replication won't get hung up. I wanted to post this in case others have run into the issue and found a fix.
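As a rough illustration of the reasoning behind the shorter interval, here is a back-of-the-envelope sketch in Python. It assumes the writes arrive at a steady rate and that each interval's log is shipped before the next one closes, which is exactly what is not happening when the backlog builds up. The 4-hour backup window is a made-up figure for the example, not a measurement; only the 600 GB comes from the description above.

```python
def log_gb_per_interval(total_gb, write_hours, interval_minutes):
    """Average change-log size per replication interval, assuming steady writes."""
    intervals = (write_hours * 60) / interval_minutes
    return total_gb / intervals


BACKUP_GB = 600            # amount written by the backup job (from the post)
ASSUMED_WINDOW_HOURS = 4   # hypothetical backup duration, not measured

for minutes in (15, 5):
    size = log_gb_per_interval(BACKUP_GB, ASSUMED_WINDOW_HOURS, minutes)
    print(f"{minutes:>2}-minute interval: about {size:.1f} GB per log")
```

Under those assumptions a shorter interval only shrinks each individual log; the total amount of data that has to cross the wire stays the same.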

    Thanks for your time!

    May 16, 2018 18:49