Storage Replica - failback steps after recovering primary site

  • Question

  • Hi,

    I just deployed a 4-node cluster for testing Storage Replica. Two nodes are in the primary site and two in the secondary site. One set of Data/Log disks is available in the primary site, the other in the secondary site. After setting up Storage Replica Primary -> Secondary, I simulated a primary site failure by shutting down both primary nodes. Now I'm trying to figure out the exact steps to move back to the primary site after I recover it.

    What should be the steps?

    For example:

    • Change source and destination (reverse the replication direction)
    • Bring the primary site online
    • Wait for the replica to sync the changes made in the secondary site
    • Fail the disks over to the primary site
    • Change source and destination once again

    Can this be done in an easier way, and am I missing something?
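
    In PowerShell terms, I imagine the flow would look roughly like this (just a sketch; the node and disk names are from my lab, and how to do the direction swap is exactly what I'm unsure about):

     Start-ClusterNode -Name SR-NODE4                                 # bring the primary site nodes back
     Start-ClusterNode -Name SR-NODE5
     Get-ClusterGroup                                                 # watch the groups; wait until the replica has synced the delta
     Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node SR-NODE4   # fail the data disk back to the primary site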

    Thanks.

    Friday, December 12, 2014 9:05 AM

Answers

  • In a private conversation, Ned pointed out that failback will happen automatically; it is just not yet in the publicly available build.

    To quote Ned:
    "Currently, you have to move-clustergroup to place the SR group (i.e. the destination disk) back onto the destination, instead of the failed-to node owning both source and destination; we have fixed this in later internal builds, where it will happen automatically. When you do that, the destination will sync any delta from the source log, or perform a full sync if the log has wrapped on the source.
    If you want to move the source back to the original node, you would do the same thing (using move-clustersharedvolume). That would leave the destination in place."
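
    In other words, something roughly like this ("Cluster Disk 1" and the node names are from my test; the SR group name below is a hypothetical placeholder, check get-clustergroup for the real one):

     Move-ClusterGroup -Name "SR Group 1" -Node SR-NODE6              # hypothetical group name: put the destination disk back on a secondary-site node
     Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node SR-NODE4   # move the source CSV back to the original primary node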

    Best regards.


     


    • Edited by LjuboB Friday, January 30, 2015 7:31 AM
    • Marked as answer by LjuboB Friday, January 30, 2015 7:32 AM
    Friday, January 2, 2015 10:51 AM

All replies

  • Hi Ljubo,

    You just need to move the cluster group. Some roles support this through the UI in the Technical Preview cluadmin (like the File Server role and CSV storage), and some require Windows PowerShell (like a single VM using a non-CSV disk); PowerShell works in all scenarios. For the latter you can just run:

     get-clustergroup

    figure out what you want to move

     move-clustergroup <yourthinghere>
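
    For example (the group name below is only illustrative; use whatever get-clustergroup shows in your cluster):

     Get-ClusterGroup | Format-Table Name, OwnerNode, State   # see which node owns which group
     Move-ClusterGroup -Name "SR Group 1" -Node SR-NODE6      # hypothetical group name: move it to the node that should own it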


    Ned Pyle [MSFT] | Sr. Program Manager for Storage Replica, DFS Replication, Scale-out File Server, SMB 3, other stuff

    • Proposed as answer by IJSingh Friday, January 2, 2015 11:04 AM
    Tuesday, December 16, 2014 9:38 PM
  • Hi Ned,

    Thanks for answering, but I think I should explain a bit better what I did and what I experienced. I'll try to be brief and precise.

    I have 2 nodes (SR-NODE4 and SR-NODE5) in the primary site and 2 nodes (SR-NODE6 and SR-NODE7) in the secondary site. I created a File Server on a CSV. When all nodes are up, FCM shows the following: [screenshot 1 not preserved]

    After I shut down the nodes in the primary site, FCM shows the following: [screenshot 2 not preserved]

    At this moment everything is OK; I can access all my data on the CSV as expected. The File Server role is owned by either SR-NODE6 or SR-NODE7. I changed the content of one of the files on the share, brought up the primary site, and moved the role to SR-NODE4. The disks remained as shown in screenshot 2. Then, from FCM, I manually moved Cluster Disk 1 back to SR-NODE4. All disks became online and owned by their respective site nodes.

    NOTE: When I then reopened the changed document from the share, the changes were lost. The file content was as it had been when I brought down the primary site.

    I was expecting this, since we are discussing a replica, not a two-way sync mechanism. This is why I asked how to fail back. If I have got it completely wrong, please correct me.
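
    For the record, the same test expressed with cluster cmdlets would be roughly this (the File Server role name is a placeholder for whatever you named yours):

     Stop-ClusterNode -Name SR-NODE4                                  # simulate the primary site failure
     Stop-ClusterNode -Name SR-NODE5
     # ...change a file on the share while the secondary site is serving it...
     Start-ClusterNode -Name SR-NODE4                                 # recover the primary site
     Start-ClusterNode -Name SR-NODE5
     Move-ClusterGroup -Name "MyFileServer" -Node SR-NODE4            # hypothetical role name: move the role back
     Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node SR-NODE4   # the manual disk move I did from FCM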

    Thanks.


    • Edited by LjuboB Wednesday, December 17, 2014 6:57 PM Content update after additional testing
    Tuesday, December 16, 2014 11:15 PM
  • Hmmm. Do you have a place you can upload logs to and share with us? If so, please email me at nedpyle @ microsoft.com and we can collect some data.

    Ned Pyle [MSFT] | Sr. Program Manager for Storage Replica, DFS Replication, Scale-out File Server, SMB 3, other stuff

    Friday, December 19, 2014 5:19 PM
  • Hi Ljubo Brodaric,

    How is this issue going now? Could you share your solution with us?

    Best Regards.


    Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Support, contact tnmff@microsoft.com

    Wednesday, December 24, 2014 9:24 AM