Ex 2007 SCC with SCR standalone target - issues failing back

  • Question

  • I've read the following post and added my own question to the end but realised it then appears as answered so I will post my own question here separately. http://social.technet.microsoft.com/Forums/en-US/exchangesvravailabilityandisasterrecovery/thread/a4443577-7a43-46c0-a3c3-2829218e78fe

    I have just come across the same issue which I was completely unaware of when designing this DR solution.

    Scenario:

    Exchange 2007 SP2 UR4 SCC (CMSname=EXMB01, Windows failover cluster name = EXCLUS01) with standalone SCR target (not standby cluster) (SCR01)

    I have failed over 1 of the databases as a test which worked perfectly.

    I've been following Scott's blog http://blogs.technet.com/scottschnoll/archive/2007/11/26/after-scr-activation-what-next.aspx to switch back again but when I attempt to Enable-StorageGroupCopy back to the SCC for this single database I get the following misleading error:

    Enable-StorageGroupCopy SCR01\SG1Port -StandbyMachine EXMB01
    Enable-StorageGroupCopy : Standby continuous replication cannot be configured on node 'EXMB01.<domainName>' because it is not a member of cluster 'EXCLUS01'.

    In a standalone server to standalone server SCR scenario it is possible to failover and failback single databases. Is this not possible when using an SCC source? Is there a technical limitation for this?

    Please help - I'm nearly at the end of my deployment and cannot believe I've run into such a major limitation.

    Tuesday, May 18, 2010 5:06 PM

Answers

  • Issue description: After the SCR target has been activated successfully, SCR can’t be enabled back from the DR site to the primary site. The attempt fails with the following error:

    Enable-StorageGroupCopy : Standby continuous replication cannot be configured on node 'The Name Of Exchange CMS' because it is not a member of cluster 'The Name Of Windows Cluster'.

    Conclusion: This action can’t be performed.

    Explanation: The endpoint for SCR is called the target, and the target can be either of the following:

    · Stand-alone Mailbox server that does not have LCR enabled for any storage group

    · Passive node in a failover cluster where the Mailbox role is installed, but no clustered mailbox server has been installed in the cluster

    In this case, the clustered mailbox server EXMB01 is installed on the SCC cluster, so none of the cluster’s nodes can be an SCR target.

    Workaround: Using Move-Mailbox to move the mailboxes back to the primary site/server is the easiest way to “roll back”. Outlook users need to restart their clients after the move.
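
    As a rough illustration of this workaround, the move back could be scripted along the following lines in the Exchange Management Shell. This is a sketch only: the thread names the servers SCR01 and EXMB01 and the storage group SG1Port, but the database names below are hypothetical placeholders, and the target is assumed to be an empty, mounted database on the CMS.

    ```powershell
    # Hypothetical identities: only SCR01, EXMB01 and SG1Port appear in this thread.
    # The database names are placeholders; substitute the real ones.
    $activatedDb = 'SCR01\SG1Port\DB1Port'       # database currently mounted on the SCR target
    $primaryDb   = 'EXMB01\SG1Port\DB1PortNew'   # assumed: empty, mounted database on the CMS

    # Move every mailbox homed on the activated database back to the primary CMS.
    # Each mailbox is unavailable only while its own move is in progress.
    Get-Mailbox -Database $activatedDb -ResultSize Unlimited |
        Move-Mailbox -TargetDatabase $primaryDb -Confirm:$false
    ```

    Moving in smaller batches (for example by filtering the Get-Mailbox output) may be preferable on a busy server, since the overall operation can take a long time.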

    Notes: For database-level HA, CCR would be a good option.


    James Luo
    TechNet Subscriber Support (http://technet.microsoft.com/en-us/subscriptions/ms788697.aspx)
    If you have any feedback on our support, please contact tngfb@microsoft.com
    • Proposed as answer by Alan.Gim Monday, May 24, 2010 1:28 AM
    • Marked as answer by -Mike Tuesday, May 25, 2010 2:10 PM
    Thursday, May 20, 2010 3:35 AM

All replies

  • What's EXCLUS01? Is it the CMS or the cluster? Was the CMS cleared off of the cluster?
    Tuesday, May 18, 2010 7:11 PM
  • Agree with Jader3rd. Have the CMS configurations been removed from the nodes?

    Please refer to the posts from ibenna in this thread

    Resources:

    “Setup Updates for SCR” section in Standby Continuous Replication


    James Luo
    TechNet Subscriber Support (http://technet.microsoft.com/en-us/subscriptions/ms788697.aspx)
    If you have any feedback on our support, please contact tngfb@microsoft.com
    Wednesday, May 19, 2010 3:29 AM
  • Thanks guys.

    EXCLUS01 is the windows failover cluster name (the name of the actual windows cluster not the CMS) which I'm finding very odd.

    EXMB01 is the CMS name.

    No I haven't cleared CMS because I only failed over a single SG and now am trying to fail it back.

    I did read that ibenna thread too, thanks, and it also makes me think that the only way to SCR the SG back is to clearcms! But that means that I have to first fail over all SGs to the SCR target, then clearcms, then SCR them back and then fail back the whole server again. I wanted the flexibility to be able to fail over and fail back individual SGs; that's why I went with a standalone SCR target and not a standby cluster. Is this not possible when the original source is clustered?

    Wednesday, May 19, 2010 7:54 AM
  • SCR is designed for failing over an entire server.
    Wednesday, May 19, 2010 3:48 PM
  • Wow. Very broad and unfounded statement Jader3rd.

    Check TechNet:

    http://technet.microsoft.com/en-us/library/bb676502(EXCHG.80).aspx

    "If one or some of the databases on the SCR source are affected, you can use the database portability feature in Exchange 2007 as part of the activation process for the SCR target databases."

    Perhaps I should have clearly stated I'm using database portability. I did mention it was a standalone target and not a standby cluster, so database portability should be implied by default. I chose the database portability option specifically because it gives the flexibility of being able to fail over individual databases.

    Anyway, after lots more reading I think I understand now that there is a limitation on SCR which means you cannot have an SCR target which is a cluster (unless it is a standby cluster) and therefore, when using a cluster as a source for SCR, it is not possible to use SCR to fail back to that cluster unless you destroy it and start again.

    As Scott says in his blog (linked in original post), the ways to get back are:

    1) Dismount the target database, manually copy it back to the source, and then use database portability to rehome users on it. This obviously means lengthy downtime.

    2) Use Move-Mailbox to move users back. Shorter downtime but lengthier overall operation.

    3) Option 3 (SCR back again) will not work if the original source is a cluster unless you blow it all away.
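
    For anyone weighing up option 1, here is a hedged outline of what the sequence might look like in the Exchange Management Shell. Only SCR01, EXMB01 and SG1Port come from this thread; the database names are hypothetical, the original database on EXMB01 is assumed to be dismounted, and the file copy itself happens outside Exchange. Scott's post remains the authority on the exact steps.

    ```powershell
    # Hypothetical identities; substitute the real database names.
    $activatedDb = 'SCR01\SG1Port\DB1Port'    # database activated on the SCR target
    $primaryDb   = 'EXMB01\SG1Port\DB1Port'   # original database on the CMS (assumed dismounted)

    # 1. Take the activated database offline so its file is in a clean state.
    #    User downtime starts here.
    Dismount-Database -Identity $activatedDb -Confirm:$false

    # 2. Copy the database file to the path that $primaryDb expects.
    #    This is done outside Exchange and is where most of the downtime goes.

    # 3. Allow the copied file to replace the existing one, then mount it on the CMS.
    Set-MailboxDatabase -Identity $primaryDb -AllowFileRestore:$true
    Mount-Database -Identity $primaryDb

    # 4. Database portability: repoint the users in Active Directory
    #    without moving any mailbox data.
    Get-Mailbox -Database $activatedDb -ResultSize Unlimited |
        Move-Mailbox -ConfigurationOnly -TargetDatabase $primaryDb -Confirm:$false
    ```

    The downtime here is dominated by step 2, which is why this route is quick to script but slow for users compared with option 2.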

     

    I just think it should be a lot clearer in the documentation. E.g. in that Technet link above:

    "In this example, a single storage group is being replicated to multiple SCR targets on two different computers. Recovery of the storage group on either SCR target can be achieved by using either of two methods:

    • /RecoverCMS can be used when recovering storage groups from a single CCR source only.
    • Database portability can be used when recovering storage groups from multiple CCR sources. "

    Yes you can use database portability but there is no mention of the extended downtime involved in getting back again once you've recovered from your disaster.

    I would also really like to understand if this is a real technical limitation or just an untested scenario. I cannot see why there should be the restriction.

    • Edited by -Mike Tuesday, May 25, 2010 2:19 PM Put quotes around the bits I had extracted from TechNet to make it clearer. No other changes.
    Wednesday, May 19, 2010 6:07 PM
  • Those are the three options for getting back to the original state. Yes, they are all awkward and not ideal. The reason is that SCR was not designed for this scenario. SCR was designed for recovering an entire CCR CMS should the original site (or CMS nodes) not be able to provide service. Other scenarios that SCR can assist with are benefits, but the entire motivation for SCR was to make recovering a CCR CMS a quick process.
    Thursday, May 20, 2010 2:40 AM
  • Won't fight with you Jader3rd.

    Did you write/design the SCR code?

    Can you point to an article to back up your claims of what the designers had in mind when they designed it?

     

    Tuesday, May 25, 2010 2:09 PM
  • Thanks James-Luo.

    I just don't agree with your final note which strays off topic. CCR is server level HA like SCC, not database level HA.

    I am using SCC for HA within the primary datacentre (CCR would have been fine too and was in fact the original design but the customer could not justify the cost in SAN disks for this particular solution) and SCR for DR (because it is not a HA solution).

    Tuesday, May 25, 2010 2:16 PM