Exchange 2010 DAG unable to Ping and Backup fails

  • Question


  • Hi,

    I have some problems pinging my DAG. I have 3 servers in 2 different physical locations and 2 separate Subnets.

    It seems that DAG is functioning correctly, but we are going to be using DAG-Aware backup software, so I believe I need to get these nodes going in the other 2 sites.

    When I go into Failover Cluster Manager - and I view Cluster Core Resources - I see the following:

    Name: DAG - Online (Dag Name Resolution Not Yet Available)
    IP Address:  [Site A.10.65.4.108] - Offline
    IP Address:  [Site B.10.65.10.108] - Online

    Site A is Primary
    Site B is DR Site

    There are 3 Exchange servers; let's just call them SITEA-MB01, SITEA-MB02 and SITEB-DRMB03. All 3 servers have databases and users in said databases. (And I've had the DAG working between the sites with no problem.)

    Additionally:

    If I try to bring the address 10.65.4.108 online, I get the error message "The operation has failed": an error occurred while attempting to bring the resource "IPv4 DHCP Address 1 (Cluster Group)" online.
    DETAILS: Error 0x80071397 - The operation failed because either the specified cluster node is not the owner of the resource, or the node is not a possible owner of the resource.

    The event log shows no errors other than the one above (Event 1069).
    I have tried rebooting the servers several times.


    Regards, Kiran B.
    Thursday, September 8, 2011 4:16 AM

Answers

  • Hi,

     

    It's solved. I did a repair of the cluster object in the management console, as below.


    Regards, Kiran B.
    Friday, September 16, 2011 6:19 AM

All replies

  • Hi,
    Only one of the two IP addresses in the cluster can be online at any time.
    Right now the DAG (Cluster Group) is active on your server in Site B.
    If you want it to be online on server MB01, just use EMS or CMD and type cluster group "cluster group" /moveto:MB01

    Regarding ping...
    It could be the Windows Firewall that is blocking it.


    Martina Miskovic
    Thursday, September 8, 2011 4:39 AM
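
The "only one IP address online" behaviour described above can be illustrated with a small Python sketch. It is not how the cluster service is implemented; it only shows that the cluster IP which can come online is the one on the current owner node's subnet. The two cluster addresses come from the question, while the /24 prefix length and the node-to-subnet map are assumptions made for illustration.

```python
import ipaddress

# Cluster IP address resources from the question (one per site).
CLUSTER_IPS = ["10.65.4.108", "10.65.10.108"]

# Hypothetical node-to-subnet map; the /24 prefix length is an assumption.
NODE_SUBNETS = {
    "SITEA-MB01": "10.65.4.0/24",
    "SITEA-MB02": "10.65.4.0/24",
    "SITEB-DRMB03": "10.65.10.0/24",
}


def online_cluster_ip(owner_node):
    """Return the single cluster IP that lies on the owner node's subnet."""
    subnet = ipaddress.ip_network(NODE_SUBNETS[owner_node])
    matches = [ip for ip in CLUSTER_IPS if ipaddress.ip_address(ip) in subnet]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one cluster IP in {subnet}, found {matches}")
    return matches[0]


if __name__ == "__main__":
    for node in NODE_SUBNETS:
        print(node, "->", online_cluster_ip(node))
```

With the Cluster Group owned by the Site B node, the sketch returns 10.65.10.108, which matches what Failover Cluster Manager shows in the question.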
  • Hi,

    Firewall is disabled on all nodes.

    In Failover Cluster Manager it's shown as:

    Name: DAG - Online (Dag Name Resolution Not Yet Available)


    Regards, Kiran B.
    Thursday, September 8, 2011 4:57 AM
  • Not sure what you mean by "Dag Name Resolution Not Yet Available", but make sure that the DAG name isn't created as a static record in DNS.

    Martina Miskovic
    Thursday, September 8, 2011 5:00 AM
  • It's shown in the Failover Cluster Manager console against the DAG name.
    Regards, Kiran B.
    Thursday, September 8, 2011 5:05 AM
  • Hi Kiran,
    Are you still having problems with backup?

    • Did you check the DNS record for your DAG name?
    • Are the network bindings correct on the servers (MAPI Network on top of the list)?
    • Did you move the cluster group to SiteA ?

     


    Martina Miskovic
    Saturday, September 10, 2011 8:37 AM
  • In addition to Martina's questions, ensure you are allowing Dynamic DNS registration on all the DAG members NICs.
    Saturday, September 10, 2011 12:45 PM
  • Hi,

     

    In the cluster it's still shown as name resolution not available.

    The DNS name does exist.

    Bindings are correct.

    How do you allow dynamic DNS registration on the DAG NICs?


    Regards, Kiran B.
    Tuesday, September 13, 2011 8:08 AM
  • Hi,

    It's the server hosting the "Cluster Group" (DAG) that registers the DAG name in DNS.
    I would check the record in DNS and make sure that it isn't a static record.


    Martina Miskovic
    Tuesday, September 13, 2011 8:10 AM
  • System error 5013 has occurred (0x00001395).
    The cluster group could not be found.

    This is the error I get when moving the DAG.

    The DNS record's timestamp shows as static.


    Regards, Kiran B.
    Tuesday, September 13, 2011 8:26 AM
  • Which of the servers has the Cluster Group, server in SiteA or SiteB?
    Run Cluster Group "cluster group" and find out.

    If it is in failed state, move it to another cluster node
    cluster group "cluster group" /moveto:MB01


    Martina Miskovic
    Tuesday, September 13, 2011 8:31 AM
  • Group                Node            Status
    -------------------- --------------- ------

    System error 5013 has occurred (0x00001395).
    The cluster group could not be found.

    the error even while running Cluster Group "cluster group" on the failover cluster mgmt. cluster resource for siteb is online and sitea is offline and DAG status is online


    Regards, Kiran B.
    Tuesday, September 13, 2011 8:44 AM
  • Hi,
    Did you run the command from an elevated prompt (Run as Administrator)?


    Martina Miskovic
    Tuesday, September 13, 2011 8:48 AM
  • Yes....
    Regards, Kiran B.
    Tuesday, September 13, 2011 8:59 AM
  • I'm not sure what you really mean by:
    >Cluster Group "cluster group" on the failover cluster mgmt. cluster resource for siteb is online and sitea is offline and DAG status is online

    The Cluster Group "Cluster Group" can only be online on one server at a time.
    Have you run: Test-ReplicationHealth

    If there is something wrong with the cluster service, you should see it in Event Viewer. Any error or warnings there?
    Martina Miskovic
    Tuesday, September 13, 2011 9:06 AM
  • All are healthy.

    I guess my problem is that the DAG name is not resolving in DNS.

     


    Regards, Kiran B.
    Tuesday, September 13, 2011 9:13 AM
  • Can you run ipconfig /all and post the output on the node that has the Cluster Group Online?
    Martina Miskovic
    Tuesday, September 13, 2011 9:15 AM
  • Windows IP Configuration

       Host Name . . . . . . . . . . . . : SRVDRMB03
       Primary Dns Suffix  . . . . . . . : CEAT.LOCAL
       Node Type . . . . . . . . . . . . : Hybrid
       IP Routing Enabled. . . . . . . . : No
       WINS Proxy Enabled. . . . . . . . : No
       DNS Suffix Search List. . . . . . : CEAT.LOCAL

    Ethernet adapter Local Area Connection* 9:

       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Microsoft Failover Cluster Virtual Adapter
       Physical Address. . . . . . . . . : 02-15-5D-0A-6E-0B
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       Link-local IPv6 Address . . . . . : fe80::acea:f377:7014:2b9a%23(Preferred)
       IPv4 Address. . . . . . . . . . . : 169.254.3.149(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.0.0
       Default Gateway . . . . . . . . . :
       DHCPv6 IAID . . . . . . . . . . . : 520230237
       DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-14-AC-F0-D0-00-15-5D-0A-6E-03

       DNS Servers . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                           fec0:0:0:ffff::2%1
                                           fec0:0:0:ffff::3%1
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter DR DAG Repli:

       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Microsoft Virtual Machine Bus Network Adapter #9
       Physical Address. . . . . . . . . : 00-15-5D-0A-6E-0E
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 192.168.2.2(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . :
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter DR DAG Public:

       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Microsoft Virtual Machine Bus Network Adapter #7
       Physical Address. . . . . . . . . : 00-15-5D-0A-6E-0B
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 10.64.10.106(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       IPv4 Address. . . . . . . . . . . : 10.64.10.108(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . : 10.64.10.1
       DNS Servers . . . . . . . . . . . : 10.64.10.101
                                           10.65.4.101
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Tunnel adapter isatap.{16FE241A-64D8-409C-849D-3FAA5D7A094D}:

       Media State . . . . . . . . . . . : Media disconnected
       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Microsoft ISATAP Adapter
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes

    Tunnel adapter Teredo Tunneling Pseudo-Interface:

       Media State . . . . . . . . . . . : Media disconnected
       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes

    Tunnel adapter isatap.{E453CD77-AE63-4DA9-AF57-1A33C810F97F}:

       Media State . . . . . . . . . . . : Media disconnected
       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Microsoft ISATAP Adapter #2
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes

    Tunnel adapter isatap.{57F1F3E6-F572-48B9-AD11-805BFE48160F}:

       Media State . . . . . . . . . . . : Media disconnected
       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Microsoft ISATAP Adapter #3
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes


    Regards, Kiran B.
    Tuesday, September 13, 2011 9:19 AM
  • Hi,
    I can't see anything wrong.

    Just to be sure, is the NIC "DR DAG Public" at the top of the network bindings?
    Is "Register this connection's addresses in DNS" checked for TCP/IP?
    Are you allowing dynamic updates in DNS?
    Is the DAG name registered in DNS with 10.65.4.108 or 10.65.10.108?


    Martina Miskovic
    Tuesday, September 13, 2011 9:49 AM
  • Public NIC is on top.

    Is "Register this connection's addresses in DNS" checked for TCP/IP? Yes.

    Are you allowing dynamic updates in DNS? How do I check this?

    Is the DAG name registered in DNS with 10.65.4.108 or 10.65.10.108? 10.65.4.108 is registered in DNS.


    Regards, Kiran B.
    Tuesday, September 13, 2011 9:53 AM
  • Since the DAG is online in SiteB, the wrong IP address is now registered in DNS.
    It should be 10.65.10.108.

    I would remove the entry in DNS and then try to bring the cluster resource online.

    Setting for dynamic updates: you will find it on the General tab of the properties for your DNS zone CEAT.LOCAL.





    Martina Miskovic
    Tuesday, September 13, 2011 10:00 AM
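
The check described above (does DNS hold the address of the cluster IP that is actually online?) can be sketched in Python. This is a minimal illustration, not an Exchange or cluster tool: `dns_status` and `resolve_ipv4` are hypothetical helper names, and the classification into "missing", "stale" and "ok" is an assumption about what is useful to report. The addresses are the ones quoted in the thread.

```python
import socket


def resolve_ipv4(name):
    """Return the IPv4 addresses DNS currently returns for name ([] if none)."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        # Covers lookup failures such as a missing record.
        return []


def dns_status(resolved, expected_ip):
    """Compare a lookup result with the cluster IP that is currently online."""
    if not resolved:
        return "missing"  # no A record at all
    if all(ip == expected_ip for ip in resolved):
        return "ok"       # DNS matches the online cluster IP
    return "stale"        # DNS still holds another address


if __name__ == "__main__":
    # Situation in the thread: Site B (10.65.10.108) is online,
    # but DNS holds the Site A address.
    print(dns_status(["10.65.4.108"], "10.65.10.108"))
    # After the record has been deleted and not yet re-registered.
    print(dns_status([], "10.65.10.108"))
```

For a live check, the result of `resolve_ipv4` called with the DAG's fully qualified name can be passed as the first argument of `dns_status`.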
  • I get the error below when bringing the IP online at Site A:

     

    [Window Title]
    Error

    [Main Instruction]
    The operation has failed.

    [Content]
    An error occurred while attempting to bring the resource 'Cluster IP Address' online.

    [^] Hide details  [OK]

    [Expanded Information]
    Error Code: 0x80071397
    The operation failed because either the specified cluster node is not the owner of the resource, or the node is not a possible owner of the resource


    Regards, Kiran B.
    Tuesday, September 13, 2011 10:15 AM
  • So, is the Cluster Group in SiteA now?
    Only one of the IP addresses can be online at any given time.
    Martina Miskovic
    Tuesday, September 13, 2011 10:19 AM
  • No, it's still in Site B. It's the error I get when I try to bring the Site A cluster IP online.
    Regards, Kiran B.
    Tuesday, September 13, 2011 10:35 AM
  • The IP Address 10.65.4.108 can only be in an online state WHEN the Cluster Group is online on one of the servers in SiteA.

    Martina Miskovic
    Tuesday, September 13, 2011 10:45 AM
  • Hi,

    I tried moving the group from Site A to Site B. It kept trying for a few minutes and kept failing, and in the end it was online on Site B only.


    Regards, Kiran B.
    Tuesday, September 13, 2011 10:48 AM
  • Delete the record for the DAG in your DNS and wait for replication to take place.
    Then try and move the group again.


    Martina Miskovic
    Tuesday, September 13, 2011 10:51 AM
  • 10.65.4.108 is online, but the DAG name is in a failed state. No DNS record is visible for the DAG name or for 10.65.4.108.
    Tuesday, September 13, 2011 10:57 AM
  • Hmm, that sounds very strange to me.

    10.65.4.108 belongs to SiteA, and the node you have in SiteB should not be able to bring that one online.
    Again, I would remove the record from DNS in both the reverse and forward lookup zones. Then run ipconfig /registerdns


    Martina Miskovic
    Tuesday, September 13, 2011 11:04 AM
  • Hmm, that sounds very strange to me.

    10.65.4.108 belongs to SiteA, and the node you have in SiteB should not be able to bring that one online.
    Again, I would remove the record from DNS in both the reverse and forward lookup zones. Then run ipconfig /registerdns


    Martina Miskovic
    Please update

    --------Abhi----------------- Exchange Specialist------------- ------------------ Please remember to click “Mark as Answer” on the post that helps you. This can be beneficial to other community members reading the thread.
    Tuesday, September 13, 2011 11:10 AM
  • @Abhi,
    It was only 12 minutes ago I posted that.
    There's no need to ask for an update!!


    Martina Miskovic
    Tuesday, September 13, 2011 11:17 AM
  • @Abhi,
    It was only 12 minutes ago I posted that.
    There's no need to ask for an update!!


    Martina Miskovic

    Is it a board rule that we can't ask for an update?

    And Martina, please don't ever challenge my posts. I have seen you doing this so many times.


    --------Abhi----------------- Exchange Specialist------------- ------------------ Please remember to click “Mark as Answer” on the post that helps you. This can be beneficial to other community members reading the thread.
    Tuesday, September 13, 2011 11:23 AM
  • Guys, please, let's focus on the issue.
    Regards, Kiran B.
    Tuesday, September 13, 2011 11:28 AM
  • Hi,

    For some time Site A was online, although the DAG was shown in a failed state, and the DAG IP responded to ping. After a few hours Site B was online again and the DAG was shown as Online, but ping stopped and the DNS entries are no longer present.


    Regards, Kiran B.
    Wednesday, September 14, 2011 4:36 AM
  • Any idea on the above issue?


    Regards, Kiran B.
    Thursday, September 15, 2011 4:25 AM
  • Did you ever try my suggestions above and delete the DNS-Records?
    Martina Miskovic
    Thursday, September 15, 2011 4:27 AM
  • Yup but no luck
    Regards, Kiran B.
    Thursday, September 15, 2011 4:31 AM
  • My earlier post:

    Hi,

    For some time Site A was online, although the DAG was shown in a failed state, and the DAG IP responded to ping. After a few hours Site B was online again and the DAG was shown as Online, but ping stopped and the DNS entries are no longer present.


    Regards, Kiran B.
    Thursday, September 15, 2011 4:32 AM
  • Yup but no luck
    Regards, Kiran B.


    Ok, so you did delete both the A and PTR Record?
    Is the service "DHCP Client" set to Automatic startup and started on all nodes?

    Could you post some pictures of

    • Your network bindings
    • Output from cluster group "cluster group"
    • Windows Failover Cluster

    ...use Skydrive or similar.

     


    Martina Miskovic
    Thursday, September 15, 2011 4:41 AM

  • Regards, Kiran B.
    Thursday, September 15, 2011 4:55 AM
  • Hi,
    Can you also post the output from CMD--> cluster group "cluster group" /move
    Do you have any entries in the hosts file?
    Was the service "DHCP Client" started or not?


    Martina Miskovic
    Thursday, September 15, 2011 5:04 AM
  • No hosts file entries.

    DHCP Client is started.

     

    C:\Users\administrator.CEAT>cluster group "cluster group" /move

    Moving resource group 'cluster group'...

    Group                Node            Status
    -------------------- --------------- ------
    cluster group        SRVDRMB03       Online

    It tries to move to Site A multiple times and fails, and then comes back online at Site B, which is the DR site.


    Regards, Kiran B.
    Thursday, September 15, 2011 5:19 AM
  • Does it matter if you specify which node you want to move the cluster group to?
    (cluster group "cluster group" /moveto:MB01)

    Any information in Event Viewer (check all nodes)?


    Martina Miskovic
    Thursday, September 15, 2011 5:23 AM
  • You have some recent cluster error events shown in Failover Cluster Manager. What are they?
    Martina Miskovic
    Thursday, September 15, 2011 5:25 AM
  • Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          15-09-2011 10:50:09
    Event ID:      1069
    Task Category: Resource Control Manager
    Level:         Error
    Keywords:     
    User:          SYSTEM
    Computer:      SRVHOMB01.CEAT.LOCAL
    Description:
    Cluster resource 'IPv4 Static Address 1 (Cluster Group)' in clustered service or application 'Cluster Group' failed.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1069</EventID>
        <Version>0</Version>
        <Level>2</Level>
        <Task>3</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2011-09-15T05:20:09.166151000Z" />
        <EventRecordID>88388</EventRecordID>
        <Correlation />
        <Execution ProcessID="6228" ThreadID="5000" />
        <Channel>System</Channel>
        <Computer>SRVHOMB01.CEAT.LOCAL</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="ResourceName">IPv4 Static Address 1 (Cluster Group)</Data>
        <Data Name="ResourceGroup">Cluster Group</Data>
      </EventData>
    </Event>

     

     


    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          15-09-2011 10:50:20
    Event ID:      1146
    Task Category: Resource Control Manager
    Level:         Critical
    Keywords:     
    User:          SYSTEM
    Computer:      SRVHOMB01.CEAT.LOCAL
    Description:
    The cluster resource host subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource vendor.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1146</EventID>
        <Version>0</Version>
        <Level>1</Level>
        <Task>3</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2011-09-15T05:20:20.777163800Z" />
        <EventRecordID>88389</EventRecordID>
        <Correlation />
        <Execution ProcessID="6228" ThreadID="1032" />
        <Channel>System</Channel>
        <Computer>SRVHOMB01.CEAT.LOCAL</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="NodeName">SRVHOMB01</Data>
      </EventData>
    </Event>

     

     

    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          15-09-2011 10:50:20
    Event ID:      1069
    Task Category: Resource Control Manager
    Level:         Error
    Keywords:     
    User:          SYSTEM
    Computer:      SRVHOMB01.CEAT.LOCAL
    Description:
    Cluster resource 'Cluster Name' in clustered service or application 'Cluster Group' failed.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1069</EventID>
        <Version>0</Version>
        <Level>2</Level>
        <Task>3</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2011-09-15T05:20:20.839588600Z" />
        <EventRecordID>88390</EventRecordID>
        <Correlation />
        <Execution ProcessID="6228" ThreadID="1032" />
        <Channel>System</Channel>
        <Computer>SRVHOMB01.CEAT.LOCAL</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="ResourceName">Cluster Name</Data>
        <Data Name="ResourceGroup">Cluster Group</Data>
      </EventData>
    </Event>

     

     

     


    Regards, Kiran B.
    Thursday, September 15, 2011 5:34 AM
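
When comparing several events like the ones above, it can help to pull out just the event ID, the computer, and the named data fields. The following Python sketch does that with the standard library XML parser. The function name `summarize_event` is hypothetical, and the sample is a trimmed copy of the first Event 1069 posted above, keeping only the elements the sketch reads.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Windows event XML shown in the thread.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the first Event 1069 from the post above.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>1069</EventID>
    <Computer>SRVHOMB01.CEAT.LOCAL</Computer>
  </System>
  <EventData>
    <Data Name="ResourceName">IPv4 Static Address 1 (Cluster Group)</Data>
    <Data Name="ResourceGroup">Cluster Group</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text):
    """Return the event ID, computer name and named EventData fields."""
    root = ET.fromstring(xml_text)
    event_id = int(root.findtext("e:System/e:EventID", namespaces=NS))
    computer = root.findtext("e:System/e:Computer", namespaces=NS)
    data = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    return {"event_id": event_id, "computer": computer, "data": data}


if __name__ == "__main__":
    print(summarize_event(SAMPLE))
```

Run against the three events above, this makes it easy to see that the IP address resource fails first on SRVHOMB01 and the Cluster Name resource fails about eleven seconds later.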
  • If I do a node-specific move:

     

    C:\Users\administrator.>cluster group "cluster group" /moveto:srvhomb02

    Moving resource group 'cluster group'...

    Group                Node            Status
    -------------------- --------------- ------
    cluster group        SRVHOMB02       Failed


    C:\Users\administrator.>cluster group "cluster group" /moveto:srvhomb01

    Moving resource group 'cluster group'...

    Group                Node            Status
    -------------------- --------------- ------
    cluster group        SRVHOMB01       Failed


    Regards, Kiran B.
    Thursday, September 15, 2011 6:23 AM
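
To compare repeated move attempts like the ones above, the table printed by the `cluster group` command can be turned into rows. The Python sketch below assumes the fixed-width layout seen in this thread (a row of dashes defining the column positions, followed by data rows until a blank line); `parse_cluster_group` is a hypothetical helper, not part of any cluster tooling.

```python
import re

# Sample rebuilt from the output in the post above; column widths follow the dash row.
SAMPLE = "\n".join([
    "Moving resource group 'cluster group'...",
    "",
    "Group".ljust(21) + "Node".ljust(16) + "Status",
    "-" * 20 + " " + "-" * 15 + " " + "-" * 6,
    "cluster group".ljust(21) + "SRVHOMB01".ljust(16) + "Failed",
    "",
])


def parse_cluster_group(output):
    """Return (group, node, status) tuples from `cluster group` text output."""
    lines = output.splitlines()
    rows = []
    for idx, line in enumerate(lines):
        # The dash row marks where each column starts.
        if not re.fullmatch(r"\s*-+(\s+-+)+\s*", line):
            continue
        starts = [m.start() for m in re.finditer(r"-+", line)]
        for data in lines[idx + 1:]:
            if not data.strip():
                break  # a blank line ends the table
            fields = []
            for n, start in enumerate(starts):
                # The last column runs to the end of the line.
                stop = starts[n + 1] if n + 1 < len(starts) else None
                fields.append(data[start:stop].strip())
            rows.append(tuple(fields))
    return rows


if __name__ == "__main__":
    for group, node, status in parse_cluster_group(SAMPLE):
        print(f"{group!r} on {node}: {status}")
```

Slicing by column position rather than splitting on whitespace keeps the group name "cluster group" intact even though it contains a space. When the command fails with system error 5013, the dash row is followed by a blank line, so the function returns an empty list.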
  • Hi,

     

    It's solved. I did a repair of the cluster object in the management console, as below.


    Regards, Kiran B.
    Friday, September 16, 2011 6:19 AM
  • Thanks for sharing the resolution.

    But this is very strange ;)

    Cheers,


    Gulab | MCITP: Exchange 2010-2007 | Lync Server 2010 | Windows Server 2008 | Skype: Exchange.Ranger | Blog: www.ExchangeRanger.Blogspot.com
    Friday, September 16, 2011 6:26 AM
  • Hi Kiran,
    Thanks for sharing your solution with us!
    I'm glad that your problem is solved.


    Martina Miskovic
    Friday, September 16, 2011 6:26 AM
  • Hi Martina,

    Thanks for your continuous support.


    Regards, Kiran B.
    Friday, September 16, 2011 6:28 AM
  • Thank you Martina!  I had the same issue and used your suggestion of cluster group "cluster group" /move and I am back in business.

     

    Thank you again!!

    Monday, February 6, 2012 8:33 PM