Server 2008 R2 AD Replication issues

    Question

  • Hi Guys,

    We have a 2008 R2 multi-site, single-domain environment and are having issues with replication. We have an IPsec tunnel over a WAN separating the locations, all on separate subnets.

    In the central location, we have two domain controllers (DC1 & DC2) and they can replicate fine. When I created StoreDC1, it was created in the central location and was replicating fine.

    When it was moved to site, the forwarders were set up correctly, the Secondary DNS was set to DC1 and DNS on both ends resolves the new, correct IP addresses. I can not confirm if there has been any outbound replication since the server was brought online on site, but I can confirm there has been some inbound.

    The problem seemed to occur with the amount of time the tunnel was down before the servers were able to contact each other. It was around two months.

    Something that stands out to me is the error I get when UNC browsing from StoreDC1 to DC1. I get "Logon Failure: The target account name is incorrect". UNC browsing from DC1 to StoreDC1 works fine.

    When running Repadmin /replsummary from DC1, it shows no fails to any of the replication partners. When you run the same command from StoreDC1, you get "experienced the following operational errors trying to retrieve replication information:

    8341 - DC1.Domain.local

    8341 - DC2.Domain.local"

    When running Repadmin /showrepl *, I get the same errors from both DC1 and StoreDC1: "The target principal name is incorrect" and "***WARNING: KCC could not add this REPLICA LINK due to error".

    Any help would be appreciated, and if you need more in-depth error information I would be happy to provide it.

    Josh.

    Sunday, December 04, 2011 1:45 PM

Answers

  • Hi All,

    Again, big thanks for all your help.

    I am working through the lists above and just wanted to add some extra notes whilst I am.

    The issue of UNC browsing, is only by name. If I UNC browse to DC1 from StoreDC1, I get the error "The target account name is incorrect", but when I UNC browse via IP address (e.g. \\10.4.1.5) it connects fine.

    Why would this work via IP address but not name? The DNS resolves correctly via ping.


    I would like to point out that if you get this error when using the computer's name in the UNC path but not when using the IP address, name resolution for UNC paths is based on Direct SMB and NetBIOS. In this case, credentials are based on what NetBIOS has from the Browser service.

    Here's a thread that seems pertinent to what you're seeing (you may need an account to view it, or you can copy the link below, paste it in a Bing or Google search, and click on the first result to get into it):
    http://www.experts-exchange.com/Networking/Windows_Networking/Q_23063424.html

    Since we haven't seen configuration information, such as an ipconfig /all from each DC, any event log errors (event IDs and source names), the repadmin results, whether you use WINS (and whether the old name is in its database), whether the DCs are multihomed or run RRAS, etc., it will be difficult to nail down. This sort of info often helps us come up with a diagnosis; otherwise, we can only offer generalized possibilities.

    If you can elaborate and offer some of this, it may help to come up with a specific diagnosis.


    Ace Fekay
    MVP, MCT, MCITP EA, MCTS Windows 2008 & Exchange 2007 & Exchange 2010, Exchange 2010 Enterprise Administrator, MCSE & MCSA 2003/2000, MCSA Messaging 2003
    Microsoft Certified Trainer
    Microsoft MVP - Directory Services
    Complete List of Technical Blogs: http://www.delawarecountycomputerconsulting.com/technicalblogs.php

    This posting is provided AS-IS with no warranties or guarantees and confers no rights.

    Saturday, December 17, 2011 5:28 PM

  • Since you get "target principal name is incorrect" when accessing the DC's UNC path, this indicates that the secure channel between the DCs is broken.

    Reference link:http://sandeshdubey.wordpress.com/2011/10/02/secure-channel-between-the-dcs-broken/


    This issue typically indicates a Kerberos authentication problem, although there are several exceptions. These steps outline how to resolve the authentication failure.

    Verify the "Access this computer from the network" user right.
    Everyone, Authenticated Users, and Enterprise Domain Controllers must have that user right for successful replication.

    Important: 
    Some customers remove the Everyone group from "Access this computer from the network", which is acceptable as long as Authenticated Users and Enterprise Domain Controllers are listed.

    Check the CrashOnAuditFail registry key

    Check the <computername>_regentries.txt file in the Directory Services MPSReports to confirm whether crashonauditfail [REG_DWORD] = 0x2

    If CrashOnAuditFail = 0x2, perform the following steps:

    Click Start, click Run, and type regedit.

    Expand HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\LSA

    Right-click crashonauditfail, select Modify

    Under Value data:, select 2 and change the value to 0

    Reboot domain controller.

    Check the time skew between domain controllers. See Knowledge Base article 257187.
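
    To make the time skew check concrete: Kerberos tolerates a clock difference of 5 minutes by default ("Maximum tolerance for computer clock synchronization"). The sketch below only compares two clock readings you have collected yourself; the function name and the sample timestamps are made up for illustration and do not come from this thread.

```python
from datetime import datetime, timedelta

# Default Kerberos policy value for
# "Maximum tolerance for computer clock synchronization".
MAX_SKEW = timedelta(minutes=5)


def skew_ok(time_a, time_b, max_skew=MAX_SKEW):
    """Return True if two clock readings are within the allowed skew."""
    return abs(time_a - time_b) <= max_skew


# Hypothetical readings taken at the same moment on two DCs.
dc1 = datetime(2011, 12, 5, 10, 0, 0)
storedc1 = datetime(2011, 12, 5, 10, 7, 30)
print(skew_ok(dc1, storedc1))
```

    If the readings differ by more than the tolerance, fix time synchronization before looking at anything else, since Kerberos authentication between the DCs will keep failing until the clocks agree.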

    Ensure the Trust computer for delegation check box is selected on the General tab of the domain controller Properties dialog box in Active Directory Users and Computers.

    Using Adsiedit or Ldp (both included in the Windows 2000 Support Tools), confirm that the userAccountControl attribute is set to 532480. To check this, perform the following steps:

    Click Start, click Run, and type adsiedit.msc.

    Expand the Domain NC container.

    Expand the object below, i.e. DC=Contoso, DC=COM

    Expand OU=Domain Controllers

    Right-click CN=<domain_controller>, and select Properties

    Under Select a property to view, select userAccountControl and verify the value is 532480

    Note: 
    Check this value for each failing DC account on the local copy of AD for every partner DC. For example if DC-A and DC-B are failing replication, check the above on DC-A’s copy of AD and DC-B’s copy of AD.
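
    As a sanity check on the number itself: 532480 is 0x82000, the sum of the SERVER_TRUST_ACCOUNT (0x2000) and TRUSTED_FOR_DELEGATION (0x80000) userAccountControl flags. A minimal sketch for decoding a value you have read from ADSI Edit; the function name is hypothetical and only a handful of flags are listed.

```python
# A few userAccountControl flag values from Microsoft's documentation.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}


def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


print(decode_uac(532480))
```

    A DC account that decodes to anything other than those two flags (for example WORKSTATION_TRUST_ACCOUNT, value 4096) is worth a closer look.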

    Reset Password and Refresh Kerberos Tickets

    Follow these steps to reset the domain controller's machine account password:

    1. Stop the Key Distribution Center (KDC) service on all domain controllers except the PDC role holder. To do so, open a Command Prompt, type net stop KDC, and press Enter.

    2. Load Kerbtray.exe on the problem DC (Server7 in this example). You can do so by clicking Start, clicking Run, and then typing c:\program files\resource kit\kerbtray.exe and pressing Enter. You should see a little green ticket icon in your system tray in the lower right corner of your desktop.

    3. Purge the ticket cache on Server7: right-click the green ticket icon in your system tray, and then click Purge Tickets. You should receive a confirmation that your ticket cache was purged. Click OK.

    4. Reset the problem domain controller's account password against the PDC emulator (server2 in this example). To do so, open a command prompt and type: netdom resetpwd /server:server2 /userd:domain.com\administrator /passwordd:password, and then press Enter.

    5. Synchronize the domain. To do so, open a command prompt, type repadmin /syncall, and then press Enter.

    6. Start the KDC service on Server7 and all other DCs. To do so, open a command prompt, type net start KDC, and press Enter. This completes the process.
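
    The steps above can be written down as an ordered command list for the problem DC. This is only a sketch: the function name is hypothetical, nothing is executed, "server2" and "domain.com\administrator" are the placeholders from the steps, and klist purge is assumed here as a built-in alternative to Kerbtray's Purge Tickets (it clears the tickets of the current logon session).

```python
def secure_channel_reset_commands(pdc_emulator, domain_admin):
    """Return the commands from the steps above, in order, for the problem DC.

    Nothing is run; the list is meant to be reviewed and typed by hand.
    """
    return [
        "net stop kdc",
        # Assumed alternative to Kerbtray's "Purge Tickets".
        "klist purge",
        # /passwordd:* makes netdom prompt for the password
        # instead of taking it on the command line.
        f"netdom resetpwd /server:{pdc_emulator} "
        f"/userd:{domain_admin} /passwordd:*",
        "repadmin /syncall",
        "net start kdc",
    ]


for line in secure_channel_reset_commands("server2", r"domain.com\administrator"):
    print(line)
```

    Keeping the list in one place makes it easier to run the steps in the right order during an after-hours window.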

    Reference link:http://sandeshdubey.wordpress.com/2011/10/02/secure-channel-between-the-dcs-broken/

    Also ensure the following on each DC:
    1. Each DC / DNS server points to its own private IP address as primary DNS server and to other remote/local DNS servers as secondary in TCP/IP properties.
    2. Each DC has just one IP address and a single enabled network adapter.
    3. Contact your ISP, get valid DNS IPs from them, and add them to the forwarders. Do not set a public DNS server in the TCP/IP settings of a DC, and do not put private DNS IP addresses in the forwarder list.
    4. Once you are done, run "ipconfig /flushdns & ipconfig /registerdns" and restart the DNS and NETLOGON services on each DC.
    5. Assign a static IP address to the DC if its address is currently assigned by a DHCP server. A DHCP-assigned address on a DC is strongly discouraged.


    You can also review the following articles:

    Fixing Replication Connectivity Problems (Event ID 1925)
    http://technet.microsoft.com/en-us/library/cc780728.aspx

    Network connectivity problems can make it impossible for domain controllers to form replication partnerships. Various events and errors can indicate a problem with network connectivity that is preventing replication from occurring.
     
    Troubleshooting Event ID 1311: Knowledge Consistency Checker
    http://support.microsoft.com/kb/214745

    Cause: This behavior can occur if the Knowledge Consistency Checker (KCC) has determined that a site has been orphaned from the replication topology.

    In addition to the articles above, the following articles are useful for troubleshooting AD replication issues:

    Troubleshooting replication
    http://technet.microsoft.com/en-us/library/cc755349.aspx

    How to troubleshoot Event ID 1311 messages on a Windows 2000 domain
    http://support.microsoft.com/kb/307593

    Event ID 1925: Attempt to establish a replication link failed due to connectivity problem
    http://technet.microsoft.com/en-us/library/cc787129.aspx

     Event ID 1311 — KCC Replication Path Computation
    http://technet.microsoft.com/en-us/library/cc756493.aspx

     Hope this helps.

    Regards,
    Sandesh Dubey.
    -------------------------------
    MCSE|MCSA:Messaging|MCTS|MCITP:Enterprise Adminitrator
    My Blog: http://sandeshdubey.wordpress.com
    This posting is provided AS IS with no warranties, and confers no rights.

    Sunday, December 18, 2011 4:03 AM
  • Thanks everyone for all your help.

    I was able to resolve this issue by repairing the secure channel that was broken between the DC's.

    http://sandeshdubey.wordpress.com/2011/10/02/secure-channel-between-the-dcs-broken/

    This was a very simple fix we performed after hours. After this fix, a repadmin /syncall did not seem to push and pull all replication, but the next day everything had replicated perfectly.

    Just a few notes if anyone else is experiencing this same issue:

    • I could UNC browse by IP address, but not by name from the remote server to the central server
    • I could UNC browse by IP address and name from the central server to the remote server
    • For me, this error started spreading and affecting the remote site users so we needed to resolve ASAP
    • This issue highlighted another issue in sites and services with regards to replication topology. As we have a hub and spoke network, all sites were trying to replicate between each other even though the site links were specified. Later on today we will be unticking the box 'Bridge All Site Links'
    • We have Hamachi on each DC and so far this has not caused any replication/domain issues we are aware of.
    • Marked as answer by JoshGoldman Tuesday, January 03, 2012 9:43 AM
    Tuesday, January 03, 2012 9:43 AM

All replies

  • Hello,

    Please proceed as follows.

    Point the remote DC to the main office DC / DNS servers in its IPv4 settings (one as primary DNS server and the other as secondary). Once done, run ipconfig /registerdns and restart netlogon on each DC you have.

    Please also make sure that needed ports for AD replication are not blocked: http://technet.microsoft.com/en-us/library/bb727063.aspx

    Use PortQry v2 for checking.

    If replication issues persist, then use Microsoft Skydrive to upload the output of the dcdiag > c:\dcdiag.txt and ipconfig /all > c:\ipconfig.txt commands from all DCs you have. Once done, post a link here.

     


    This posting is provided "AS IS" with no warranties or guarantees , and confers no rights.

    Microsoft Student Partner 2010 / 2011
    Microsoft Certified Professional
    Microsoft Certified Systems Administrator: Security
    Microsoft Certified Systems Engineer: Security
    Microsoft Certified Technology Specialist: Windows Server 2008 Active Directory, Configuration
    Microsoft Certified Technology Specialist: Windows Server 2008 Network Infrastructure, Configuration
    Microsoft Certified Technology Specialist: Windows Server 2008 Applications Infrastructure, Configuration
    Microsoft Certified Technology Specialist: Windows 7, Configuring
    Microsoft Certified Technology Specialist: Designing and Providing Volume Licensing Solutions to Large Organizations
    Microsoft Certified IT Professional: Enterprise Administrator
    Microsoft Certified IT Professional: Server Administrator
    Microsoft Certified Trainer

    Sunday, December 04, 2011 5:31 PM
  • If it's been more than 2 months, and this domain was upgraded from an original domain created prior to Windows 2003 SP1, then the AD tombstone lifetime may be 60 days.

    What event log errors show up on either DC? Perhaps seeing Event IDs 1388, 1988, or 2042?

    Here's more on those Event IDs, if you are seeing them, and how to recover if it truly is a lingering objects issue due to tombstone expiration:

    Fixing Replication Lingering Object Problems (Event IDs 1388, 1988, 2042)
    http://technet.microsoft.com/en-us/library/cc738018(WS.10).aspx

    Event ID 2042: It has been too long since this machine replicated
    This shows you how to recover a DC that has not replicated for longer than the tombstone lifetime
    http://technet.microsoft.com/en-us/library/cc757610(WS.10).aspx

     

    Keep in mind, too, that if you are getting any of the above events and the DCs were out of contact for less than the tombstone lifetime (say the lifetime is 180 days and you reconnected them after 2 months), then the cause could instead be a firewall and/or Windows time skew issue. So that leads to a possible time service issue.

     

    Lingering objects

    If you are seeing the above events, and it's not really a firewall problem (which we haven't determined yet), then you may have lingering objects. Lingering objects occur when a domain controller remains offline long enough to exceed the tombstone lifetime and therefore retains objects that have been permanently deleted from the directory on all other domain controllers in the domain. If a DC is reintroduced past the tombstone period (its point of no return), lingering objects can cause directory inconsistency and, under certain conditions, can be reintroduced into the directory.

    To determine the Forest's tombstone, run the following:

    dsquery * "cn=directory service,cn=windows nt,cn=services,cn=configuration,dc=<forestDN>" -scope base -attr tombstonelifetime

    If the value returned is <not set>, then it's 60 days. Otherwise, the value returned is the tombstone lifetime in days (180 for forests created on Windows Server 2003 SP1 or later).

    The tombstone lifetime depends on the OS version used when the domain/forest was first created, provided the value was never changed:

    - Windows 2000 with all SPs = 60 Days
    - Windows Server 2003 without SP = 60 Days
    - Windows Server 2003 SP1 = 180 Days
    - Windows Server 2003 R2 SP1, installed with both R2 disks = 60 Days
    - Windows Server 2003 R2 SP1, installed with the 1st R2 disk = 180 Days
    - Windows Server 2003 SP2 = 180 Days
    - Windows Server 2003 R2 SP2 = 180 Days
    - Windows Server 2008 = 180 Days
    - Windows Server 2008 R2 = 180 Days
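
    To put the numbers together, here is a small sketch that compares the time a DC has gone without replicating against the tombstone lifetime. The function names and the dates are made up for illustration; the only rule taken from above is that an unset tombstoneLifetime means 60 days.

```python
from datetime import date


def effective_tsl(attr_value):
    """Tombstone lifetime in days; an unset attribute means the 60-day default."""
    return 60 if attr_value is None else int(attr_value)


def past_tombstone(last_replication, today, attr_value):
    """True if the DC has been out of contact longer than the tombstone lifetime."""
    return (today - last_replication).days > effective_tsl(attr_value)


# Hypothetical dates: roughly two months without replication.
last_ok = date(2011, 10, 1)
now = date(2011, 12, 4)
print(past_tombstone(last_ok, now, 180))   # tombstoneLifetime returned 180
print(past_tombstone(last_ok, now, None))  # tombstoneLifetime not set
```

    With these sample dates (64 days apart) the DC is still inside a 180-day lifetime but past a 60-day one, which is why the value returned by dsquery matters for a two-month outage.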

    Determine the tombstone lifetime for the forest
    http://technet.microsoft.com/en-us/library/cc784932(WS.10).aspx

    Good thread regarding the AD Tombstone:
    Technet Forum: DC offline for 2 months, best way to handle?
    http://social.technet.microsoft.com/Forums/en-US/winserverDS/thread/8c74df53-8042-423c-a801-7a7f38fdde7f

     

     

    Before doing anything...

    Let's first determine if there are any firewall blocks. As Mr X said, PortQry is the tool to use. Read more on firewall ports and PortQry below. Note: don't bother using nmap or other port scanners; they don't look for the specific responses or ports that PortQry checks for with AD.

    Active Directory Firewall Ports - Let's Try To Make This Simple
    http://msmvps.com/blogs/acefekay/archive/2011/11/01/active-directory-firewall-ports-let-s-try-to-make-this-simple.aspx 

     

    Let us know if PortQry finds any ports FILTERED or NOT LISTENING. They are the ones being blocked.

     

     

     


    Ace Fekay
    Sunday, December 04, 2011 8:40 PM
  • Apart from the other suggestions, try to reset the secure channel on the DC using a domain admin account. Many times the "target principal name is incorrect" and logon failure errors are caused by a broken secure channel (the DCs aren't able to refresh their passwords). A DC authenticates itself to other DCs to push/pull changes, and if the secure channel between the DCs is broken, issues arise.

    netdom resetpwd /server:server2 /userd:mydomain\administrator /passwordd:*

    http://support.microsoft.com/kb/260575

    I agree with Ace. If this domain was upgraded from Windows 2000, the TSL value is 60 days, and it is not automatically changed to 180 days even after you remove the last Windows 2000 DC and run all the DCs on Windows 2008. If the system has passed the TSL period, demoting and promoting it again will be simpler here.

    http://technet.microsoft.com/en-us/library/dd772723%28WS.10%29.aspx

    Also, a few more questions:

    - How long has it not replicated? This can be verified from the event log or by running repadmin /showreps

    - Have the necessary ports been opened on the firewall to allow AD sync?

    - Security software such as antivirus can also block the communication


    Regards  


    Awinish Vishwakarma

    MY BLOG:  awinish.wordpress.com


    This posting is provided AS-IS with no warranties/guarantees and confers no rights.
    Monday, December 05, 2011 9:30 AM
  • Hi All,

    Thanks for your feedback.

    I started checking the ports this morning. Certain ports are listed as "Listening or Filtered" and one port as "Not Listening". I can confirm there is no firewall enabled on the machine and no anti-virus.

    I ran this same check with another site we have, which is set up exactly the same. That site is able to replicate fine and the ports in that site are showing the same status. That leads me to think it might not be the ports?

    This Domain has a functional level of 2008 and was created at that level. I have run the tombstone information command listed above and it has returned 180, so I can rule that one out also.

    I will wait until after hours to make changes to the secure channel as I do not know what effect it will have on the users. I will feed back later on that one.

    The other information you wanted me to provide:

    -From the command Repadmin /showreps I can see there have been failures since 11/11/11, although it was down for weeks before that
    -The firewalls on both ends have had the ports fully opened to test if that was the cause
    -Antivirus has been disabled on both ends. 

    Josh.

    Monday, December 05, 2011 10:54 AM
  • Josh,

    You mentioned the firewall ports were fully opened for testing? Does that mean that they are normally restricted? All those ports need to be wide open at all times.

    Also, you may want to uninstall the antivirus. Some of them leave settings enabled even if you disable the antivirus. If you don't want to uninstall it, then look for some sort of security setting similar to, "protect network traffic," or something like that. Each one is different. That is the setting that will hurt DC replication, as well as affect client to DC communications. That's what you need to disable.

    Ace

     


    Ace Fekay
    Monday, December 05, 2011 7:31 PM
  • Hi,

     

    How is everything going? Could you please tell us the present situation? If you need any further assistance, please do not hesitate to respond back.

     

    Thanks!



    Elytis Cheng

    TechNet Community Support

    Wednesday, December 07, 2011 1:42 AM
  • Check the firewall; the ports below should be open for AD replication.

    Port Assignments for Active Directory Replication
    Service Name    UDP    TCP
    LDAP            389    389
    LDAP                   636
    GC                     3268
    Kerberos        88     88
    DNS             53     53
    SMB over IP     445    445
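
    For a quick first pass over the TCP column of this table, a plain connect test can be sketched as below. This is not a substitute for PortQry: it only shows whether a TCP connection succeeds, it does not test UDP, and it does not query the service behind the port. The function names are hypothetical.

```python
import socket

# TCP ports from the table above (UDP is not tested by this sketch).
AD_TCP_PORTS = [
    ("LDAP", 389),
    ("LDAP", 636),
    ("GC", 3268),
    ("Kerberos", 88),
    ("DNS", 53),
    ("SMB over IP", 445),
]


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dc(host, ports=AD_TCP_PORTS, timeout=3.0):
    """Map each (service, port) pair to the result of a TCP connect attempt."""
    return {(name, port): tcp_port_open(host, port, timeout) for name, port in ports}
```

    For example, calling check_dc("dc1.domain.local") from the remote DC (the host name is a placeholder) returns a dictionary you can print or compare with the same check run from a site that replicates correctly.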

    Reference:http://technet.microsoft.com/en-us/library/dd772723%28WS.10%29.aspx

    Since you get "target principal name is incorrect" when accessing the DC's UNC path, this indicates that the secure channel between the DCs is broken.
    http://sandeshdubey.wordpress.com/2011/10/02/secure-channel-between-the-dcs-broken/

    There could also be other reasons; refer to the link below for more details.
    http://social.technet.microsoft.com/Forums/en-US/winserverDS/thread/e9c162cb-1e26-43e0-80df-73c491c22aac/


    Hope this helps

    Regards,
    Sandesh Dubey.

    Wednesday, December 07, 2011 7:35 AM
  • 1. First try to ping the DC. Second, are you able to ping the DSA GUID? If the DC is pingable but the GUID is not, see the link below.

    http://social.technet.microsoft.com/wiki/contents/articles/how-dcs-find-one-another.aspx

    2. Some ports should be open for AD replication, see the below link for that.

    http://technet.microsoft.com/en-us/library/dd772723(WS.10).aspx  

    3. Use PortQry for port troubleshooting. You can use the GUI tool (portqueryui.exe).

    http://support.microsoft.com/kb/310099

    4. Repadmin command for experts, you can use the repadmin for Replication Troubleshooting:

    http://technet.microsoft.com/en-us/library/cc811549(WS.10).aspx

    5. Lingering objects prevent Active Directory replication from occurring

    http://support.microsoft.com/kb/317097


    Best regards Biswajit Biswas Disclaimer: This posting is provided "AS IS" with no warranties or guarantees , and confers no rights. MCP 2003,MCSA 2003, MCSA:M 2003, CCNA, MCTS, Enterprise Admin
    Wednesday, December 07, 2011 1:54 PM
  • Hi All,

    Again, big thanks for all your help.

    I am working through the lists above and just wanted to add some extra notes whilst I am.

    The issue of UNC browsing, is only by name. If I UNC browse to DC1 from StoreDC1, I get the error "The target account name is incorrect", but when I UNC browse via IP address (e.g. \\10.4.1.5) it connects fine.

    Why would this work via IP address but not name? The DNS resolves correctly via ping.
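
    One likely explanation for the name-versus-IP difference, offered as a sketch rather than a confirmed diagnosis: when the UNC path uses a host name, the client builds a service principal name and asks for a Kerberos ticket, and that ticket is encrypted with the target DC's machine account password as known to the KDC that issued it. If that stored password is stale, the target cannot decrypt the ticket and the "target account name is incorrect" error results. With a bare IP address no SPN is built by default, so the client falls back to NTLM and the connection works. The function below only illustrates that decision and its name is hypothetical.

```python
import ipaddress


def likely_auth_protocol(unc_path):
    r"""Guess which protocol a default Windows client tries first for \\target\share."""
    target = unc_path.lstrip("\\").split("\\", 1)[0]
    try:
        ipaddress.ip_address(target)
    except ValueError:
        # A host name lets the client build an SPN (cifs/<name>)
        # and request a Kerberos ticket for it.
        return "Kerberos"
    # No SPN is built for a bare IP address by default, so NTLM is used.
    return "NTLM"


print(likely_auth_protocol(r"\\10.4.1.5"))
print(likely_auth_protocol(r"\\DC1\SYSVOL"))
```

    This would also fit the two-month outage described earlier, since machine account passwords change periodically and the change could not replicate while the tunnel was down.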

    Saturday, December 17, 2011 3:37 PM
  • Hi All,

    Again, big thanks for all your help.

    I am working through the lists above and just wanted to add some extra notes whilst I am.

    The issue of UNC browsing, is only by name. If I UNC browse to DC1 from StoreDC1, I get the error "The target account name is incorrect", but when I UNC browse via IP address (e.g. \\10.4.1.5) it connects fine.

    Why would this work via IP address but not name? The DNS resolves correctly via ping.

    Make sure that NetBIOS over TCP/IP is enabled and that you have the correct DNS suffix.




    Saturday, December 17, 2011 3:41 PM
  • Just did that now. No luck :(
    Saturday, December 17, 2011 4:47 PM