DFS Replication Issues

    Question

  • I am having an issue with a replication group that is stuck in the "awaiting initial replication" state. A second replication group is working without issue.

    I have run DCDiag and everything is communicating correctly.

    So next I ran:

    Dfsradmin Membership Set /RGName:Corporate /RFName:Corporate_Share /MemName:SERVER1 /IsPrimary:True

    Dfsrdiag Pollad /Member:SERVER1
    Dfsrdiag Pollad /Member:SERVER2

    I then tried to force the replication with:

    Dfsrdiag syncnow /rgname:Corporate /partner:SERVER1 /time:1000 /verbose

    I went to the event log (on SERVER2 with the sync issues) to see what was getting logged at that point.

    Every three to four minutes I get this message:

    The DFS Replication service is stopping communication with partner SERVER1 for replication group Corporate due to an error. The service will retry the connection periodically. 
    Additional Information:
    Error: 1726 (The remote procedure call failed.)

    One to two seconds later this event appears:

    The DFS Replication service successfully established an inbound connection with partner SERVER1 for replication group Corporate. 
    Additional Information:
    Connection Address Used: SERVER1

    At that point I was thinking it might be the local ISP dropping out, so I dropped a test file on SERVER1 in the group that is working, and within a minute it appeared on SERVER2. I then went to SERVER2 and deleted the file, and within a minute or two the file was removed from SERVER1.

    I went back to the DFS report and re-ran it, and it still shows the SERVER1 Corporate group awaiting initial sync.
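
    Besides the health report, the per-folder state that DFSR itself reports can be read from the DfsrReplicatedFolderInfo WMI class in the root\MicrosoftDfs namespace. The following is only a sketch, assuming Python and PowerShell are both available on the member server. The function names are made up for illustration, and the state table uses the commonly documented values (2 = Initial Sync, 4 = Normal).

```python
import json
import subprocess

# Commonly documented values of DfsrReplicatedFolderInfo.State.
DFSR_STATES = {
    0: "Uninitialized",
    1: "Initialized",
    2: "Initial Sync",
    3: "Auto Recovery",
    4: "Normal",
    5: "In Error",
}

# PowerShell pipeline that dumps the state of every replicated folder as JSON.
PS_QUERY = (
    "Get-CimInstance -Namespace root\\MicrosoftDfs "
    "-ClassName DfsrReplicatedFolderInfo | "
    "Select-Object ReplicationGroupName, ReplicatedFolderName, State | "
    "ConvertTo-Json"
)


def parse_folder_states(json_text):
    """Turn the JSON from PS_QUERY into (group, folder, state name) tuples."""
    if not json_text.strip():
        return []
    data = json.loads(json_text)
    # ConvertTo-Json emits a single object, not a list, when there is one row.
    if isinstance(data, dict):
        data = [data]
    return [
        (
            row["ReplicationGroupName"],
            row["ReplicatedFolderName"],
            DFSR_STATES.get(row["State"], "Unknown ({})".format(row["State"])),
        )
        for row in data
    ]


def query_folder_states():
    """Run the query locally (Windows only) and return the parsed rows."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_QUERY],
        capture_output=True, text=True, check=True,
    )
    return parse_folder_states(result.stdout)
```

    Calling query_folder_states() on SERVER2 would list each replicated folder with its state. A folder that stays in "Initial Sync" is still working through initial replication rather than being idle.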

    There are three servers in total in this replication group. SERVER1 and SERVER3 replicate with no issues, and one replication group on SERVER2 is replicating, but the Corporate replication group is not.

    Anyone have thoughts? Ideas on how to troubleshoot further?

    Thanks for your help and input.


    Cheers,
    Curt Winter
    Certified Microsoft Professional
    Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied. If you found my post helpful, please mark it as the answer.

    Wednesday, June 13, 2018 5:44 PM

Answers

  • OK, it turned out that there was a large backlog it needed to work through. It was actually doing the initial replication, but it had over 200,000 files in the backlog. Once it worked through the backlog, replication started working in real time.

    Thank you for the ideas, just wanted to reply with where we ended up.

    Using the DFS health report, we started tracking the backlog numbers, and that is how we figured out it was actually doing something.
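
    The same backlog numbers can also be sampled from the command line with "dfsrdiag backlog /rgname:Corporate /rfname:Corporate_Share /smem:SERVER1 /rmem:SERVER2". The sketch below is not what we actually ran. It is a rough illustration in Python of pulling the count out of that output and estimating the remaining time from two samples. The function names are hypothetical, and the "Backlog File Count:" and "No Backlog" strings are assumptions about the output format that should be checked against your own servers.

```python
import re

# Assumed shape of the count line, e.g. "Member <SERVER2> Backlog File Count: 200000".
BACKLOG_RE = re.compile(r"Backlog File Count:\s*(\d+)", re.IGNORECASE)


def parse_backlog_count(output):
    """Return the backlog count from dfsrdiag output, or None if not found."""
    match = BACKLOG_RE.search(output)
    if match:
        return int(match.group(1))
    # Assumed message when sender and receiver are already in sync.
    if "no backlog" in output.lower():
        return 0
    return None


def estimate_hours_remaining(samples):
    """Estimate hours until the backlog reaches zero.

    samples is a list of (hours_elapsed, backlog_count) pairs, oldest first.
    Uses the average drain rate between the first and last sample and
    returns None when the backlog is not shrinking.
    """
    (t_first, b_first), (t_last, b_last) = samples[0], samples[-1]
    if t_last <= t_first or b_last >= b_first:
        return None
    files_per_hour = (b_first - b_last) / (t_last - t_first)
    return b_last / files_per_hour
```

    For example, going from 200,000 to 180,000 files in two hours is a drain rate of 10,000 files per hour, so estimate_hours_remaining([(0, 200000), (2, 180000)]) gives 18.0 hours. A steadily shrinking count is the sign that initial replication is progressing, even while the report still says it is awaiting initial replication.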


    Cheers,
    Curt Winter
    Certified Microsoft Professional

    • Marked as answer by Curt Winter Friday, June 22, 2018 3:19 PM
    Friday, June 22, 2018 3:19 PM

All replies

  • Hi Curt,

    According to the error message, this is more likely a connection problem between the two DFS servers.

    "Error: 1726 (The remote procedure call failed.)" points to firewall or network connectivity issues.

    Could you please disable the AV software and firewall on both SERVER1 and SERVER2?

    Please also update the NIC driver to the latest version.

    You could also refer to this earlier thread:

    https://social.technet.microsoft.com/Forums/ie/en-US/68c4f402-6c77-4388-9701-51a4fc112964/error-1726-the-remote-procedure-call-failed-every-7-minutes-dfsr-backlogs?forum=winserverDS

    In addition, according to your description, only one DFS replication group is affected. You could also consider deleting this replication group, creating a new one, adding the DFS target, and then checking whether replication finishes.
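
    For the connectivity check mentioned above, a quick first test is whether SERVER2 can open a TCP connection to the RPC endpoint mapper (port 135) on SERVER1. The following is a minimal sketch in Python, and the helper name is illustrative only. Please note that DFSR also uses a dynamically assigned RPC port unless a static one has been configured, so a successful connection to port 135 alone does not prove the replication port is reachable.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example (run on SERVER2): can_connect("SERVER1", 135)
```

    If this returns False while the firewalls are reported as off, a router or AV filter between the two servers would be the next thing to check.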

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Thursday, June 14, 2018 3:16 AM
    Moderator
  • Mary,

    Thanks for the link and the ideas. The firewalls are already off; I will look at the AV software and the routers.

    A NIC update is a good idea also.

    Curt


    Cheers,
    Curt Winter
    Certified Microsoft Professional

    Thursday, June 14, 2018 2:10 PM
  • Hi,

    If you have any updates, please feel free to contact us.

    Best Regards,

    Mary



    Friday, June 15, 2018 1:30 AM
    Moderator