2008 R2 SP1 DFS initial replication stalls about halfway

    Question

  • I'm trying to set up some DFS shares to replicate over a VPN. I started with a small share of about 500 MB and 500 files, and this worked as expected. I defined the server that had all the files as the Primary Member, set up a full mesh topology, and scheduled it to use the full bandwidth. The initial replication completed successfully.

    I next set up a slightly larger share with approx. 2 GB and 2,000 files. I set the server with the files to be the Primary Member and set the staging area to 16 GB. In both cases, the server that was not the primary initially had no files. After replicating about 809 files, it seems to have stalled, but I'm not getting any error messages in the event log. I've tried to manually restart the replication, with no effect.

    Both servers are Windows Server 2008 R2 SP1 VMs set up as file servers, with a domain functional level of 2008 R2. DNS and AD are provided by other servers. In all other respects, the servers seem to be communicating successfully through the VPN.

    Thanks, Carl

    Monday, July 09, 2012 5:28 PM

All replies

  • Hello Carl,

    I would start by checking for network-related issues. Does your network have a data quota?


    Isaac Oben MCITP:EA, MCSE, MCC

    Monday, July 09, 2012 7:49 PM
  • Hi Isaac,

    No, our network does not have any quotas. I suppose there is a maximum limit on our ISP accounts, but all the other network activity continues normally. I can remote in to both ends and manually copy files. I originally set the replication schedule to run at full bandwidth overnight and then drop to 16 Kbps during the day. If I set the bandwidth usage to Full, I see a slowdown in the network response time, as I would expect. I've been experimenting with various settings for the bandwidth usage and the staging area, including increasing the staging area to 20 GB and moving it onto a separate physical disk.

    Our users are currently on holiday and our office is shut down, so I have the network to myself. That is, there is nothing else running that would impact the WAN. It has been stuck on 809 files for several days. I could restart, but I'm trying to understand why it would do this at all.

    I'm guessing that I'll just have to back away from my plan of using the DFS-R to replicate between offices.

    Thanks, Carl

    Monday, July 09, 2012 9:33 PM
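
    For a sense of scale, here is a rough estimate of how long the 2 GB share would take at the throttled 16 Kbps daytime rate. This is only a back-of-the-envelope sketch: it assumes "2GB" means 2 GiB and "16kbps" means 16,000 bits per second, and it ignores protocol overhead and any compression DFS-R applies. The function name is illustrative, not part of any DFS tooling.

```python
GIB = 1024 ** 3
SECONDS_PER_DAY = 86_400


def transfer_seconds(size_bytes: int, rate_bits_per_sec: float) -> float:
    """Idealized time to move size_bytes at a constant bit rate."""
    return size_bytes * 8 / rate_bits_per_sec


share_bytes = 2 * GIB       # assumption: "approx. 2GB" read as 2 GiB
throttled_rate = 16_000     # assumption: 16 kbps = 16,000 bits per second

days = transfer_seconds(share_bytes, throttled_rate) / SECONDS_PER_DAY
print(f"{days:.1f} days at the throttled rate")  # prints "12.4 days at the throttled rate"
```

    A throttle this low would make the initial replication very slow, but it would not by itself explain a file count that stays fixed while the schedule is set to Full.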
  • Hello Carl,

    Just to understand, you are replicating just 2 GB of data, and your staging quota is 20 GB? You shouldn't have this much trouble with 2 GB of data. Are there any errors or warnings in the event logs?


    Isaac Oben MCITP:EA, MCSE, MCC

    Monday, July 09, 2012 9:50 PM
  • Hi Isaac,

    Yes, I agree 2GB should just be a burp; it should not be having problems.

    I rechecked the logs and there are Error 5014 entries: "The DFS Replication service is stopping communication with partner SERVER1 for replication due to an error." Each one is followed by a 5004 entry: "The DFS Replication service successfully established an inbound connection with partner SERVER1." These keep repeating every 7 or 8 minutes, and they are mirrored on the other side with the other server name.

    I now see that these started shortly after I created the first replication pair. It says under Additional Information: Error:1726 (The remote procedure call failed.)

    I don't see any DNS/AD errors in the event logs. The WAN link is a permanent VPN with static IPs, but something may be wrong with the networking.

    Thanks, Carl

    Tuesday, July 10, 2012 12:19 AM
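
    The repeating 5014/5004 pattern described above can be checked by measuring the gaps between consecutive 5014 events. The following is a minimal sketch under stated assumptions: the events are assumed to have already been exported from the DFS Replication event log as (timestamp, event ID) pairs, and the sample timestamps are made up purely for illustration, not taken from Carl's servers.

```python
from datetime import datetime, timedelta

DISCONNECT_EVENT_ID = 5014  # "stopping communication with partner"


def disconnect_intervals(events):
    """Return the time gaps between consecutive 5014 events, in order."""
    drops = sorted(ts for ts, event_id in events if event_id == DISCONNECT_EVENT_ID)
    return [later - earlier for earlier, later in zip(drops, drops[1:])]


# Hypothetical sample data: a drop (5014) followed by a reconnect (5004).
start = datetime(2012, 7, 9, 22, 0, 0)
sample_events = [
    (start, 5014),
    (start + timedelta(seconds=20), 5004),
    (start + timedelta(minutes=7, seconds=30), 5014),
    (start + timedelta(minutes=7, seconds=50), 5004),
    (start + timedelta(minutes=15), 5014),
]

for gap in disconnect_intervals(sample_events):
    print(gap)
```

    Very regular gaps would suggest a timeout or retry cycle rather than a randomly flapping link, although that inference is a guess and not something the event log states.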
  • Carl,

    How is your VPN set up? I suspect that your connection is dropping intermittently. You could test your VPN connection settings.


    Isaac Oben MCITP:EA, MCSE, MCC

    Tuesday, July 10, 2012 9:33 PM
  • Hi Isaac,

    The VPN is handled at the router hardware level. As soon as the routers boot, they reconnect to each other using static IPs. These are gigabit routers, and the servers are connected via Cat 6 cables and gigabit NICs. There is a physical server on each side of the VPN that handles AD and DNS. The file servers themselves are VMs. If I disable DFS replication on the file servers, all the event logs are clean, as in no errors regarding communication between the AD servers through the VPN. As soon as I try to enable a DFS replication pair, I get all those errors. There could be a misconfiguration in the virtual network, but all the servers can ping each other by name and I cannot see any obvious issue.

    Thanks for your assistance, I do appreciate it. I'm going to leave this for now, as I have other duties pressing.

    Thanks again.

    Wednesday, July 11, 2012 4:58 PM
  • Hi Carl,

    Since you mentioned error 1726, try the steps mentioned in this thread:

    DFS Replication failing for large files

    http://social.technet.microsoft.com/Forums/en-US/winserverfiles/thread/93993c45-5361-4f04-a60c-1863a7ec7a4c



    • Marked as answer by Carl Mudryk Thursday, August 02, 2012 3:26 PM
    Monday, July 16, 2012 7:52 AM
  • Hi Shaon,

    I finally got another chance to look at this and tried the suggestions in the thread you mentioned, with success... more or less. I changed the MTU setting for TCP/IP on the servers and routers to 1472 (1500-28). It seems to have resolved most of the problems, and I was able to replicate a test folder. The DFS log on the Primary server is 100% clean, with no errors. On the non-primary server, I still saw the 1726 errors during the initial replication, but none after that. The folder appears to be staying in sync.

    I'm going to try adding some more folder replication pairs and see what happens. I'll report back.

    Thanks, Carl

    Thursday, August 02, 2012 12:43 AM
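
    On the "1500-28" arithmetic above: 28 bytes is the 20-byte IPv4 header plus the 8-byte ICMP header, so 1472 is the largest ping payload that fits unfragmented in a 1500-byte MTU. The sketch below shows one way to search for the largest payload a path passes with the Don't Fragment flag set. It is an illustration, not the procedure from the linked thread: the helper names are hypothetical, the probe shells out to the Windows `ping` command with `-f` (don't fragment), `-l` (payload size) and `-n 1` (one echo), and treating "TTL=" in the output as a successful reply is a heuristic assumption.

```python
import subprocess
from typing import Callable

IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def windows_ping_probe(host: str) -> Callable[[int], bool]:
    """Build a probe that sends one DF-flagged ping of a given payload size."""
    def probe(payload: int) -> bool:
        result = subprocess.run(
            ["ping", "-f", "-l", str(payload), "-n", "1", host],
            capture_output=True, text=True,
        )
        # Assumption: a real echo reply line contains "TTL=".
        return "TTL=" in result.stdout
    return probe


def largest_unfragmented_payload(probe: Callable[[int], bool],
                                 low: int = 0, high: int = 1472) -> int:
    """Binary search for the largest payload the probe accepts.

    Assumes that if a size passes, every smaller size passes too.
    Returns -1 if even the smallest size fails.
    """
    if not probe(low):
        return -1
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Example use against a partner server (host name is a placeholder):
# best = largest_unfragmented_payload(windows_ping_probe("SERVER1"))
# print("path MTU is about", best + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES)
```

    If the largest passing payload over the VPN comes out below 1472, the tunnel is reducing the usable MTU, which would be consistent with large transfers failing while small ones succeed.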
  • Hi Shaon,

    I added another pair of folders for replication and let it run overnight and both servers ran without errors. I'm thinking this is fixed.

    Thank you all for your assistance.

    Carl

    Thursday, August 02, 2012 3:42 PM