locked
DFSR Initial Sync Extremely slow RRS feed

  • Question

  • We have an environment with two Windows 2008 R2 servers with four replication groups configured. Both servers are on the same subnet and in the same AD site. They are running on VMWare 5.0 with each server on a different host.

    Replication has been running OK for about a year on these servers. Five days ago, I noticed that replication had stopped processing across both servers in all of the replication groups. I restarted the DFS replication service and replication started processing on three of the RGs, but the fourth was still not processing. After reviewing the logs, I determined that it looked like the DFS database for that RG was corrupted, so I deleted the RG and recreated it. I have been monitoring the DFS backlogs on the server ever since, and it has only been growing.

    The folder in this RG is 690 GB in size and contains 1,028,659 files. When I review the debug logs on each server, it appears that the primary server is still in the process of reading files into the database. The downstream server is repeatedly getting messages about the content set not being ready. I've been watching the DFSR backlog with the Get-DFSBacklog.ps1 script found here: http://gallery.technet.microsoft.com/scriptcenter/dac62790-219d-4325-a57b-e79c2aa6b58e. The current backlog shows up as:

    Sending Server Receiving Server Backlog Count

    ACCDATA3 LOCALHOST$ 1097161
    LOCALHOST ACCDATA3$ 1137821

    The ACCData3 server is the primary server in the initial sync, and the backlog for ACCData3 as the sending server has been growing steadily all week.

    When I create a diagnostic report using the DFS Management console, I get the error:

    Description: The DFS Replication service has been repeatedly prevented from getting folder information due to consistent sharing violations encountered on a folder. The service is unable to replicate the folder and files in that folder until the sharing violation is resolved. Event ID: 4312 
      Last occurred: Friday, October 05, 2012 at 10:01:50 AM (GMT-6:00) 

    There is also a warning for ACCData1 that says:

    Description: This member is waiting for initial replication for replicated folder Public and is not currently participating in replication. This delay can occur because the member is waiting for the DFS Replication service to retrieve replication settings from Active Directory Domain Services. After the member detects that it is part of replication group, the member will begin initial replication.  
      Last occurred: Friday, October 05, 2012 at 12:36:35 PM (GMT-6:00) 

    Looking in the Event Logs, I don't see any errors, except the 4312 at the time listed above. The InitialSync process has been running since Monday morning. I restarted the DFSR service this morning to see if that would help anything, and the Event 4312 appeared shortly after restarting the service.

    This is a production file server, and the folder that's not replicating is the most heavily used on the server. The other RGs each have a significantly smaller number of files in them, and their replication has been working with no problems since I restarted the service on Monday.

    Here is a some of the most recent debug logs:

    Debug Log from ACCDATA1 which is not the primary server for the initial replication.


    20121005 13:25:24.224 2208 DOWN  4186 [ERROR] DownstreamTransport::EstablishSession Failed on connId:{CE775A87-865E-4C79-ABAD-5EF7F5A909EC} csId:{85711A7F-06A5-46E2-9AC8-6594CA71B90E} rgName:accuval.org\shares\public Error:
    + [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C A failure was reported by the remote partner]
    + [Error:9051(0x235b) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C The content set is not ready]
    20121005 13:25:24.224 2208 INCO  7487 InConnection::RestartSession Retrying establish contentset session. connId:{CE775A87-865E-4C79-ABAD-5EF7F5A909EC} csId:{85711A7F-06A5-46E2-9AC8-6594CA71B90E} csName:Public
    20121005 13:25:24.224 2208 INCO  1042 [WARN] SessionTask::Step (Ignored) Failed, should have already been processed. Error:
    + [Error:9027(0x2343) InConnection::EstablishSession inconnection.cpp:6172 2208 C A failure was reported by the remote partner]
    + [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4200 2208 C A failure was reported by the remote partner]
    + [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C A failure was reported by the remote partner]
    + [Error:9051(0x235b) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C The content set is not ready]
    20121005 13:25:24.224 2208 ISYN    68 InitialSyncManager::ReturnToken InitialSync sync step not finished yet. Wake up other session tasks.
    20121005 13:25:33.397 3492 USNC  2450 UsnConsumer::UpdateIdRecord LDB Updating ID Record:
    + fid                             0x3100000009979B
    + usn                             0xbad177aa0
    + uidVisible                      0
    + filtered                        0
    + journalWrapped                  0
    + slowRecoverCheck                0
    + pendingTombstone                0
    + internalUpdate                  0
    + dirtyShutdownMismatch           0
    + meetInstallUpdate               0
    + meetReanimated                  0
    + recUpdateTime                   20121004 01:31:06.742 GMT
    + present                         1
    + nameConflict                    0
    + attributes                      0x20
    + ghostedHeader                   0
    + data                            0
    + gvsn                            {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121030
    + uid                             {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v33214915
    + parent                          {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v33214432
    + fence                           Default (3)
    + clockDecrementedInDirtyShutdown 0
    + clock                           20121005 18:25:33.381 GMT (0x1cda326ce98c91c)
    + createTime                      20120524 20:05:09.012 GMT
    + csId                            {85711A7F-06A5-46E2-9AC8-6594CA71B90E}
    + hash                            00000000-00000000-00000000-00000000
    + similarity                      00000000-00000000-00000000-00000000
    + name                            ACC-W7P-R9NYY59.accuval.org.000.log
    +
    20121005 13:25:33.397 3492 USNC  2453 UsnConsumer::UpdateIdRecord ID record updated from USN_RECORD:
    + USN_RECORD:
    + RecordLength:        136
    + MajorVersion:        2
    + MinorVersion:        0
    + FileRefNumber:       0x3100000009979B
    + ParentFileRefNumber: 0x100000012F86C
    + USN:                 0xbad177aa0
    + TimeStamp:           20121005 13:25:33.381 Central Standard Time
    + Reason:              Close Data Extend Data truncation 
    + SourceInfo:          0x0
    + SecurityId:          0x0
    + FileAttributes:      0x20
    + FileNameLength:      70
    + FileNameOffset:      60
    + FileName:            ACC-W7P-R9NYY59.accuval.org.000.log
    +
    20121005 13:25:37.484 3492 USNC  2703 UsnConsumer::CreateNewRecord LDB Inserting ID Record:
    + fid                             0x42000000194C36
    + usn                             0xbad177b70
    + uidVisible                      0
    + filtered                        0
    + journalWrapped                  0
    + slowRecoverCheck                0
    + pendingTombstone                0
    + internalUpdate                  0
    + dirtyShutdownMismatch           0
    + meetInstallUpdate               0
    + meetReanimated                  0
    + recUpdateTime                   16010101 00:00:00.000 GMT
    + present                         1
    + nameConflict                    0
    + attributes                      0x20
    + ghostedHeader                   0
    + data                            0
    + gvsn                            {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121031
    + uid                             {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121031
    + parent                          {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v33142030
    + fence                           Default (3)
    + clockDecrementedInDirtyShutdown 0
    + clock                           20121005 18:25:37.468 GMT (0x1cda326d1087056)
    + createTime                      20121005 18:25:37.468 GMT
    + csId                            {85711A7F-06A5-46E2-9AC8-6594CA71B90E}
    + hash                            00000000-00000000-00000000-00000000
    + similarity                      00000000-00000000-00000000-00000000
    + name                            test
    +
    20121005 13:25:37.484 3492 USNC  2706 UsnConsumer::CreateNewRecord ID record created from USN_RECORD:
    + USN_RECORD:
    + RecordLength:        72
    + MajorVersion:        2
    + MinorVersion:        0
    + FileRefNumber:       0x42000000194C36
    + ParentFileRefNumber: 0x20000001BEAFF
    + USN:                 0xbad177b70
    + TimeStamp:           20121005 13:25:37.468 Central Standard Time
    + Reason:              Close File Create 
    + SourceInfo:          0x0
    + SecurityId:          0x0
    + FileAttributes:      0x20
    + FileNameLength:      8
    + FileNameOffset:      60
    + FileName:            test
    +
    20121005 13:25:37.484 3492 USNC  2777 UsnConsumer::TombstoneOrDelete LDB Deleting ID Record. uid:{D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121031
    20121005 13:25:37.484 3492 USNC  2783 UsnConsumer::TombstoneOrDelete ID record deleted from USN_RECORD:
    + USN_RECORD:
    + RecordLength:        72
    + MajorVersion:        2
    + MinorVersion:        0
    + FileRefNumber:       0x42000000194C36
    + ParentFileRefNumber: 0x20000001BEAFF
    + USN:                 0xbad177bb8
    + TimeStamp:           20121005 13:25:37.468 Central Standard Time
    + Reason:              Close File Delete 
    + SourceInfo:          0x0
    + SecurityId:          0x0
    + FileAttributes:      0x20
    + FileNameLength:      8
    + FileNameOffset:      60
    + FileName:            test

    Debug Log snipit from ACCData3 (Configured as primary server for the initial sync)


    20121005 13:25:24.224 2208 DOWN  4186 [ERROR] DownstreamTransport::EstablishSession Failed on connId:{CE775A87-865E-4C79-ABAD-5EF7F5A909EC} csId:{85711A7F-06A5-46E2-9AC8-6594CA71B90E} rgName:accuval.org\shares\public Error:
    + [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C A failure was reported by the remote partner]
    + [Error:9051(0x235b) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C The content set is not ready]
    20121005 13:25:24.224 2208 INCO  7487 InConnection::RestartSession Retrying establish contentset session. connId:{CE775A87-865E-4C79-ABAD-5EF7F5A909EC} csId:{85711A7F-06A5-46E2-9AC8-6594CA71B90E} csName:Public
    20121005 13:25:24.224 2208 INCO  1042 [WARN] SessionTask::Step (Ignored) Failed, should have already been processed. Error:
    + [Error:9027(0x2343) InConnection::EstablishSession inconnection.cpp:6172 2208 C A failure was reported by the remote partner]
    + [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4200 2208 C A failure was reported by the remote partner]
    + [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C A failure was reported by the remote partner]
    + [Error:9051(0x235b) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 2208 C The content set is not ready]
    20121005 13:25:24.224 2208 ISYN    68 InitialSyncManager::ReturnToken InitialSync sync step not finished yet. Wake up other session tasks.
    20121005 13:25:33.397 3492 USNC  2450 UsnConsumer::UpdateIdRecord LDB Updating ID Record:
    + fid                             0x3100000009979B
    + usn                             0xbad177aa0
    + uidVisible                      0
    + filtered                        0
    + journalWrapped                  0
    + slowRecoverCheck                0
    + pendingTombstone                0
    + internalUpdate                  0
    + dirtyShutdownMismatch           0
    + meetInstallUpdate               0
    + meetReanimated                  0
    + recUpdateTime                   20121004 01:31:06.742 GMT
    + present                         1
    + nameConflict                    0
    + attributes                      0x20
    + ghostedHeader                   0
    + data                            0
    + gvsn                            {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121030
    + uid                             {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v33214915
    + parent                          {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v33214432
    + fence                           Default (3)
    + clockDecrementedInDirtyShutdown 0
    + clock                           20121005 18:25:33.381 GMT (0x1cda326ce98c91c)
    + createTime                      20120524 20:05:09.012 GMT
    + csId                            {85711A7F-06A5-46E2-9AC8-6594CA71B90E}
    + hash                            00000000-00000000-00000000-00000000
    + similarity                      00000000-00000000-00000000-00000000
    + name                            ACC-W7P-R9NYY59.accuval.org.000.log
    +
    20121005 13:25:33.397 3492 USNC  2453 UsnConsumer::UpdateIdRecord ID record updated from USN_RECORD:
    + USN_RECORD:
    + RecordLength:        136
    + MajorVersion:        2
    + MinorVersion:        0
    + FileRefNumber:       0x3100000009979B
    + ParentFileRefNumber: 0x100000012F86C
    + USN:                 0xbad177aa0
    + TimeStamp:           20121005 13:25:33.381 Central Standard Time
    + Reason:              Close Data Extend Data truncation 
    + SourceInfo:          0x0
    + SecurityId:          0x0
    + FileAttributes:      0x20
    + FileNameLength:      70
    + FileNameOffset:      60
    + FileName:            ACC-W7P-R9NYY59.accuval.org.000.log
    +
    20121005 13:25:37.484 3492 USNC  2703 UsnConsumer::CreateNewRecord LDB Inserting ID Record:
    + fid                             0x42000000194C36
    + usn                             0xbad177b70
    + uidVisible                      0
    + filtered                        0
    + journalWrapped                  0
    + slowRecoverCheck                0
    + pendingTombstone                0
    + internalUpdate                  0
    + dirtyShutdownMismatch           0
    + meetInstallUpdate               0
    + meetReanimated                  0
    + recUpdateTime                   16010101 00:00:00.000 GMT
    + present                         1
    + nameConflict                    0
    + attributes                      0x20
    + ghostedHeader                   0
    + data                            0
    + gvsn                            {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121031
    + uid                             {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121031
    + parent                          {D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v33142030
    + fence                           Default (3)
    + clockDecrementedInDirtyShutdown 0
    + clock                           20121005 18:25:37.468 GMT (0x1cda326d1087056)
    + createTime                      20121005 18:25:37.468 GMT
    + csId                            {85711A7F-06A5-46E2-9AC8-6594CA71B90E}
    + hash                            00000000-00000000-00000000-00000000
    + similarity                      00000000-00000000-00000000-00000000
    + name                            test
    +
    20121005 13:25:37.484 3492 USNC  2706 UsnConsumer::CreateNewRecord ID record created from USN_RECORD:
    + USN_RECORD:
    + RecordLength:        72
    + MajorVersion:        2
    + MinorVersion:        0
    + FileRefNumber:       0x42000000194C36
    + ParentFileRefNumber: 0x20000001BEAFF
    + USN:                 0xbad177b70
    + TimeStamp:           20121005 13:25:37.468 Central Standard Time
    + Reason:              Close File Create 
    + SourceInfo:          0x0
    + SecurityId:          0x0
    + FileAttributes:      0x20
    + FileNameLength:      8
    + FileNameOffset:      60
    + FileName:            test
    +
    20121005 13:25:37.484 3492 USNC  2777 UsnConsumer::TombstoneOrDelete LDB Deleting ID Record. uid:{D0507076-6F1E-4064-BB3A-FEC611DB04A9}-v34121031
    20121005 13:25:37.484 3492 USNC  2783 UsnConsumer::TombstoneOrDelete ID record deleted from USN_RECORD:
    + USN_RECORD:
    + RecordLength:        72
    + MajorVersion:        2
    + MinorVersion:        0
    + FileRefNumber:       0x42000000194C36
    + ParentFileRefNumber: 0x20000001BEAFF
    + USN:                 0xbad177bb8
    + TimeStamp:           20121005 13:25:37.468 Central Standard Time
    + Reason:              Close File Delete 
    + SourceInfo:          0x0
    + SecurityId:          0x0
    + FileAttributes:      0x20
    + FileNameLength:      8
    + FileNameOffset:      60
    + FileName:            test

    I'm new to reading these debug logs, and DFS replication, so I would appreciate some assistance on seeing if this is indeed configured properly, and verifying that the sync process is not hung somewhere. 

    If there is any other information I could provide to make this easier, please let me know. 

    Friday, October 5, 2012 6:35 PM

Answers

  • After a reboot of both servers on the weekend, the backlog suddenly started processing pretty quickly and the 1.1 million items in the backlog were processed in about 16 hours. Now my backlogs show as 0.
    • Marked as answer by Mitch Jurisch Monday, October 8, 2012 1:16 PM
    Monday, October 8, 2012 1:16 PM

All replies

  • I forgot to mention in my original post that all DFS hotfixes from this KB article have been installed on both servers.
    Friday, October 5, 2012 7:33 PM
  • After a reboot of both servers on the weekend, the backlog suddenly started processing pretty quickly and the 1.1 million items in the backlog were processed in about 16 hours. Now my backlogs show as 0.
    • Marked as answer by Mitch Jurisch Monday, October 8, 2012 1:16 PM
    Monday, October 8, 2012 1:16 PM