Distribution Agent randomly failing for SQL 2008 replication

    Question

  • I seem to have a problem with SQL 2008 replication. I have a publisher/distributor at Site A replicating to a subscriber at Site B. From time to time I see
    the distribution agents failing for some reason, even when they are not replicating data. The distribution agents that carry only schema fail as well. Because they carry only schema,
    they are not actively transferring data at all, yet they still fail. Restarting the agent job normally fixes the problem. The verbose log is
    below:

    2010-02-15 16:13:29.278 OLE DB Distributor 'PUBLISHER_DISTRIBUTOR': {call sp_MSget_subscription_guid(31)}
    2010-02-15 16:13:29.294 sp_MSget_repl_commands timestamp  value is: 0x0x000122d30000345a000100000000
    2010-02-15 16:13:48.247 Agent message code 121. TCP Provider: The semaphore timeout period has expired.
    2010-02-15 16:13:48.247 ErrorId = 1475, SourceTypeId = 0
    ErrorCode = '0'
    ErrorText = 'TCP Provider: The semaphore timeout period has expired.'
    2010-02-15 16:13:48.263 Adding alert to msdb..sysreplicationalerts: ErrorId = 1475,
    Transaction Seqno = 0000000000000000000000000000, Command ID = -1
    Message: Replication-Replication Distribution Subsystem: agent PUBLISHER_DISTRIBUTOR-SUBSCRIBER scheduled for retry. TCP Provider: The semaphore timeout period has expired.ErrorId = 1475, SourceTypeId = 0
    ErrorCode = '121'
    ErrorText = 'TCP Provider: The semaphore timeout period has expired.'
    2010-02-15 16:13:48.278 Category:NULL
    Source:  Microsoft SQL Server Native Client 10.0
    Number:  121
    Message: TCP Provider: The semaphore timeout period has expired.
    2010-02-15 16:13:48.278 ErrorId = 1475, SourceTypeId = 0
    ErrorCode = '121'
    ErrorText = 'Communication link failure'
    2010-02-15 16:13:48.294 Category:NULL
    Source:  Microsoft SQL Server Native Client 10.0
    Number:  121
    Message: Communication link failure
    
    
    Tuesday, February 16, 2010 12:44
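
To see how often the timeout recurs, the verbose log can be scanned for the error 121 entries. The following is a minimal sketch, assuming the log has been saved as plain text in the format shown above; the function name `find_timeouts` and the sample string are illustrative only and not part of any SQL Server tooling.

```python
import re
from datetime import datetime

# Matches lines that start with a timestamp such as "2010-02-15 16:13:48.247".
LINE_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (.*)$")

def find_timeouts(log_text):
    """Return the timestamps of 'Agent message code 121' entries in a verbose log."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and "Agent message code 121" in match.group(2):
            hits.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return hits

# Sample lines copied from the log in the question.
sample = """\
2010-02-15 16:13:29.294 sp_MSget_repl_commands timestamp  value is: 0x0x000122d30000345a000100000000
2010-02-15 16:13:48.247 Agent message code 121. TCP Provider: The semaphore timeout period has expired.
2010-02-15 16:13:48.247 ErrorId = 1475, SourceTypeId = 0
"""

timeouts = find_timeouts(sample)
for ts in timeouts:
    print(ts.isoformat(sep=" ", timespec="milliseconds"))
```

The resulting timestamps could then be compared against network monitoring data for the link between Site A and Site B.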

All replies

  • This is symptomatic of a congested/lossy network connection.
    Tuesday, February 16, 2010 16:20
  • Try changing the BcpBatchSize value to 10000; by default it is 100000. The network may not be able to handle that much.

    http://msdn.microsoft.com/en-us/library/ms190313.aspx
    Blog: http://dineshasanka.spaces.live.com
    Wednesday, February 17, 2010 8:46
  • Thanks for the reply. I have a question: what about distribution agents that don't transfer data and are meant only for schema? These agents should only be doing work when the snapshot is reinitialised, which is when objects such as stored procedures are transferred to the subscribers, yet they are failing as well. I would have thought decreasing the batch size would have no effect here. Please explain.

    Also, can anything be done to minimise the effect of lossy networks other than reducing the batch size?
    Wednesday, February 17, 2010 10:06
  • Is there a reason why the alerts would not fire if this error is occurring? Not getting replication alerts is making my customer nervous. For that matter, me too.
    Wednesday, April 21, 2010 23:49
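
For the BcpBatchSize suggestion earlier in the thread, the value is passed to the Distribution Agent as a `-BcpBatchSize` command-line parameter. The sketch below only shows how that parameter could be added to, or replaced in, an agent command string. The helper `set_agent_parameter` and the command string are hypothetical, not copied from a real job step, and the sketch does not modify any SQL Server Agent job.

```python
import re

def set_agent_parameter(command, name, value):
    """Return `command` with `-name value` set, replacing an existing value if present."""
    # Match "-Name <value>" only where the dash starts a new token.
    pattern = re.compile(rf"(?<!\S)-{re.escape(name)}\s+\S+", re.IGNORECASE)
    replacement = f"-{name} {value}"
    if pattern.search(command):
        return pattern.sub(replacement, command, count=1)
    return f"{command.rstrip()} {replacement}"

# Hypothetical Distribution Agent command line (server names are placeholders).
command = (
    "-Subscriber [SUBSCRIBER] -Publisher [PUBLISHER_DISTRIBUTOR] "
    "-Distributor [PUBLISHER_DISTRIBUTOR] -Continuous"
)

updated = set_agent_parameter(command, "BcpBatchSize", 10000)
print(updated)
```

Calling the helper again with a different value replaces the existing setting rather than appending a second one.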