Article addition causes severe locking,High waiting task in SQL server 2016 Transactional Replication


  • Problem Description:

    Unusual High Waiting Tasks & Sudden dip in CPU usage with lots of locking while try to add an article Table(s) in an existing Transactional Replication publication which was either initiated from snapshot or initiated from Backup in SQL server 2016 Enterprise edition, it was further observed that articles don’t get replicated after successful generation of snapshot if the distribution agent is in running condition during snapshot generation.

    Environment details:

    1. 24 cores of physical machines.
    2. Issue experience on both version of SQL mentioned below
      1. Microsoft SQL Server 2016 (SP1) (KB3182545) - 13.0.4001.0 (X64)   Oct 28 2016 18:17:30   Copyright (c) Microsoft Corporation Enterprise Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600:)
      2. Microsoft SQL Server 2016 (SP1-CU7-GDR) (KB4057119) - 13.0.4466.4 (X64) Enterprise Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600)
    3. SQL Server detected 4 sockets with 6 cores per socket and 6 logical processors per socket, 24 total logical processors; using 20 logical processors based on SQL Server licensing


    exec sp_helppublication ‘Publication Name’

    immediate_sync bit         allow_anonymous bit

    0                                                                     0

    Database size:

     512 GB

    Replicated Articles size:

    Problem arises irrespective of size of replicated article

    ------- SpWho2  'Active' Result during snapshot generation

    Culprit ID : 6902

    6902      SUSPENDED     DBNAME (XXX)       UNKNOWN TOKEN       10142    1455       10/20 10:05:02  Publication Name(XXXXXX\XXXXXX-XXXX-PUB__XXX__REPORTING__ALL__TAB-8)

    ------- DBCC INPUTBUFFER (6902)


    -------- Individual Culprit Query

    --EXEC %%Object(MultiName = @qualified_name).LockMatchID(ID = @object_id, Exclusive = 1, BindInternal = 0) 

    -------- Parent Procedure

     --create procedure sys.sp_MSsetfilteredstatus @object_id int, @forceset bit = 0  as      declare @qualified_name nvarchar(517)        ,@retcode int        ,@setFlag bit    ,@pkkey sysname    ,@source_table sysname    ,@fcdc_tracked bit          exec @retcode = sys.sp_MSreplcheck_publish      if @@ERROR <> 0 or @retcode <> 0          return(1)        exec sys.sp_MSget_qualified_name @object_id, @qualified_name output        BEGIN TRANSACTION   declare @cdc_tracked_tables table (object_id int)      if object_id('cdc.change_tables') is not null   begin    insert @cdc_tracked_tables select distinct source_object_id from cdc.change_tables   end   if (@@rowcount = 0)    set @fcdc_tracked = 0   else    set @fcdc_tracked = 1        if not (@qualified_name is null)      begin          EXEC %%Object(MultiName = @qualified_name).LockMatchID(ID = @object_id, Exclusive = 1, BindInternal = 0)          if @@error <> 0              select @qualified_name = null      end        if @qualified_name is null      begin          raiserror(18750, -1, -1, 'sp_MSsetfilteredstatus', 'object_id');          goto error_abort_exit      end        if exists( select * from sys.objects where type = 'RF' and parent_object_id = @object_id )        or exists( select * from sysarticles A join syspublications P on A.pubid = P.pubid where A.objid = @object_id and (P.allow_queued_tran = 1 or P.allow_sync_tran = 1))        or exists( select * from sysarticles where objid = @object_id and (upper(upd_cmd) like 'CALL%' OR upper(upd_cmd) like 'XCALL%' ) )        or exists( select * from sysarticles where objid = @object_id and (upper(del_cmd) like 'XCALL%' ) )        or exists( select * from sysarticles sa, syssubscriptions ss where sa.objid = @object_id and sa.artid = ss.artid and ss.status = 3)        or @forceset = 1      begin          EXEC %%Relation(ID = @object_id).SetHasFilterProc(Value = 1)       set @setFlag = 1      end      else       begin    if (@fcdc_tracked = 0)          EXEC %%Relation(ID = @object_id).SetHasFilterProc(Value = 0)       set @setFlag = 0      end     --set/reset IsLoggedForRepl for all replicated column (except legacy blobs) depand on if has_replication_filter is set   --this controls logging behavior of offrow columns in deletes      --don't clear the bits if table is still being tracked by cdc   if(@setFlag = 1 or @fcdc_tracked = 0)   begin    declare #hpk cursor local fast_forward for     select name from sys.columns where object_id = @object_id and is_replicated = 1        and system_type_id not in (34, 35, 99) --image, text. ntext       and user_type_id not in  (34, 35, 99) --image, text. ntext    open #hpk    fetch #hpk into @pkkey    while (@@fetch_status = 0)    begin     EXEC %%ColumnEx(ObjectID = @object_id, Name = @pkkey).SetLogForRepl(Value = @setFlag)     IF @@ERROR <> 0     BEGIN      select @source_table = object_name(@object_id)      RAISERROR(15052, -1, -1, @source_table)      return 1     END          fetch #hpk into @pkkey    end    close #hpk    deallocate #hpk              --in case of turning of has_replicate_filter, make sure IsLoggedForRepl is still set for base of pk/computed pk columns    if(@setFlag = 0)    begin     exec @retcode = sp_MSSetLogForRepl @object_id        if @@ERROR <> 0 or @retcode <> 0            return(1)    end   end      COMMIT TRANSACTION      return (0)    error_abort_exit:      IF @@TRANCOUNT = 1          ROLLBACK TRAN      ELSE COMMIT TRAN      return (1)       

    Snaps during problem

    • Body text cannot contain images or links until we are able to verify your account.

    Monday, April 16, 2018 6:55 AM

All replies

  • Can you shut down the log reader agent when you run this command to see if this solves the problem?

    Monday, April 16, 2018 2:47 PM
  • Also what sync_type is your publication? We noticed a way more locking problem when using concurrent (default one). We changed it to database snapshot which solved locking (in some degree), however we need to stop distribution agents before running the snapshot, otherwise snapshot is not picked up by distribution agents.

    Monday, April 23, 2018 4:29 PM
  • Dear Hilary,

           Thanks for your reply , we have shutdown the Log reader agent while generating snapshot of newly added tables in already sync publication, but it does not solve the problem as scenario repeats as mentioned in original post.

    Monday, April 30, 2018 5:09 AM
  • Are you doing your sp_addarticle via script or via the GUI. You need to do it via script.

    Did you also do a sp_refreshsubscriptions after adding the article?

    Monday, April 30, 2018 3:08 PM
  • Dear Hilary,

            Thanks for your valuable feedback,Previously we were adding article using GUI and script (without using sp_refreshsubscriptions due to which snapshot was not picking the newly added article) both . Now we have also tried to add article on the said publication by using sp_addarticle and sp_refreshsubscriptions but same scnerio/problem repeats as we generate the snapshot (This time snapshot picked the newly added article but it doesn't fix the given issue yet)


    Friday, May 4, 2018 7:34 AM
  • I normally shut down the log reader agent when I add an article through sp_addarticle.

    Having the snapshot agent running concurrently will also be problematic.

    Do you still see a lot of contention when you run this with sp_addarticle?

    Can you gather the blocked process report while running this to see what is blocking? Ensure you set your blocked process threshold (s) to perhaps 10s?

    Friday, May 4, 2018 1:50 PM