none
Shutdown Time-Out Interval for Transactional Replication RRS feed

  • Question

  • Hello.

    I have noticed that my Transactional Replication started breaking down for the reasons I still cannot explain other than there is some kind of (auto) restart of my DISTRIBUTION Server (it's a dedicated DIST Server, so not the same as the Publishing Server)...but could also be that the Infrastructure guys are doing something to DNS etc. that I am not aware of.

    I started to explore this failure, and came across the fact (apart from the fact that the SQL Server Agent on my DIST Server "mysteriously" gets stopped once in a while) that SQL Server Agent > Properties > Job System > Shutdown time-out interval parameter is set to 15 seconds.

    Since I have no experience with what this Parameter is in charge of, please advise whether, in the case of Transactional Replication, this should be turned off (if possible at all) or otherwise put to as high a value as possible (due to the fact that LogReader and Distribution Agents should "never" stop executing)?

    (please do not paste the official declaration of what this Parameter is in charge of, since I have already read this - I am interested in advice/help based on experience)

    With thanks and kind regards,

    Bogdan


    BogdanU

    Thursday, October 10, 2019 1:04 PM

Answers

  • Hello, Tom.

    Thank you for the fast and helpful answer (especially for the tip about having the Replication on a schedule instead of continuous - I have not given this a thought at all).

    The issue I was having with Replication seemed to have come from the change in Linked Server parameters and the changes that the infrastructure colleagues have been implementing. 

    I would like to use this opportunity to ask you another thing, please:

    Now that I have a SUB DB which is grossly out of sync with reality, I would like to:

    1. Remove Subscription
    2. Remove SUB DB
    3. (Re)Create Subscription with the option of adding the new DB

    By this I hope to avoid going through the painfully slow Snapshot generation and use the latest one.

    Do you think this is an o.k. way to go or is there a more efficient way to this end (e.g. just remove the SUB DB and re-create it from the recent snapshot - but this is something I would not know how to do)?

    With thanks and kind regards,

    Bogdan


    BogdanU



    • Edited by BogdanU Monday, October 14, 2019 12:23 PM
    • Marked as answer by BogdanU Monday, October 14, 2019 12:28 PM
    Monday, October 14, 2019 10:20 AM
  • Replication does not use Linked Servers, so that did not affect your subscription.

    If you need to reinit your subscription, you only have 2 choices, snapshot or restore from backup.  You must do one or the other.

    See:

    https://docs.microsoft.com/en-us/sql/relational-databases/replication/initialize-a-transactional-subscription-from-a-backup?view=sql-server-ver15

    • Marked as answer by BogdanU Monday, October 14, 2019 12:29 PM
    Monday, October 14, 2019 11:51 AM

All replies

  • The "Shutdown time-out" is the amount of time AFTER SQL Agent is asked to terminate, SQL Agent waits for running jobs to finish before hard terminating them and terminating SQL Agent.  You should never need to change the default value.

    Transactional replication is extremely resilient and will recover from almost any kind of problem by itself.  

    What exactly is the problem you are experiencing?

    Also, I highly recommend against running Transactional replication in "continuous" mode and use a schedule instead.  I recommend setting it to run every 5 minutes.  Yes, that means you might have 4 min 59 secs of delay between transactions.  But the benefits are you will get a log every 5 minutes for debugging and if the job fails for any reason, it will restart in itself again, and it does not get hard killed by shutting down SQL Agent.


    Thursday, October 10, 2019 2:07 PM
  • Hello, Tom.

    Thank you for the fast and helpful answer (especially for the tip about having the Replication on a schedule instead of continuous - I have not given this a thought at all).

    The issue I was having with Replication seemed to have come from the change in Linked Server parameters and the changes that the infrastructure colleagues have been implementing. 

    I would like to use this opportunity to ask you another thing, please:

    Now that I have a SUB DB which is grossly out of sync with reality, I would like to:

    1. Remove Subscription
    2. Remove SUB DB
    3. (Re)Create Subscription with the option of adding the new DB

    By this I hope to avoid going through the painfully slow Snapshot generation and use the latest one.

    Do you think this is an o.k. way to go or is there a more efficient way to this end (e.g. just remove the SUB DB and re-create it from the recent snapshot - but this is something I would not know how to do)?

    With thanks and kind regards,

    Bogdan


    BogdanU



    • Edited by BogdanU Monday, October 14, 2019 12:23 PM
    • Marked as answer by BogdanU Monday, October 14, 2019 12:28 PM
    Monday, October 14, 2019 10:20 AM
  • Replication does not use Linked Servers, so that did not affect your subscription.

    If you need to reinit your subscription, you only have 2 choices, snapshot or restore from backup.  You must do one or the other.

    See:

    https://docs.microsoft.com/en-us/sql/relational-databases/replication/initialize-a-transactional-subscription-from-a-backup?view=sql-server-ver15

    • Marked as answer by BogdanU Monday, October 14, 2019 12:29 PM
    Monday, October 14, 2019 11:51 AM
  • Hi, Tom.

    Thanks once more.

    Oh - alright, I learned one more thing then (as I was under impression that it was using LS).

    O.K. - that's perfectly fine, and all I wanted to know (what my options were).

    Many thanks and kind regards,

    Bogdan


    BogdanU

    Monday, October 14, 2019 12:28 PM
  • This looks like a problem with your SQL server agent and not with the replication subsystem. Transactional and snapshot replication use the linked server repl_distributor to communicate with with the publisher and distributor (even if they are on the same machine). But I do not believe this is your problem.

    Scheduling the agent as Tom suggests, is the best workaround here.

    Monday, October 14, 2019 4:26 PM
    Moderator
  • Hi, Hilary.

    Thank you for the valuable input (as always).

    Aha - so my impression was correct the first time. :)

    Alright, then please let me ask if you have any idea how the following is possible:

    1. I go to PUB > LS > repl_distributor (and I see Catalogs alright) > Test Connection:
      The result is the error "Server repl_distributor" is not configured for DATA ACCESS"
    2. I go to the LS's properties and see that it indeed IS enabled for Data Access
    3. I further go to the DIST Server (we have a dedicated, separate DIST Server) and, sure enough, "Allow Remote Connections..." is checked
    4. In the end, the Replication(s) seem to work (at least when I enter some testing tuples in a PUB DB)

    The above facts, to my understanding, should not exist/hold true at the same time, yet there they are...so please, if you could help me understand - how is it possible that the Test Connection does not work, yet the Replications seem to work fine?

    If of any help, the current settings of my repl_distributor LS are:

    • Collation Compatible: False
    • Data Access: True
    • RPC, RPC Out, User Remote Collation: True
    • Distributor: True
    • Publisher, Subscriber, Lazy Schema Validation: False
    • Enable Promotion of Distributed Transactions for RPC: True

    With thanks and kind regards,

    Bogdan


    BogdanU

    Tuesday, October 15, 2019 1:20 PM