none
Start Snapshot Agents for all publication

    Frage

  • We have 103 publications. We have a DR solution where the Prod DR needs to replicate to DW DR. I copied the script to create the replication and related objects and was successful to set the replication. Now the snapshot agents need to be started. They do not start all in one go. I have to wait hours and manually start each one after a previous one is complete. 

    I tried to create a script using waitfor delay, but it too is cranking. 

    Is there a easy way to get all the agents to start and sync all at once. 

    Mittwoch, 27. Juni 2018 13:21

Alle Antworten

  • Do you have 103 databases? You should have as few publications as possible. I try to have 1 per database.

    Replication is not a good solution for DR. It does not copy all objects and you have no failover or predictable latency.

    The reason not all of your snapshot agents start is likely due to desktop heap exhaustion.

    https://support.microsoft.com/en-us/help/949296/some-sql-server-replication-agents-cannot-run-when-you-configure-many

    You also might be running into a problem with worker threads.

    Mittwoch, 27. Juni 2018 13:43
    Moderator
  •  They do not start all in one go. I have to wait hours and manually start each one after a previous one is complete. 

    Do you mean they do not start altogether or start but don't progress? If you start all of them, they should start but the progress would be very slow if you start all of them in parallel. Also, is the snapshot folder (where snapshot files are stored) on the local box as the subscriber?

    By the way, running snapshot should be a one-time effort. Once synced, you won't have to run it often (Of course, unless a reinitialization is needed).


    Please remember to click "Mark as Answer" if my response answered your question or click "Vote as helpful" if it helped you in any way.

    Mittwoch, 27. Juni 2018 16:16
  • Thanks for the reply. No there are 103 Publications for 1 DB. Its a legacy system and the upgrade plan is in process. Our DR solution is Log shipping.. ( works fine for all systems across the shop) . For one particular case - here is the scenario- 

    A few tables from prod are replicated to a DW server, where the reporting runs.  In case of Disaster, the Prod DR is bought online ( Log Shipping Secondary).  Now the DW DR needs the few replicated Tables, so what I do is Create the Replication using the script.. Works fine. 

    The snapshot agents now need to be started one by one. Here is the Script I tried to Use, but the waitfor delay is not working that great - 

    DECLARE @pub as varchar (255)
    DECLARE C CURSOR  for 
    select publication   from [distribution].[dbo].[MSsnapshot_agents] 
    	where
      publisher_db ='DatabaseName'
      
    	open C
    		FETCH NEXT FROM C INTO @pub
    		while (@@FETCH_STATUS = 0)
    			BEGIN 
    				use DatabaseName
    				exec sp_startpublication_snapshot @publication = @pub
    				
    				WAITFOR DELAY '00:00:15' 
    				FETCH NEXT from C into @pub
    			END 
    	CLOSE c 
    	DEALLOCATE C 
    	GO 
      
    Above is the best I have to save DBA time and monitoring, but it helps me only to get 50% agents started.
    Donnerstag, 28. Juni 2018 11:54
  • yes. i use above code to start them, the sql output says - snapshot agent xxxxxx started successfully. But if any other agent is in progress it errors out and then I have to manually start them.. 

    (This has become a pain - this is MS's worst bit even bad than IE :) )

    Donnerstag, 28. Juni 2018 11:56
  • I am not sure why there was a need to have 103 publications for a single DB when you could very well have a single publication with 103 subscriptions (or even more). Is it because each publication has different articles selected for replication? that could be one of the reasons.

    As for the issue in question, I can think of three potential workarounds:

    1. Increase the WAITFOR DELAY to, maybe 30 seconds or an even larger number and see which one fits.

    2. Instead of looping through the Snapshot agents, maybe change your code to loop through the SQL agent snapshot agent jobs and start those jobs instead. I know it does the same thing under the hood but won't hurt to try. 

    3. When a disaster strikes after prodDW and Log Shipping DR (secondary) are all in sync then you can actually start replication from your log shipping DR to your DW without having to reinitialize with a new snapshot. In other words, replication would continue assuming they are both (DR and DW) in sync already. But when there are pending records in prod that have yet to be log shipped to DR or DW replicated tables aren't synced, then this might not help in which case go with #4 below. Read this to know how.

    4. If #3 doesn't work, reinitialize your subscriptions using backups and see how much time it takes as compared to the one you are currently using (i.e. reinitialize with a new snapshot) 

    Of course, you want to try these in a lower environment first.


    Please remember to click "Mark as Answer" if my response answered your question or click "Vote as helpful" if it helped you in any way.

    Donnerstag, 28. Juni 2018 12:53
  • Abhishek - I am pretty sure you are running into a worker threads issue with SQL Server agent, or the desktop heap issue.

    Check the results of the following queries

    use msdb
    GO
    select max_worker_threads From syssubsystems where subsystem='snapshot'
    
     

    SELECT  count(*) FROM sys.dm_exec_sessions AS s  
    INNER JOIN sys.dm_exec_requests AS r  
    ON s.session_id = r.session_id  
    INNER JOIN sys.dm_os_tasks AS t  
    ON r.task_address = t.task_address  
    INNER JOIN sys.dm_os_workers AS w  
    ON t.worker_address = w.worker_address  
    WHERE s.is_user_process = 0; 

    do this when your snapshots are starting all at once.

    Also look in your SQL Server agent log to see if requests are being queued.

    Donnerstag, 28. Juni 2018 13:38
    Moderator
  • Yes, each Publication has different articles and as i said in a previous comment, this is already complex - the team that created it isnt here anymore :) . 

    Thanks for these options - I will try those. 

    Freitag, 29. Juni 2018 10:32
  • Note that you can create a single publication and each subscription can subscribe to a subset of those articles.
    Montag, 2. Juli 2018 10:39
    Moderator
  • What exactly is the error message you are getting when multiple agents are running?

    Also post the results of SELECT @@VERSION.

    Montag, 2. Juli 2018 11:41
  • As Hilary said, if you have tables in one database to a single destination database, you only need a single publication.  You are not gaining anything by having multiple publications and actually causing yourself a lot of pain managing 103 publications.

    Montag, 2. Juli 2018 11:44
  • Note that you can create a single publication and each subscription can subscribe to a subset of those articles.
     There is not much room to change the exiting scenario at the Prod with Replication.
    Montag, 16. Juli 2018 12:59
  • As Hillary said in his first response, you are almost certainly running into the Agent problem of heap space with too many jobs running at the same time.  Increasing the heap size will work around this issue.

     
    Montag, 16. Juli 2018 14:27