none
Change Data Capture from Replicated Database

    Question

  • This is a multipart question.  Sorry folks but I'm trying to get an answer the first time through so please bear with me.

    I have a SQL Server 2005 replicated database as one of the sources for my data warehouse.  Is there a way to simulate the capabilities of Change Data Capture (CDC) from this version of the database?  If so, how would I go about it?

    If we were to upgrade the version of SQL Server to 2008 or higher, can CDC be used to capture the replicated data as it is being applied.  If so, how would I go about it?

    Finally, if CDC cannot be used at all, what would you recommend (and please be nice)?

    Thanks

    Bigguy365

    Friday, November 01, 2013 12:07 AM

Answers

  • You should not interfere with the ongoing replication, otherwise you introduce contention which will slow the db down.

    With SQL 2012 I'd use an AlwaysOn read only secondary instance in readonly mode to capture the changes for me and have it feed my DW.

    Realtime DWs dont work in general.


    Arthur My Blog

    • Marked as answer by GCSA Saturday, November 02, 2013 1:22 PM
    Friday, November 01, 2013 3:13 PM
    Moderator

All replies

  • I will pepper you with questions:

    What kind of replication?

    Is this DB a subscriber?

    What SQL Server edition?

    What are the data volumes or speed of the data changes?

    The DCD controls were

    CDC is not part of SQL Server 2005.

    SQL Server (SSIS) 2012 is the best to use CDC.


    Arthur My Blog

    Friday, November 01, 2013 1:57 PM
    Moderator
  • What kind of replication?

    Transactional

    Is this DB a subscriber?

    Yes

    What SQL Server edition?

    Currently 2005 Enterprise

    What are the data volumes or speed of the data changes?

    Approximately 30K/day

    The DCD controls were

    ??????

    CDC is not part of SQL Server 2005. 

    Yes -- I understand that.  What other approach is recommended?

    SQL Server (SSIS) 2012 is the best to use CDC.

    Thank you -- did not know this.  So if we can upgrade, we should go to 2012.

    Friday, November 01, 2013 2:09 PM
  • If you can upgrade to SQL 2012 1st that would be ideal. Otherwise the correct approach to use is Slowly Changing Dimentions this how a true DW operates, and then if not change tracking: http://msdn.microsoft.com/en-us/library/cc761496%28v=sql.90%29.aspx exists in SQL Server 2005, also if not CT then implementing the CDC using triggers is a possibility, and finally using SSIS techniques as Get all from Table A that isn't in Table B

    may also be a solution.

    PS: 30 K records/day is quite a large data movement



    Arthur My Blog

    Friday, November 01, 2013 2:20 PM
    Moderator
  • Thank you for your reply.  Let me be clear --- the replicated database is our source for the data in our data warehouse.  We're shooting for a near real time refresh rate (roughly 1 minute between cycles) to get the data and process it through a light weight ETL into a stripped down version of our data warehouse.  Given this information, is there a way of capturing the replicated transactions as they are applied to the subscriber DB?

    I've come across postings mentioning adding a bit field in each record of the tables within the replicated DB that gets set when a insert/update activities take place. Then running queries selecting those records from the replicated DB for the ETL and resetting the field so it doesn't get selected again.  This approach would require having to query all tables in our DB ---  the number and sizes of which will not permit us from achieving the goal of a 1 minute refresh rate.  If we were to have tables of the same structure within the replicated DB, is it possible to load those tables with the records that have been inserted/changed immediately after they have been applied to the replicated table?  Your thoughts.    

    Friday, November 01, 2013 2:58 PM
  • You should not interfere with the ongoing replication, otherwise you introduce contention which will slow the db down.

    With SQL 2012 I'd use an AlwaysOn read only secondary instance in readonly mode to capture the changes for me and have it feed my DW.

    Realtime DWs dont work in general.


    Arthur My Blog

    • Marked as answer by GCSA Saturday, November 02, 2013 1:22 PM
    Friday, November 01, 2013 3:13 PM
    Moderator
  • Thanks again.  Your insight is much appreciated.  
    Friday, November 01, 2013 3:18 PM