none
apply cdc to existing data warehouse

    Question

  • Hi all,

    I'm building an education data warehouse. I have various dimensions that i would like to track changes to i.e. DimStudent, DimClass, DimCollege, DimExamPaper etc. etc. My solution currently drops keys, truncates tables and recreates keys before loading staging tables, which then go on to load Dimension and fact tables.

    Therefore, my question is what are the minimum physical changes i need to put in place to --> truncate staging tables prior to load, track changes between staging tables, dim tables and fact tables, to incrementally load rows to dim tables and fact tables? Do i need to add a table/tables to track changes for EACH dimension or fact table? I have looked at examples but there seems to be a fair amount of work for just one table. I have 11 dimensions, 4 fact tables and 22 staging tables.  Also, do i need a CDC state table for EACH table? Fast response would be greatly appreciated as this is holding me up.  I basically want to know what the best method would be to apply to a data warehouse to track changes with lowish volumes.  (few million rows in total DWH).


    • Edited by jhowe1 Thursday, June 05, 2014 8:50 AM
    Thursday, June 05, 2014 8:50 AM

Answers

  • Hi yea, i looked at that.  Doesn't really help for working with many tables.  After much investigation it looks as if you need to enable CDC on a table by table basis, creating a cdc state table for each CDC enabled table.  This approach may not be suitable for a large amount of tables.  I cannot find any examples for doing this for a large amount of tables.  If someone can find an example it would be much appreciated.
    • Marked as answer by jhowe1 Thursday, June 12, 2014 9:13 AM
    • Edited by jhowe1 Thursday, June 12, 2014 9:22 AM
    Thursday, June 12, 2014 9:13 AM

All replies

  • Hi all,

    I'm building an education data warehouse. I have various dimensions that i would like to track changes to i.e. DimStudent, DimClass, DimCollege, DimExamPaper etc. etc. My solution currently drops keys, truncates tables and recreates keys before loading staging tables, which then go on to load Dimension and fact tables.

    Therefore, my question is what are the minimum physical changes i need to put in place to --> truncate staging tables prior to load, track changes between staging tables, dim tables and fact tables, to incrementally load rows to dim tables and fact tables? Do i need to add a table/tables to track changes for EACH dimension or fact table? I have looked at examples but there seems to be a fair amount of work for just one table. I have 11 dimensions, 4 fact tables and 22 staging tables.  Also, do i need a CDC state table for EACH table? Fast response would be greatly appreciated as this is holding me up.  I basically want to know what the best method would be to apply to a data warehouse to track changes with lowish volumes.  (few million rows in total DWH).


    can someone help with this pls
    Thursday, June 05, 2014 2:17 PM
  • Hi jhowe1,

    You may wanna walk through the following excellent blog:
    http://www.mattmasson.com/2011/12/cdc-in-ssis-for-sql-server-2012-2/


    Regards, Leo

    Sunday, June 08, 2014 3:19 PM
  • Hi yea, i looked at that.  Doesn't really help for working with many tables.  After much investigation it looks as if you need to enable CDC on a table by table basis, creating a cdc state table for each CDC enabled table.  This approach may not be suitable for a large amount of tables.  I cannot find any examples for doing this for a large amount of tables.  If someone can find an example it would be much appreciated.
    • Marked as answer by jhowe1 Thursday, June 12, 2014 9:13 AM
    • Edited by jhowe1 Thursday, June 12, 2014 9:22 AM
    Thursday, June 12, 2014 9:13 AM