none
Create special backup database

    Question

  • I am trying to create a duplicate copy of seven tables in a database.  The destination database will reside on a different server in another city of the state and the database will have a different name and is running SQL Server 2008 R2 Enterprise Edition.  One of the purposes of the destination database is to keep historical data so the source database can have old data deleted to free up space (The source database is a SQL Server 2008 R2 Express edition).  The source database has an application that inserts statistical data all day long (I believe 24 hours - 3 shifts).  They want current data but not necessarily real time in the destination, maybe backuped or refreshed every hour or less.  I believe an hours worth of data that is added to each source table is between 1000 and 2000 records.  There may be some updates (changes to existing data).  Currently, each source table has around 2,000,000 records.

    What is the best way to perform this operation?  Is SSIS the best way?  I read some articles suggesting Transactional Replication, but I do not want any data removed because it was deleted in the source tables.

    Thank you for your help,

    Fred


    Fred Schmid


    • Edited by CRdesigner Thursday, October 17, 2013 5:07 PM
    Monday, October 14, 2013 8:13 PM

Answers

All replies

  • You are describing the creation of a data warehouse.  There are probably infinite ways of doing this.  I have experience with a solution where I used Transactional Replication to replicate data to staging tables on my DW instance.  I then ran a periodic MERGE to my DW tables (from the staging tables) using the WHEN NOT MATCHED BY TARGET THEN INSERT and the WHEN MATCHED THEN UPDATE clauses.  Note that the solution does not delete when not matched by source, thus preserving historical data.  I also computed a persisted CHECKSUM to help identify changes. There is more to it than this simple overview, but it has worked well so far on tables consisting of over 7,000,000 rows, although I don't truly know how scalable it is.  Replication, of course, allows selective publication of only the data that needs to be retained, which might not be everything, and could therefore help keep things manageable.

    Jason



    • Edited by Jason Brugger Monday, October 14, 2013 11:55 PM updated an unclear sentence
    • Marked as answer by CRdesigner Thursday, October 17, 2013 2:10 PM
    • Unmarked as answer by CRdesigner Thursday, October 17, 2013 6:16 PM
    Monday, October 14, 2013 11:36 PM
  • Hi,

    For your scenario maybe it would be best to implement transactional replication from source database to staging database on destination server. This is a lightweight process that would not interfere much with existing processes on source database. When defining the articles for the publication just say that @del_cmd is none and deletes from source system will not be propagated to destination staging database.

    On destination server you need to build data warehouse that would take data from staging database, transform it if needed and load it to the warehouse.

    On source server you need to schedule a process that will delete historic data.


    HTH, Regards, Dean Savović, www.comminus.hr/en/

    • Proposed as answer by Jason Brugger Tuesday, October 15, 2013 3:52 PM
    • Marked as answer by CRdesigner Thursday, October 17, 2013 2:10 PM
    • Unmarked as answer by CRdesigner Thursday, October 17, 2013 6:16 PM
    • Unproposed as answer by CRdesigner Tuesday, October 22, 2013 3:15 PM
    Tuesday, October 15, 2013 8:52 AM
  • Jason,

    Thank you for your response.  This is probably due to my ignorance on the subject, but why do we need staging tables?  Is there a difference between the data that would be in the staging tables and the data in the DW tables?  We want an exact copy of the source data in the destination tables.

    Also, the source database is "owned" by another division of the company and I think it is a third party application database.  Does the "Publication" reside in that database or can it reside on our Distribution database with pointers to the third party database?

    Thanks,

    Fred


    Fred Schmid

    Tuesday, October 15, 2013 3:40 PM
  • In my model, the staging table would be an exact copy of production whereas the DW tables would contain all current, plus historical data.  However, Dean's suggestion of leveraging the @del_cmd parameter is compelling in its simplicity and doesn't require the additional engineering of my solution (unless you also need to transform data, that is).

    Jason

    Tuesday, October 15, 2013 3:52 PM
  • I think Dean has nailed it with the simplest solution.  Just wanted to add that, if one is more comfortable setting up replication via the GUI, this option can also be configured in the article properties.


    Jason

    Tuesday, October 15, 2013 6:16 PM
  • Jason,

    I am getting an error "This edition of SQL Server cannot act as a Publisher or Distributor for replication."  I am using SQL Server 2008 r2 Express edition for the "source" database.  But now I am learning that it can only be used as a Subscriber. 

    What do I do now? 

    Can I use the Subscriber as the source?  I would not want anything "ever" written to this source database.

    Also, right now we do not want to transform the data. We want an exact copy of the source data in the destination database.

    Thanks,

    Fred


    Fred Schmid

    Wednesday, October 16, 2013 6:34 PM
  • I have moved this discussion to the Replication forum:

    Trying to figure out merge replication using SQL Server 2008r2 Express

    Use this link.


    Fred Schmid

    • Marked as answer by CRdesigner Thursday, November 14, 2013 8:22 PM
    Thursday, November 14, 2013 8:21 PM