I know this is a broad area to discuss but I would like to get some recommendations before I kick off the implementation in SQL 2012.
I've 3 tables each storing more than 5 million rows and growing further. Currently these tables (DW) get's loaded overnight and users run heavy queries through reporting tools on them during the day. We are planning to convert these fact tables to be real time with the transactional application.
What would be the best approach to convert these tables to be real time against the source application with maximum performance, minimize data latency etc..
- Is it possible to update these tables simultaneously allowing users to query with out locking issues?
- Should we go for SQL Server replication options?
- Is there any better workaround ?
Our experience was always get into performance issues while doing updates that result into dead lock often.
I really appreciate your help on this.
Thanks in advance.
As far as I know and understand, when you are doing UPDATEs in SQL Server, you have the NOLOCK hint for the UPDATE statement but it is not recommended because it could cause a SELECT statement to return inconsistent data. You can read more about it here: http://www.mssqltips.com/sqlservertip/2470/understanding-the-sql-server-nolock-hint/
For the real time replication side, you could take a look at the change data capture feature which can be used along with an SSIS package (which can be set to run at every 5 seconds, for example):
- http://msdn.microsoft.com/en-us/library/hh758674.aspx (Change Data Capture (SSIS))
- http://msdn.microsoft.com/en-us/library/hh758674.aspx (CDC Control Task)
For the performance tuning of CDC you can read more about here:
- http://technet.microsoft.com/en-us/library/dd266396(v=sql.100).aspx (Tuning the Performance of Change Data Capture in SQL Server 2008 - should apply for 2012 as well with some differences)
"Change data capture is a new feature in SQL Server 2008 that provides an easy way to capture changes to data in a set of database tables so these changes can be transferred to a second system like a data warehouse. This document provides guidance on how to configure change data capture parameters to maximize data capture performance while minimizing the performance impact on the production workload. The scope of this document is limited to the capture of change data and the cleanup process. Querying the changed data is out of scope for this white paper."
Be aware that CDC requires the source to be SQL Server, which you haven't specified if it is your source type or not.
Please let me know if you have any further questions and I hope that i managed to respond at least to a part of your questions :).
Per aspera ad astra!