Many customers have a need to efficiently track what has changed in their SQL Azure database from one point in time to another. They might need to use this method to efficiently download the latest set of products to their mobile user’s smartphones, or they may want to import data on-premises to do reporting and analysis on the current day’s data.
The biggest problem is that unlike SQL Server 2008’s which has Integrated Change Tracking and Change Data Capture, SQL Azure does not currently provide a method of change tracking data. This is unfortunate, because the alternative requires the addition of complex triggers to track the changes. It also requires the addition of tracking columns to track rows that were changed. Luckily, the method I am going to explain can greatly simplify the process and also reduce the potential that the schema changes would impact your existing applications. However, I do suggest that you try this method on a test database before applying it to a production system, just in case. J I also will point out that this mechanism will work just as well with SQL Server, but you will likely prefer to use something like Integrated Change Tracking.
To get started, you will need:
Assume that you have some knowledge of SQL Azure and how to create a database. If that is not the case, please take a look here for more details on how to get started.
The only caveat to this is that the tables you want to add change tracking need to have Primary Keys.
The magic of the change tracking is a Template within Visual Studio called “Local Data Cache”. The purpose of this template is really to jump start the creation of an offline application, but the real benefit is how it helps to generate change tracking in SQL Server or SQL Azure.
To get started:
It actually doesn’t really matter what type Project you choose as we will just be using the Template and need a blank project to launch them.
This will bring up the Installed Templates.
If you happen to get an error, it is likely due to the fact that you did not install “Microsoft Sync Framework 1.0 SDK” with Visual Studio. If this is the case you can find the install for this in the Microsoft Download Center.
At this point a configuration tool will launch and walk you through the steps for enabling change tracking.
At this point, I am assuming you have created a SQL Azure database. If that is not the case, please go ahead and create the database.
First we need to create a connection to SQL Azure. To do so:
This will bring you back to the original configuration page, but will populate your server connection in the pull down. It will also configure a default Client connection but you can ignore this.
The next step will be to choose the tables you want to add change tracking to. To do this:
You should see a list of tables from your database (as follows).
For this demo, I am just going to choose two tables for change tracking.
If you highlight one of the selected tables, you will see some new options enabled to the right.
This is useful because it allows you to choose what you want to track. For example, if you did not have a need to track deletes. You could deselect this option. If you were to choose the “Edit” button you can also see how to change the default name used to create the separate tracking table. For now, we will just leave all the defaults:
This brings us back to the original configuration page and now lists the two tables we are going to track. Once again:
On the resulting page you will see an option to either apply the changes immediately to your database or create scripts (the default is to do both).
If you get this page, just choose “Yes”.
We will skip the Database Model creation:
At this point you are done with the configuration. If you view your database in SQL Server Management Studio it should look like this:
Notice the two new “_Tombstone” tables that were created.
If you expand either the Customers or Orders table, you should also notice the addition of “LastEditData” and “CreationDate” which tells when this row was last edited and when it was originally created (Inserted).
Finally let’s do a few queries to see how all of this work.
Notice how the LastEditDate and CreationDate are both set to the current datetime for this inserted row.
Notice this time that only the LastEditDate column is updated to reflect that the time this row was last changed.
Finally let’s execute a delete:
Notice this time, we did a select from the tombstone table. Why did we have to do this? Well if the row was deleted from the original table, how would we be able to tell it was deleted? If we move the Primary Key of the row to a separate tracking table and include the time it was deleted we can very efficiently tell what has changed.
Hopefully at this point you have it pretty much figured out. Using any of the above queries you can tell when a row was last changed. The only thing you need to do is store the timestamp the client last retrieved the data and use it in your queries.
For example:
After that you simply repeat the steps to get the incremental changes. You can hopefully imagine how you could embed these types of queries into a REST, OData or ADO.NET application.
John has been building cloud based services on Windows Azure since its inception. His specialty is in working with databases such as SQL Azure. John is working at Cotega a SQL Performance Monitoring service that provides database notification and alert services for cloud hosted SQL Server and SQL Azure database.