Even though creating a columnstore index on a table makes the table non-updateable, meaning that bulk load and update DML operations on the table will fail, you can still add data to a table with a columnstore index with partition management operations.
This can make your ETL process run faster because the alternative is to drop the columnstore index, do your load or update, and then rebuild the index. Since older data is normally not updated, this can result in losing the work required to index older data,
and then having to redo it. The following example shows how to add data using partition switching.
This example shows how to add data to a partitioned table with a columnstore index
using partition switching operations, in a way that can be repeated every time new
data needs to be added, assuming data is added once a day.
-- Clean up objects left behind from prior runs.
is not null
drop table Fact;
is not null
drop table Staging;
drop partition scheme
drop partition function
-- Create a partition function with 3 partitions, and an associated partition scheme.
left for values
ps as partition
-- Create a partitioned table.
-- Add some data, leaving an empty partition on each end.
Fact values(20111013, 10);
-- Create a partition-aligned columnstore index on the table.
-- Show the data and the partition number of each row.
-- Split the uppermost partition (which is now empty, so this is a fast, metadata-only operation).
split range (20111014);
-- Create a staging table with the same schema as the partitioned table,
-- to hold data to be appended to it via a partition swtich.
null, measure int);
-- Create a constraint so that the staging table has the same bounds as the newly created partition.
with check add
> 20111013 and
date_key <= 20111014)
-- Add data to the staging table.
Staging values (20111014, 60);
-- Create a columnstore on the staging table with the same columns in it as the
-- columnstore index on the target partitioned table.
-- Switch the staging table into the proper empty partition of the target table.
-- Afterwards, an empty partition remains as the rightmost partition,
so it can be split later as a metadata-only operation.
switch to Fact
-- Retrieve the rows in the partitioned table, showing the partition of each row.
Existing techniques and guidelines for using SQL Server table partitioning are documented in other places, and in general they still apply with columnstore indexes. Below are links to other partitioning resources:
Thanks for posting these, Eric. I have a 1tb table (compressed) that's already been separated into live/archive so that I could do this when 2012 came out. We'll see how it does, and if it's as fast as you guys have said for our workload. Thanks also for all the other workarounds - I've written them down since I have no doubt we'll use them. FWIW, ColumnStore and AlwaysOn are responsible for our early adoption of 2012.
How would this work in an Always On setup and when you have per partition schema two data files. I assume having all the partitons on on data file doesn't make much sense in reality since columnstore index is for big amount of data.