The above topics may give an idea of the correct strategy for deletion. Very often the following recommendation is given in forums:
The above strategy may work on a single server but cannot be used in a High Availability scenario because it requires the FULL recovery mode! This article compares two different solutions and describes the actions "behind the scene".
The first script will create a demo database and fill a demo table with 100.000 records. This table will be used for the investigation of all next scenarios:
FILENAME = N
MAXSIZE = 1000MB,
FILEGROWTH = 100MB
MAXSIZE = 1GB,
-- change the recovery model to SIMPLE
-- Now create a table with a clustered index on an Id and add 100.000 records
IDENTITY (1, 1),
'only stupid stuff'
'more stupid stuff'
The above recommendation can be read quite often based on questions for fast deletion of data. This solution is not feasible because the DML-Operation DELETE is a fully logged operation which does not support BULK-operations. The following script runs a
DELETE operation against the above table and deletes all even numbered records. This operation will be run in all three recovery modes (SIMPLE, FULL, BULK_LOGGED)
-- Change the recovery model for each run!
-- delete half of the records
Id % 2 = 0;
-- rebuild the index because it's fragmented
database_id = db_id('DeleteRecord');
The DELETE operation is included in a dedicated transaction to measure the amount of log space which will be written within the transaction. After half of the records have been deleted the clustered index need to be rebuild as part of the complete DELETE-process
because of a low density in the data pages (only 50% filled) after DELETE has finished. The ROLLBACK TRANSACTION has been used to avoid the rebuild of the test scenario for each different test.
Concerning the above operational process the generated amount of transaction log is as follows:
The result shows that BULK_LOGGED as a recovery model will not change significantly the amount of produced transaction log. And the result is not really surprising because DELETE is no DML-Operation which supports BULK-operations. If the recovery model is
set to FULL then all operations are full logged! So the above strategy doesn't seem to be the best choice concerning the produces amount of transaction log space. Some people recommend to delete data in small portions and backup the log - but this scenario
isn't considered for this article.
The usage of TRUNCATE instead of DELETE seems to be inviting, because TRUNCATE does not log the deletion of each record. TRUNCATE is a DDL (Data Definition Language) operation and not a DML (Data Manipulation Language) operation and only the system tables
are part of the transaction. TRUNCATE removes the complete metadata structure of the underlying table and ALL records. So a step in between has to be done: All records which should stay need to be stored in a staging table. The next code demonstrates the process
-- Drop the staging table
-- Bulk logged
Id % 2 = 1;
-- minimal logged because DDL-Operation
dbo.bigtable (Id, c1, c2, c3)
Id, c1, c2, c3
database_id = db_id(
Why does the amount of transaction log action is so high for the INSERT ... INTO ... SELECT *? The INSERT .. INTO ..-Command can only benefit from a BULK LOGGED operation if it is exclusively locked. An exclusive lock can be forced with the TABLOCK hint
as the following scenario shows.
-- Bulk logged because target table is exclusivly locked!
(TABLOCK) (Id, c1, c2, c3)
The difference in the amount of generated transaction log is tremendous:
The recovery model FULL will not cover any BULK logged operation and will always force fully logged operations. Only the recovery model SIMPLE and BULK_LOGGED give the possibility to work with minimal logged operations.
Due to the growing amount of data the requirement for fast workloads are growing simultaneously. Microsoft SQL Server will meet these requirements with bulk logged operational functionality. Basically a difference between DML-Operations which support bulk
logged operations and DML-operations which does not has to be considered.
DELETE is not a DML-operation which support bulk logging and if a table does not have constraints which prevents a TRUNCATE than the following workload seems to be the best by consideration of bulk logged operations:
SELECT...INTO and INSERT INTO both support bulk logged operations and TRUNCATE is a tremendous way to get rid of data because it is only schema changing DDL!
INSERT INTO ...:
The article may benefit from the See Also section. I also don't understand a usage of 'relation' in that article. What do you mean by 'relation'
Congratulations for winning a TechNet Guru gold medal: blogs.technet.com/.../technet-guru-awards-october-2013.aspx
One more comment - please don't include line numbers with the code samples - they may it more difficult to copy code
This article was featured on the home page of TNWiki! social.technet.microsoft.com/wiki
promised - next article without numbered lines :)
Why do you need ORDER BY id in the INSERT INTO command?
I've removed line numbers from code samples and as you can see code formatting somehow changed. It would be great it you'll be able to fix this.
I've tried to bring the article (codes) in format.
Next time I would appreciate to do it by yourself or leaf it as it is.
As you may see I've promised to post next articles without lines ;)
Great Info...Thanks for sharing