How to perform an ALTER on a table with millions of rows

  • Question

  • Hello,

     I need to run an ALTER to change the data type of a column in a table that has more than 20,000,000 rows. If I execute a normal ALTER TABLE statement, the query runs for more than an hour and eventually fails when it runs out of disk space (it consumes more than 20 GB). The database is set to the simple recovery model (not full), which I thought would prevent the log from growing and filling the disk, but I guess SQL Server keeps a temporary log that is deleted when the operation completes successfully (only that it runs out of space before that).

    The only solution I have found so far is to create a copy of the table with the column set to the correct data type, insert the data from the first table into the second, drop the first table, rename the second table, and recreate the foreign keys, indexes, etc.

    Is there a less convoluted way to solve this problem?

    Thanks

    Wednesday, January 4, 2017 11:10 PM

Answers

  • Hi yukiarkh,

    As Vishe mentioned, adding a new column and dropping the previous one is a good choice. However, because the table has foreign keys and indexes, we need to take them into consideration. Please use the following steps to change the column type (a T-SQL sketch follows the steps):

    1. Drop any indexes related to the previous column, and disable triggers.
    2. Add the new column with the new column type.
    3. Update the new column with the previous column.
    4. Drop the previous column and rename the new column.
    5. Rebuild indexes and update statistics.
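
    A minimal T-SQL sketch of those steps is below. The table name dbo.BigTable, the column names OldCol/NewCol, the index and trigger names, and the target type bigint are placeholders for illustration only; adjust them (and consider batching the UPDATE, as suggested later in this thread) for your real schema.

        -- 1. Drop indexes on the old column and disable triggers (names are hypothetical)
        DROP INDEX IX_BigTable_OldCol ON dbo.BigTable;
        DISABLE TRIGGER ALL ON dbo.BigTable;

        -- 2. Add the new column with the target type
        ALTER TABLE dbo.BigTable ADD NewCol bigint NULL;
        GO

        -- 3. Copy the data from the old column
        UPDATE dbo.BigTable SET NewCol = CAST(OldCol AS bigint);

        -- 4. Drop the old column and rename the new one
        ALTER TABLE dbo.BigTable DROP COLUMN OldCol;
        EXEC sp_rename 'dbo.BigTable.NewCol', 'OldCol', 'COLUMN';

        -- 5. Re-enable triggers, rebuild indexes, and update statistics
        ENABLE TRIGGER ALL ON dbo.BigTable;
        CREATE INDEX IX_BigTable_OldCol ON dbo.BigTable (OldCol);
        UPDATE STATISTICS dbo.BigTable;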

    Regarding your doubt about the simple recovery model preventing log growth: in this scenario SQL Server does not use a temporary log; the disk usage comes from the transaction log itself, and the simple recovery model does not prevent the transaction log from growing.

    In your scenario, the table has indexes and you are changing a column's data type, so the operation runs as one very long transaction. Even in simple mode, log space used by an active transaction cannot be reused until that transaction completes, which is why the disk usage keeps increasing.
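
    If the log keeps growing while such an operation runs, you can check how much of it is in use and why it cannot be truncated with these standard commands (the database name is a placeholder):

        DBCC SQLPERF (LOGSPACE);

        SELECT name, log_reuse_wait_desc
        FROM sys.databases
        WHERE name = N'YourDatabase';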

    If you have any other questions, please let me know.

    Best Regards,
    Teige


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.


    Thursday, January 5, 2017 2:11 AM
    Moderator

All replies

  • Hi yukiarkh,

    Create a new column with the correct data type, update the new column with data from the old column, then drop the old column.

    http://stackoverflow.com/questions/4311559/sql-server-performance-for-alter-table-alter-column-change-data-type


    Please mark it as Answered if it answered your question, or mark it as Helpful if it helped you solve your problem.

    Thursday, January 5, 2017 12:27 AM
  • Some DDL changes require every row to be updated.  For large tables, I suggest you create a new table using SELECT...INTO with an explicit CAST of the column(s) to be changed to the new type.  This will be minimally logged in the SIMPLE or BULK_LOGGED recovery model.  You can then drop the old table, rename, and then recreate constraints and indexes.
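
    A minimal sketch of that approach, assuming hypothetical names (dbo.BigTable, a column OldCol being widened to bigint, and a couple of other columns):

        -- create the new table with the converted column; minimally logged in SIMPLE/BULK_LOGGED
        SELECT CAST(OldCol AS bigint) AS OldCol,
               OtherCol1,
               OtherCol2
        INTO   dbo.BigTable_New
        FROM   dbo.BigTable;

        -- then drop the old table, rename the new one, and recreate constraints and indexes
        DROP TABLE dbo.BigTable;
        EXEC sp_rename 'dbo.BigTable_New', 'BigTable';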


    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Thursday, January 5, 2017 2:15 AM
    Moderator
  • I would bcp the data out using the queryout option.

    Your bcp statement would look like this:

    bcp "select *, convert(int, 1) NewColumnName from databaseName.dbo.TableName" queryout DataFilename.dat -S. -n -T

    Then I would create the new table with a new name in the same database and push the data back in:

    bcp "DatabaseName.dbo.NewtableName" in dataFileName.dat -S. -n -T -E -b 10000

    Ensure the PK is on the new table. When the bcp operation is complete, apply the indexes and then rename the tables.

    There will be an outage when you are doing this.

    You can also use replication to replicate the data to the new table in the same database with no outage, or a brief outage when you rename the tables.

    Thursday, January 5, 2017 3:01 AM
  • If not the SELECT ... INTO, then write some code to move "just" 100k or so rows at a time into the new table; it won't run faster, but in simple mode it won't run out of log space either.
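
    A rough sketch of that batching idea, with hypothetical names (dbo.BigTable has an ascending integer key Id, and dbo.BigTable_New already exists with the new column type); each batch commits on its own, so in simple mode the log space can be reused between batches:

        DECLARE @BatchSize int = 100000,
                @LastId    int = 0,
                @Rows      int = 1;

        WHILE @Rows > 0
        BEGIN
            INSERT INTO dbo.BigTable_New (Id, OldCol, OtherCol)
            SELECT TOP (@BatchSize) Id, CAST(OldCol AS bigint), OtherCol
            FROM dbo.BigTable
            WHERE Id > @LastId
            ORDER BY Id;

            SET @Rows = @@ROWCOUNT;

            IF @Rows > 0
                SELECT @LastId = MAX(Id) FROM dbo.BigTable_New;

            CHECKPOINT;  -- in simple recovery this lets inactive log space be reused
        END;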

    Josh

    Thursday, January 5, 2017 5:19 AM