none
Forum FAQ: How do I remove duplicate rows from a table in SQL Server?

    General discussion

  • Question:

    There are many duplicate rows in a table. How do I remove them?

    Answer:

     Please refer to the KB article below to locate and remove duplicate rows from a table:

     How to remove duplicate rows from a SQL Server table by using a script

    http://support.microsoft.com/kb/70956

    How to remove duplicate rows from a table in SQL Server

    http://support.microsoft.com/kb/139444

     

    For SQL Server 2005 and SQL Server 2008, we can also use the ROW_NUMBER function and the Common Table Expression (CTE) to remove duplicate rows.

     

    Sometimes there are some rows which are partly duplicated. For example, in the following sample, some rows are the same in Col1 and Col2 but different in Col3. We can keep the rows which have min or max values in Col3 and remove other rows.

     

    DECLARE @Table_1 TABLE (Col1 int,Col2 int,Col3 datetime)

     

    INSERT INTO @Table_1 VALUES (1,1,'2009-12-04')

    INSERT INTO @Table_1 VALUES (1,2,'2009-12-04')

    INSERT INTO @Table_1 VALUES (1,2,'2009-12-05')

    INSERT INTO @Table_1 VALUES (2,2,'2009-12-04')

    INSERT INTO @Table_1 VALUES (3,1,'2009-12-04')

    INSERT INTO @Table_1 VALUES (3,1,'2009-12-03')

     

    ;WITH CTE (Col1,Col2,Col3,RowNumber)

    AS

    (

    SELECT Col1,Col2,Col3,

    ROW_NUMBER() OVER (PARTITION BY Col1,Col2 ORDER BY Col3 DESC) AS 'RowNumber'

    FROM @Table_1

    )

     

    DELETE FROM CTE WHERE RowNumber>1

     

    If the rows are same in all columns, we can just specify either column as the sort column in the ORDER BY clause.

     

    ;WITH CTE (Col1,Col2,Col3,RowNumber)

    AS

    (

    SELECT Col1,Col2,Col3,

    ROW_NUMBER() OVER (PARTITION BY Col1,Col2,Col3 ORDER BY Col1) AS 'RowNumber'

    FROM @Table_1

    )

     

    Related documents:

     

    ROW_NUMBER (Transact-SQL)

    http://msdn.microsoft.com/en-us/library/ms186734.aspx

    Using Common Table Expressions

    http://msdn.microsoft.com/en-us/library/ms190766.aspx

    Monday, April 12, 2010 3:27 AM

All replies

  • Just one note to add...

    It is not necessary to include the column names in the CTE.  Just having the ROW_NUMBER() function is enough, as illustrated below.  However, that being said, some people like to have the column numbers there in order to test the CTE with a SELECT rather than a DELETE.

    ;WITH CTE AS

    (

    SELECT ROW_NUMBER() OVER (PARTITION BY Col1,Col2 ORDER BY Col3 DESC) AS 'RowNumber'

    FROM @Table_1

    )

    DELETE FROM CTE WHERE RowNumber>1

     

     


    --Brad (My Blog)
    Monday, April 12, 2010 4:47 PM
    Moderator