none
strange behavior with in-line string concatenation

    Question

  • Hello!

    We recently upgraded to SQL 2012 (64 bit).

    We are noticing a difference in behavior as it relates to string concatenation.

    Part 1 of the following script creates a test table populates with the data.

    Part 2 of the script populates a varchar variable with some values concatenated from the table, and works as expected.

    Part 3 of the script populates a varchar variable with some values concatenated from the table, and does not work as expected.  The only different between part 2 nd part 3 is the order by clause.

    Part 4 drops the index on the table.

    Part 5 repeats Part 3, but it returns the proper results now that the index is gone.

    Do you think this is a bug in the Database engine?

    Thanks in advance.

    Part 1:


    create table dbo.test_table (id int, PaidToDate_ID int,
                                 PaidToDate_Code varchar(3),
                                 PaidToDate_Amount money)

    create index ix_test_table
       on dbo.test_table (id)

    insert into test_table
      select 1, 1, 100, 1000.00

    insert into test_table
      select 1, 2, 100, 1500.00

    insert into test_table
      select 1, 3, 100, 300.00

    insert into test_table
      select 2, 1, 200, 2000.00

    insert into test_table
      select 2, 2, 200, 3400.00

    insert into test_table
      select 2, 3, 200, 200.00

    insert into test_table
      select 2, 1, 300, 1100.00

    insert into test_table
      select 2, 2, 300, 500.00

    insert into test_table
      select 2, 1, 300, 333.00

    Part 2:


    declare @id integer
    declare @paidtodatestring varchar(max)
    set @id = 2
    select @paidtodatestring = ''
    select @paidtodatestring = @paidtodatestring +
    e.paidtodate_code + replicate(' ',(3-len(e.paidtodate_code)))+  
    replicate('0',(11-len(REPLACE(CAST(e.paidtodate_amount AS VARCHAR(11)),'.','')))) + REPLACE(CAST(e.paidtodate_amount AS VARCHAR(11)),'.','')
    from dbo.test_table e
    where e.id = @id
    select @paidtodatestring

    Part 3:

    declare @id integer
    declare @paidtodatestring varchar(max)
    set @id = 2
    select @paidtodatestring = ''

    select @paidtodatestring = ''
    select @paidtodatestring = @paidtodatestring +
    e.paidtodate_code + replicate(' ',(3-len(e.paidtodate_code)))+  
    replicate('0',(11-len(REPLACE(CAST(e.paidtodate_amount AS VARCHAR(11)),'.','')))) + REPLACE(CAST(e.paidtodate_amount AS VARCHAR(11)),'.','')
    from dbo.test_table e
    where e.id = @id
    order by e.paidtodate_id
    select @paidtodatestring

    Part 4:

    drop index ix_test_table on dbo.test_table

    Part 5:

    declare @id integer
    declare @paidtodatestring varchar(max)
    set @id = 2
    select @paidtodatestring = ''
    select @paidtodatestring = @paidtodatestring +
    e.paidtodate_code + replicate(' ',(3-len(e.paidtodate_code)))+  
    replicate('0',(11-len(REPLACE(CAST(e.paidtodate_amount AS VARCHAR(11)),'.','')))) + REPLACE(CAST(e.paidtodate_amount AS VARCHAR(11)),'.','')
    from dbo.test_table e
    where e.id = @id
    order by e.paidtodate_id
    select @paidtodatestring


    Victor Ambruso

    Friday, December 07, 2012 3:07 PM

Answers

  • Hello Victor,

    This is actually how it's supposed to work based on scalar variable assignment. What is happening is that the last row in a result set is the value that is returned when using variable assignment across a set. Since your query returns a set (there are actually 6 rows returned) the last row is actually what is returned. In the above case, depending on how the query is executed it will return different results based on the VARIABLE but not based on the result set. Thus SQL Server is still returning the identitcal result set, but the logic behind the scalar variable assignment is not correct (i.e. assuming that sql server will do it at the end of the execution plan and iterate across all of the rows).

    In your case, above, the sort operator is moving from right after the table scan of the heap to the end of the result set right before the select once the index is added into the mix. This, I'm not sure why it happens - I'm not that deep into the optimizer yet. If you do look though, you'll see it's doing exactly what you ask of it, the blame lies on the assumption that all of the rows are computed at the end in a row by row basis for the variable which is correctly returning the last item in the result set.

    See this demo where I added @I which serves to show how - based on the execution plan - SQL Server uses the inputs of logical operators to others. In this case, when ordered, row 'Three' is the "last" row in the result set and it correctly shows this.

    CREATE TABLE test
    (
    ID	INT NOT NULL,
    TheMessage	VARCHAR(100) NOT NULL
    );
    GO
    
    INSERT INTO test(ID, TheMessage)
    VALUES
    (1,'First'),
    (2,'Second'),
    (1,'Third'),
    (2,'Fourth'),
    (1,'Fifth'),
    (2,'Sixth'),
    (1,'Seventh')
    GO
    
    -- should work fine
    DECLARE @TotalMessage VARCHAR(1000)
    DECLARE @I INT
    SET @I = 0
    SET @TotalMessage = 'Start!'
    SELECT @TotalMessage = @TotalMessage + REPLACE(TheMessage, 'Z', '@') + '!', @I += 1
    FROM test
    WHERE ID = 1
    SELECT @TotalMessage, @I
    GO
    
    --should still work ok with sorted first 
    DECLARE @TotalMessage VARCHAR(1000)
    DECLARE @I INT
    SET @I = 0
    SET @TotalMessage = 'Start!'
    SELECT @TotalMessage = @TotalMessage + REPLACE(TheMessage, 'Z', '@') + '!', @I += 1
    FROM test
    WHERE ID = 1
    ORDER BY TheMessage
    SELECT @TotalMessage, @I
    GO
    
    --shouldn't work
    CREATE NONCLUSTERED INDEX IX_1 ON test(ID)
    GO
    DECLARE @TotalMessage VARCHAR(1000)
    DECLARE @I INT
    SET @I = 0
    SET @TotalMessage = 'Start!'
    SELECT @TotalMessage = @TotalMessage + REPLACE(TheMessage, 'Z', '@') + '!', @I += 1
    FROM test
    WHERE ID = 1
    ORDER BY TheMessage
    SELECT @TotalMessage, @I
    GO
    
    --still works though do to where the sort operator
    -- is and how the scalar values are computed
    -- since the last sorted value will be the resultant
    -- scalar overall @TotalMessage value based on
    -- how variable assignment works
    DECLARE @TotalMessage VARCHAR(1000)
    DECLARE @I INT
    SET @I = 0
    SET @TotalMessage = 'Start!'
    SELECT @TotalMessage = @TotalMessage + REPLACE(TheMessage, 'Z', '@') + '!', @I += 1
    FROM test
    WHERE ID = 1
    SELECT @TotalMessage, @I
    GO
    
    --drop index
    DROP INDEX IX_1 ON Test
    GO

    -Sean


    Sean Gallardy | Blog | Twitter

    Saturday, December 08, 2012 2:43 AM
    Answerer