none
mass replace of DB codes RRS feed

  • Question

  • Hi,

    I have a column in a table that contains variable semi-colon DB codes such as:

    ;code1;code2;

    ;code2;

    ;code1;code3;code4;

    A separate table contains the text for each code.

    I have a scalar function that converts them to

    PlainText1; PlainText2

    PlainText2

    PlainText1; PlainText3; PlaintText4

    However the performance is terrible on large rows.

    What query could I write that doesn't use a scalar function?

    Thanks

    Saturday, July 20, 2019 3:09 PM

All replies

  • It would be best to follow relational database design best practices (atomic data in each column and no repeating groups) rather than store delimited lists. Violating these database normalization rules often results in poor performance when using a relational database.

    With your existing design, you could use CROSS APPLY and a table-valued function to return each value as a row and then join to the other table. This may bring the query duration down to an acceptable time despite the flawed design.


    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Saturday, July 20, 2019 3:38 PM
  • What query could I write that doesn't use a scalar function?

    Try

    -- code #1 v2
    with 
    DBCodes_rows as (
    SELECT T.key as ID, SS.ItemNumber as seq, SS.Item
      from tbDBcodes as T
           cross apply dbo.DelimitedSplit8K (substring (T.codes, 2, len (T.codes) -2), ';') as SS
    --where ... ), DBPlainText_rows as ( SELECT C.ID, C.seq, coalesce (T.text, '?') as plaintext from DBCodes_rows as C left join tbTextCode as T on T.code = C.Item ) SELECT T1.ID, stuff ((SELECT ';' + T2.plaintext from DBPlainText_rows as T2 where T2.ID = T1.ID order by T2.seq for xml path('')), 1, 0,'') + ';' as plaintexts from DBPlainText_rows as T1 group by T1.ID;

    I didn't test it; it may contain error(s).

    tbDBCodes: name of the table that only contains code lists.
       key: column that uniquely identifies each row
       codes: column containing the code list 

    tbTextCode: name of the table containing the text of each code.
      code:
      text:

    The source code of the DelimitedSplit8K function is in the article Tally OH! An Improved SQL 8K CSV Splitter Function (by Jeff Moden), figure 21.

    ---

    The STRING_SPLIT() function, available from the 2016 SQL Server version, is not reliable for cases in which the values must be returned in the same original position.

     

    Remember to mark/vote this answer if it helped you.


    José Diz     Belo Horizonte, MG - Brasil     [T-SQL performance tuning: Porto SQL]


    Este conteúdo é fornecido sem garantias de qualquer tipo, seja expressa ou implícita.

    • Edited by José Diz Sunday, July 21, 2019 10:07 AM
    Saturday, July 20, 2019 4:24 PM
  • Hi,

    Which version of sql server? If 2014 or higher then you can use string split function to get the codes and then join with table containing texts for codes. Use String agg function to join if necessary.

    Rgds,

    Soumen


    Sunday, July 21, 2019 3:20 AM
  • Hi,
    Which version of sql server? If 2014 or higher then you can use string split function to get the codes and then join with table containing texts for codes. Use String agg function to join if necessary.
    Rgds,
    Soumen

    The STRING_SPLIT() function, available from the 2016 SQL Server version, is not reliable for cases in which the values must be returned in the same original position.



    José Diz     Belo Horizonte, MG - Brasil     [T-SQL performance tuning: Porto SQL]




    • Edited by José Diz Sunday, July 21, 2019 12:26 PM
    Sunday, July 21, 2019 9:54 AM
  • The STRING_SPLIT() function, available from the 2016 SQL Server version, is not reliable for cases in which the values must be returned in the same original position. 

    I don't believe ordinal position is significant here so STRING_SPLIT might be an option in SQL 2016 and later.

    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Sunday, July 21, 2019 12:54 PM
  • If you have a limited number of codes need to be converted, multiple hard-coded REPLACE function should work better.
    Monday, July 22, 2019 2:01 PM
    Moderator