Update Multiple Rows Using Two Tables

Traitée Update Multiple Rows Using Two Tables

  • Thursday, February 14, 2013 9:43 PM
     
      Has Code

    I'm actually trying to create a function in order to use it as a formula for a computed column. Eseentially here's the gist:

    The table I need to apply the computed column to has two columns of interest: INVOICE_DATE and INVOICE_DUE_DATE. INVOICE_DATE is already known. I need to calculate the INVOICE_DUE_DATE. Now the INVOICE_DUE_DATE is based on another column in the INVOICE table - the CL_CLODE table.

    In the CLIENT table there is a CL_CODE - the primary key. There is also a CLIENT_TERMS column. This is an int (i.e. 15, 30, 45). This column gives the terms for each client. What I need to do is depending on what value is in the CL_CODE column in the INVOICE table, is to add the CL_TERMS value to the INVOICE_DATE value to get the due date:

    Here is the function I came up with:

    ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5))
    RETURNS DATE
    --WITH SCHEMABINDING
    AS
    	BEGIN
    		RETURN
    		(select CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = @CL_CODE)
     
    	END

    It compiles fine, but when I use it in the column, I get the subquery returns multiple rows error. If I run this standalone statement, I get the right values just fine:

    select CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = sdi.cl_code

    Not sure how I would go about this. I can't do an IN...

    Thanks@


    A. M. Robinson

All Replies

  • Thursday, February 14, 2013 10:03 PM
     
     Answered Has Code

    are you saying to can't do an inner join?  I don't know enough to say it can't be done this way but

    that is how I would try.  Here is a sample that you can apply your calculation to:

    UPDATE
        MyFirstTable
    SET
        MyFirstTable.Calculated = first.Value1 + second.Value2
    FROM
        myFirstTable first
    INNER JOIN
        myOtherTable second
    ON 
        first.cl_code = second.cl_code

    Does this help?

    countryStyle

    • Marked As Answer by ansonee Monday, February 18, 2013 4:34 AM
    •  
  • Thursday, February 14, 2013 10:08 PM
     
     

    The standalone update would work fine if I were doing a one time update, but what I need to do is create a computed column in a table that will continually have rows inserted into it. You can't do an UPDATE in a function or a computed column...

    Running a mass update on that table every night would just be inefficient.


    A. M. Robinson

  • Thursday, February 14, 2013 10:11 PM
    Moderator
     
     
    You can add SELECT TOP (1) to always return just one row.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

  • Thursday, February 14, 2013 10:25 PM
     
      Has Code

    SELECT TOP(1) somewhat helps, but the results are pretty skewed. For example, we have a client TGT who has terms of 15 days. They have an invoice dated 9/20/2011. When I run my standalone SELECT, the due date is calculated properly - 10/5/2011. 

    But when I look at the date in my computed column which uses my function, the date actually goes BACK in time - 9/16/2011!

    Here is the function:

    ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5))
    RETURNS DATE
    --WITH SCHEMABINDING
    AS
    	BEGIN
    		RETURN
    		(select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = @CL_CODE)
     
    	END


    A. M. Robinson

  • Thursday, February 14, 2013 11:13 PM
     
     Proposed

    Your function just looks wrong to me:

    select CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = @CL_CODE

    This can only work out if both tables are one-row tables.

    In any case, I am not particularly happy with a computed column based on a UDF that performs dat access. If multiple rows are retrieved - and that includes when the colunm is used in WHERE clauses - this can be quite slow.

    I think you should look into a view instead.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
  • Friday, February 15, 2013 1:39 AM
    Moderator
     
     

    Erland is right in his observations. Your UDF doesn't make much sense (as I now looked closer at it).

    Can you explain what you were trying to do?


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

  • Friday, February 15, 2013 3:14 PM
     
      Has Code

    The function reads a row in a table to get the CL_CODE. It then finds the corresponding CL_CODE in a lookup table. Based on the CL_CODE it performs the DATEADD function to a value in the first table.

    For example Table 1 has invoice, invoice_date, and, invoice_due_date, and cl_code. Table 2 has a cl_code column and a cl_terms column.

    I want to, for each invoice in Table 1, look up the cl_code in table 2 and add the cl_terms value to the invoice date in TAble 1.

    For example, in Table 1 there is an invoice number 12345. The cl_code is TGT. The invoice_date is 1/1/2013.

    In Table 2, cl_code TGT has cl_terms of 30 days. I want to update invoice_due_date in Table 1 to 2/1/2013 (adding 30 days to 1/1/2013).

    Here is some sample data:

    INVOICE_NUMBER	CLIENT_CODE	CLIENT_TERMS	INVOICE_DATE	INVOICE_DUE_DATE
    43872		car		30		2011-09-01	2011-10-01
    43873		ftb		30		2011-09-01	2011-10-01
    43874		ftb		30		2011-09-01	2011-10-01
    43875		ftb		30		2011-09-01	2011-10-01
    43876		ftb		30		2011-09-01	2011-10-01
    43877		BBY		60		2011-09-02	2011-11-01
    43878		cof		15		2011-09-02	2011-09-17
    43879		cof		15		2011-09-02	2011-09-17
    43880		tgt		15		2011-09-02	2011-09-17
    43881		tgt		15		2011-09-02	2011-09-17

    This is the query I ran to get the results:

    select  
      sdi.ar_inv_nbr as INVOICE_NUMBER,
      sdi.cl_code as CLIENT_CODE,
      sdc.CL_TERMS AS CLIENT_TERMS,
      sdi.ar_inv_date as INVOICE_DATE,
      CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) AS INVOICE_DUE_DATE
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = sdi.cl_code
    What I'm trying to do is create a computed column so that every time a new invoice is entered, the due date is automatically calculated, without haveing to do an update on the table.


    A. M. Robinson


    • Edited by ansonee Friday, February 15, 2013 4:14 PM more info
    •  
  • Friday, February 15, 2013 5:10 PM
    Moderator
     
     
    Computed column which is UDF based is a bad idea. You better use a trigger for the update.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

  • Sunday, February 17, 2013 2:13 PM
     
     

    I've got several other tables that have computed columns making use of a UDF and the performance is fine. I need to use a UDF since the column value is based on values from two or three different tables, and a computed column can only reference the table it's in.

    Also, these computed columns are in staging tables for a data warehouse that are being loaded via a data flow task, which doesn't fire triggers. 

    Or is there a way to get triggers to fire during a data flow task?


    A. M. Robinson

  • Sunday, February 17, 2013 3:01 PM
     
      Has Code

    Your function as written

    ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5))
    RETURNS DATE
    --WITH SCHEMABINDING
    AS
    	BEGIN
    		RETURN
    		(select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = @CL_CODE)
     
    	END

    does not have any relationship between the sdc table and the sdi table.  That effectively makes this a cross join rather than an inner join.  You need to specify the relationship between the tables, either

    ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5))
    RETURNS DATE
    --WITH SCHEMABINDING
    AS
    	BEGIN
    		RETURN
    		(select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc, staging_dim_ar_invoice sdi
      where sdc.cl_code = @CL_CODE
      AND sdc.cl_code = sdi.cl_code)
     
    	END

    or, and this would be my preferred practice for joining tables, use the JOIN syntax.

    ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5))
    RETURNS DATE
    --WITH SCHEMABINDING
    AS
    	BEGIN
    		RETURN
    		(select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
      from staging_dim_client sdc 
    INNER JOIN staging_dim_ar_invoice sdi ON sdc.cl_code = sdi.cl_code
      where sdc.cl_code = @CL_CODE)
     
    	END

    It has been my experience that people who use the INNER JOIN syntax make this sort of error less often since the syntax forces you to specify the ON condition.

    Tom

  • Sunday, February 17, 2013 4:50 PM
     
     

    I've got several other tables that have computed columns making use of a UDF and the performance is fine.

    It's alright (at least decently alright) to use a scalar T-SQL UDF in a computed column as long as it does not perform data access. Below is a repro that demonstrates the effect of a computed column that includes data access to another table. The problem is that scalar functions are never inlined, but they are called for every row, which precludes the optimizer from optimizing.

    I need to use a UDF since the column value is based on values from two or three different tables, and a computed column can only reference the table it's in.

    As you see in the repro below, you can use a view instead,

    Also, these computed columns are in staging tables for a data warehouse that are being loaded via a data flow task, which doesn't fire triggers. 

    Or is there a way to get triggers to fire during a data flow task?

    I don't know SSIS, but generally with bulk-load operations, triggers do not fire by default, but you can request this to happen. However, if a view is not desireble, I would suggest that you leave this column NULL, and then assign thhe value in a subsequent step after the data-flow task.

    Here is a repro that shows the effect of a computed column that uses a UDF with data access. Everything down to the dashed line is just a setup to fill a big table. Between the dashed lines, I run the saem SELECT both from a view and from a table with a conputed column. I use SELECT INTO, to avoid that rendering in SSMS affacts the timings.

    CREATE TABLE objects (object_id int NOT NULL,
                          name      sysname NOT NULL,
          CONSTRAINT pk_objects PRIMARY KEY (object_id)
    )
    go
    CREATE FUNCTION get_objname (@object_id int) RETURNS sysname
    WITH SCHEMABINDING AS
    BEGIN
       RETURN (SELECT name FROM dbo.objects WHERE object_id = @object_id)
    END
    go
    CREATE TABLE columns (object_id   int    NOT NULL,
                          column_id   bigint NOT NULL,
                          object_name AS dbo.get_objname(object_id),
       CONSTRAINT pk_columns PRIMARY KEY (object_id, column_id),
       CONSTRAINT fk_columns_objects FOREIGN KEY (object_id)
          REFERENCES objects(object_id)
    )
    go
    CREATE VIEW columns_v AS
        SELECT c.object_id, c.column_id, o.name as object_name
        FROM   columns c
        JOIN   objects o ON c.object_id = c.object_id
    go
    INSERT objects (object_id, name)
      SELECT object_id, name
      FROM   sys.objects o
      WHERE  EXISTS (SELECT *
                     FROM   sys.columns c
                     WHERE  o.object_id = c.object_id)
    go
    INSERT columns (object_id, column_id)
       SELECT a.object_id, row_number() OVER (PARTITION BY a.object_id ORDER BY (SELECT 1))
       FROM  sys.columns a
       CROSS JOIN  sys.columns b
       WHERE a.column_id < 1000

    go
    ------------- Real test starts here --------------
    DECLARE @d datetime2(3) = sysdatetime()
    SELECT object_name, COUNT(*) as cnt
    INTO  #temp1                                                                                                                                     A
    FROM   columns_v
    GROUP  BY object_name
    PRINT 'SELECT from view took ' + ltrim(str(datediff(ms, @d, sysdatetime()))) + ' ms.'
    DROP TABLE #temp1
    go
    DECLARE @d datetime2(3) = sysdatetime()
    SELECT object_name, COUNT(*) as cnt
    INTO   #temp1
    FROM   columns
    GROUP  BY object_name
    PRINT 'SELECT from table took ' + ltrim(str(datediff(ms, @d, sysdatetime()))) + ' ms.'
    DROP TABLE #temp1
    go
    -------------------- Clean up ---------------------
    DROP TABLE columns
    DROP FUNCTION get_objname
    DROP TABLE objects
    DROP VIEW columns_v

    This was the output I got:

    (54 row(s) affected)

    (242064 row(s) affected)

    (54 row(s) affected)
    SELECT from view took 28 ms.

    (54 row(s) affected)
    SELECT from table took 2348 ms.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
  • Monday, February 18, 2013 4:12 AM
     
      Has Code
    
    However, if a view is not desireble, I would suggest that you leave this column NULL, and then assign thhe value in a subsequent step after the data-flow task.

    That was my other option I mentioned intitially in my post, which was why I provided the SQL I used to just do a SELECT and got the results I I was expecting. What I need to know is how can I take that original SQL statement and use it to update each row, based on that row's CL_CODE value.

    I have no problem with doing the update after the load, I just am having issues with the SQL for doing it. THe only value the two tables have in common is the CL_CODE. As I mentioned before, when using SELECT TOP(), the wrong values are returned. In one of my earlier repies, I posted the results of using SELECT TOP(). The values returned were way off.


    A. M. Robinson

  • Monday, February 18, 2013 4:34 AM
     
      Has Code

    Just as I hit ENTER, I got the answer:

    UPDATE
        sdi
    SET
        sdi.ar_inv_due_date = CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) 
    FROM
        staging_dim_ar_invoice sdi
    INNER JOIN
        staging_dim_client sdc 
    ON 
        sdi.cl_code = sdc.cl_code

    Thanks for the help everyone!!

    A. M. Robinson

  • Monday, February 18, 2013 4:36 AM
     
     
    Thanks Tom...now that I look at yours, I completely realize what I was missing...!

    A. M. Robinson