Update Multiple Rows Using Two Tables
-
Thursday, February 14, 2013 9:43 PM
I'm actually trying to create a function in order to use it as a formula for a computed column. Eseentially here's the gist:
The table I need to apply the computed column to has two columns of interest: INVOICE_DATE and INVOICE_DUE_DATE. INVOICE_DATE is already known. I need to calculate the INVOICE_DUE_DATE. Now the INVOICE_DUE_DATE is based on another column in the INVOICE table - the CL_CLODE table.
In the CLIENT table there is a CL_CODE - the primary key. There is also a CLIENT_TERMS column. This is an int (i.e. 15, 30, 45). This column gives the terms for each client. What I need to do is depending on what value is in the CL_CODE column in the INVOICE table, is to add the CL_TERMS value to the INVOICE_DATE value to get the due date:
Here is the function I came up with:
ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5)) RETURNS DATE --WITH SCHEMABINDING AS BEGIN RETURN (select CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) from staging_dim_client sdc, staging_dim_ar_invoice sdi where sdc.cl_code = @CL_CODE) END
It compiles fine, but when I use it in the column, I get the subquery returns multiple rows error. If I run this standalone statement, I get the right values just fine:
select CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) from staging_dim_client sdc, staging_dim_ar_invoice sdi where sdc.cl_code = sdi.cl_code
Not sure how I would go about this. I can't do an IN...
Thanks@
A. M. Robinson
All Replies
-
Thursday, February 14, 2013 10:03 PM
are you saying to can't do an inner join? I don't know enough to say it can't be done this way but
that is how I would try. Here is a sample that you can apply your calculation to:
UPDATE MyFirstTable SET MyFirstTable.Calculated = first.Value1 + second.Value2 FROM myFirstTable first INNER JOIN myOtherTable second ON first.cl_code = second.cl_codeDoes this help?
countryStyle
- Marked As Answer by ansonee Monday, February 18, 2013 4:34 AM
-
Thursday, February 14, 2013 10:08 PM
The standalone update would work fine if I were doing a one time update, but what I need to do is create a computed column in a table that will continually have rows inserted into it. You can't do an UPDATE in a function or a computed column...
Running a mass update on that table every night would just be inefficient.
A. M. Robinson
-
Thursday, February 14, 2013 10:11 PMModeratorYou can add SELECT TOP (1) to always return just one row.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Thursday, February 14, 2013 10:25 PM
SELECT TOP(1) somewhat helps, but the results are pretty skewed. For example, we have a client TGT who has terms of 15 days. They have an invoice dated 9/20/2011. When I run my standalone SELECT, the due date is calculated properly - 10/5/2011.
But when I look at the date in my computed column which uses my function, the date actually goes BACK in time - 9/16/2011!
Here is the function:
ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5)) RETURNS DATE --WITH SCHEMABINDING AS BEGIN RETURN (select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) from staging_dim_client sdc, staging_dim_ar_invoice sdi where sdc.cl_code = @CL_CODE) END
A. M. Robinson
-
Thursday, February 14, 2013 11:13 PM
Your function just looks wrong to me:
select CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101)
from staging_dim_client sdc, staging_dim_ar_invoice sdi
where sdc.cl_code = @CL_CODEThis can only work out if both tables are one-row tables.
In any case, I am not particularly happy with a computed column based on a UDF that performs dat access. If multiple rows are retrieved - and that includes when the colunm is used in WHERE clauses - this can be quite slow.
I think you should look into a view instead.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Friday, February 15, 2013 1:38 AM
-
Friday, February 15, 2013 1:39 AMModerator
Erland is right in his observations. Your UDF doesn't make much sense (as I now looked closer at it).
Can you explain what you were trying to do?
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Friday, February 15, 2013 3:14 PM
The function reads a row in a table to get the CL_CODE. It then finds the corresponding CL_CODE in a lookup table. Based on the CL_CODE it performs the DATEADD function to a value in the first table.
For example Table 1 has invoice, invoice_date, and, invoice_due_date, and cl_code. Table 2 has a cl_code column and a cl_terms column.
I want to, for each invoice in Table 1, look up the cl_code in table 2 and add the cl_terms value to the invoice date in TAble 1.
For example, in Table 1 there is an invoice number 12345. The cl_code is TGT. The invoice_date is 1/1/2013.
In Table 2, cl_code TGT has cl_terms of 30 days. I want to update invoice_due_date in Table 1 to 2/1/2013 (adding 30 days to 1/1/2013).
Here is some sample data:
INVOICE_NUMBER CLIENT_CODE CLIENT_TERMS INVOICE_DATE INVOICE_DUE_DATE 43872 car 30 2011-09-01 2011-10-01 43873 ftb 30 2011-09-01 2011-10-01 43874 ftb 30 2011-09-01 2011-10-01 43875 ftb 30 2011-09-01 2011-10-01 43876 ftb 30 2011-09-01 2011-10-01 43877 BBY 60 2011-09-02 2011-11-01 43878 cof 15 2011-09-02 2011-09-17 43879 cof 15 2011-09-02 2011-09-17 43880 tgt 15 2011-09-02 2011-09-17 43881 tgt 15 2011-09-02 2011-09-17
This is the query I ran to get the results:
select sdi.ar_inv_nbr as INVOICE_NUMBER, sdi.cl_code as CLIENT_CODE, sdc.CL_TERMS AS CLIENT_TERMS, sdi.ar_inv_date as INVOICE_DATE, CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) AS INVOICE_DUE_DATE from staging_dim_client sdc, staging_dim_ar_invoice sdi where sdc.cl_code = sdi.cl_code
What I'm trying to do is create a computed column so that every time a new invoice is entered, the due date is automatically calculated, without haveing to do an update on the table.
A. M. Robinson
- Edited by ansonee Friday, February 15, 2013 4:14 PM more info
-
Friday, February 15, 2013 5:10 PMModeratorComputed column which is UDF based is a bad idea. You better use a trigger for the update.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Sunday, February 17, 2013 2:13 PM
I've got several other tables that have computed columns making use of a UDF and the performance is fine. I need to use a UDF since the column value is based on values from two or three different tables, and a computed column can only reference the table it's in.
Also, these computed columns are in staging tables for a data warehouse that are being loaded via a data flow task, which doesn't fire triggers.
Or is there a way to get triggers to fire during a data flow task?
A. M. Robinson
-
Sunday, February 17, 2013 3:01 PM
Your function as written
ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5)) RETURNS DATE --WITH SCHEMABINDING AS BEGIN RETURN (select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) from staging_dim_client sdc, staging_dim_ar_invoice sdi where sdc.cl_code = @CL_CODE) END
does not have any relationship between the sdc table and the sdi table. That effectively makes this a cross join rather than an inner join. You need to specify the relationship between the tables, either
ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5)) RETURNS DATE --WITH SCHEMABINDING AS BEGIN RETURN (select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) from staging_dim_client sdc, staging_dim_ar_invoice sdi where sdc.cl_code = @CL_CODE AND sdc.cl_code = sdi.cl_code) END
or, and this would be my preferred practice for joining tables, use the JOIN syntax.
ALTER FUNCTION [dbo].[InvDueDate](@CL_CODE VARCHAR(5)) RETURNS DATE --WITH SCHEMABINDING AS BEGIN RETURN (select top(1) CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) from staging_dim_client sdc INNER JOIN staging_dim_ar_invoice sdi ON sdc.cl_code = sdi.cl_code where sdc.cl_code = @CL_CODE) END
It has been my experience that people who use the INNER JOIN syntax make this sort of error less often since the syntax forces you to specify the ON condition.
Tom
-
Sunday, February 17, 2013 4:50 PM
I've got several other tables that have computed columns making use of a UDF and the performance is fine.
It's alright (at least decently alright) to use a scalar T-SQL UDF in a computed column as long as it does not perform data access. Below is a repro that demonstrates the effect of a computed column that includes data access to another table. The problem is that scalar functions are never inlined, but they are called for every row, which precludes the optimizer from optimizing.
I need to use a UDF since the column value is based on values from two or three different tables, and a computed column can only reference the table it's in.
As you see in the repro below, you can use a view instead,
Also, these computed columns are in staging tables for a data warehouse that are being loaded via a data flow task, which doesn't fire triggers.
Or is there a way to get triggers to fire during a data flow task?I don't know SSIS, but generally with bulk-load operations, triggers do not fire by default, but you can request this to happen. However, if a view is not desireble, I would suggest that you leave this column NULL, and then assign thhe value in a subsequent step after the data-flow task.
Here is a repro that shows the effect of a computed column that uses a UDF with data access. Everything down to the dashed line is just a setup to fill a big table. Between the dashed lines, I run the saem SELECT both from a view and from a table with a conputed column. I use SELECT INTO, to avoid that rendering in SSMS affacts the timings.
CREATE TABLE objects (object_id int NOT NULL,
name sysname NOT NULL,
CONSTRAINT pk_objects PRIMARY KEY (object_id)
)
go
CREATE FUNCTION get_objname (@object_id int) RETURNS sysname
WITH SCHEMABINDING AS
BEGIN
RETURN (SELECT name FROM dbo.objects WHERE object_id = @object_id)
END
go
CREATE TABLE columns (object_id int NOT NULL,
column_id bigint NOT NULL,
object_name AS dbo.get_objname(object_id),
CONSTRAINT pk_columns PRIMARY KEY (object_id, column_id),
CONSTRAINT fk_columns_objects FOREIGN KEY (object_id)
REFERENCES objects(object_id)
)
go
CREATE VIEW columns_v AS
SELECT c.object_id, c.column_id, o.name as object_name
FROM columns c
JOIN objects o ON c.object_id = c.object_id
go
INSERT objects (object_id, name)
SELECT object_id, name
FROM sys.objects o
WHERE EXISTS (SELECT *
FROM sys.columns c
WHERE o.object_id = c.object_id)
go
INSERT columns (object_id, column_id)
SELECT a.object_id, row_number() OVER (PARTITION BY a.object_id ORDER BY (SELECT 1))
FROM sys.columns a
CROSS JOIN sys.columns b
WHERE a.column_id < 1000go
------------- Real test starts here --------------
DECLARE @d datetime2(3) = sysdatetime()
SELECT object_name, COUNT(*) as cnt
INTO #temp1 A
FROM columns_v
GROUP BY object_name
PRINT 'SELECT from view took ' + ltrim(str(datediff(ms, @d, sysdatetime()))) + ' ms.'
DROP TABLE #temp1
go
DECLARE @d datetime2(3) = sysdatetime()
SELECT object_name, COUNT(*) as cnt
INTO #temp1
FROM columns
GROUP BY object_name
PRINT 'SELECT from table took ' + ltrim(str(datediff(ms, @d, sysdatetime()))) + ' ms.'
DROP TABLE #temp1
go
-------------------- Clean up ---------------------
DROP TABLE columns
DROP FUNCTION get_objname
DROP TABLE objects
DROP VIEW columns_vThis was the output I got:
(54 row(s) affected)
(242064 row(s) affected)
(54 row(s) affected)
SELECT from view took 28 ms.(54 row(s) affected)
SELECT from table took 2348 ms.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se -
Monday, February 18, 2013 4:12 AM
However, if a view is not desireble, I would suggest that you leave this column NULL, and then assign thhe value in a subsequent step after the data-flow task.
That was my other option I mentioned intitially in my post, which was why I provided the SQL I used to just do a SELECT and got the results I I was expecting. What I need to know is how can I take that original SQL statement and use it to update each row, based on that row's CL_CODE value.
I have no problem with doing the update after the load, I just am having issues with the SQL for doing it. THe only value the two tables have in common is the CL_CODE. As I mentioned before, when using SELECT TOP(), the wrong values are returned. In one of my earlier repies, I posted the results of using SELECT TOP(). The values returned were way off.
A. M. Robinson
-
Monday, February 18, 2013 4:34 AM
Just as I hit ENTER, I got the answer:
UPDATE sdi SET sdi.ar_inv_due_date = CONVERT(DATE, DATEADD(dd, sdc.cl_terms, sdi.ar_inv_date), 101) FROM staging_dim_ar_invoice sdi INNER JOIN staging_dim_client sdc ON sdi.cl_code = sdc.cl_code
Thanks for the help everyone!!A. M. Robinson
-
Monday, February 18, 2013 4:36 AMThanks Tom...now that I look at yours, I completely realize what I was missing...!
A. M. Robinson

