Need help writing SQL Query - Capture rows with issues and load the remaining rows into TargetTable

Question
-
CREATE TABLE #SourceTable (ID INT, Product VARCHAR(100), Qty INT, ErrorMessage VARCHAR(2000))
CREATE TABLE #TargetTable (ID INT, Product VARCHAR(6), Qty INT NOT NULL)

INSERT INTO #SourceTable (ID, Product, Qty)
SELECT 1, 'Bike', 10 UNION ALL
SELECT 10, 'Computer', 8 UNION ALL
SELECT 18, 'Phone', 5 UNION ALL
SELECT 23, 'Tablet', 45 UNION ALL
SELECT 100, 'Car', NULL

SELECT * FROM #SourceTable
I have to load data from SourceTable to TargetTable every day, but the load should not fail even if a few source rows have issues.
In this example, I have 5 rows that I have to load into the Target table.
By looking at the data, we know the row with ID=10 will fail because Product is longer in Source than the Target column allows.
Also, the row with ID=100 will fail because Qty is NULL in the Source table but the Target column is NOT NULL.
The load should not fail because of these 2 rows. The remaining 3 rows should make it to the Target table, and for the 2 rows with issues, update the ErrorMessage column in the same SourceTable with the error messages they would have raised during the load.
This is just a small table I gave as an example, but in my case I have quite a few columns and heavy data. There might be many rows that could cause the load to fail.
Not in SSIS. This should be a SQL query, like INSERT INTO Target ... SELECT ... FROM SourceTable.
Friday, December 21, 2018 9:21 PM
All replies
-
CREATE TABLE #SourceTable (ID INT, Product VARCHAR(100), Qty INT, ErrorMessage VARCHAR(2000))
CREATE TABLE #TargetTable (ID INT, Product VARCHAR(6), Qty INT NOT NULL)

INSERT INTO #SourceTable (ID, Product, Qty)
SELECT 1, 'Bike', 10 UNION ALL
SELECT 10, 'Computer', 8 UNION ALL
SELECT 18, 'Phone', 5 UNION ALL
SELECT 23, 'Tablet', 45 UNION ALL
SELECT 100, 'Car', NULL

UPDATE #SourceTable SET ErrorMessage = 'truncation occurs' WHERE LEN(Product) > 6;
UPDATE #SourceTable SET ErrorMessage = 'Qty is null' WHERE Qty IS NULL;

MERGE #TargetTable t
USING #SourceTable s ON s.ID = t.ID
WHEN NOT MATCHED AND ErrorMessage IS NULL THEN
    INSERT (ID, Product, Qty) VALUES (s.ID, s.Product, s.Qty);

SELECT * FROM #TargetTable
SELECT * FROM #SourceTable

DROP TABLE #SourceTable, #TargetTable
Friday, December 21, 2018 9:54 PM -
Thanks. But I have to capture the errors that occurred during the load process and record them in the ErrorMessage column. There might be plenty of reasons the load could fail for a few rows. A single row can fail because of issues with the data in two or more columns, and if one row fails for many reasons, all the messages should be concatenated in the ErrorMessage column for that row.
My real source table has around 15 columns, and errors can occur in the data of one or more columns for the same row. - Edited by Leo00 Friday, December 21, 2018 10:13 PM
Friday, December 21, 2018 10:11 PM -
If you want multiple error messages for the same row, change it to:
Update #SourceTable Set ErrorMessage = concat(ErrorMessage, ' Qty is null') WHERE Qty is null
If you want to trap errors that occur when you actually load the data, that is, when you run the INSERT/UPDATE/MERGE or whatever you do, that is not possible. Each statement is atomic, so if one row fails, all other rows are rolled back.
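A minimal sketch of this validate-first pattern, assuming the sample tables from the question (the error texts and the LEN(Product) > 6 check are illustrative, derived from the VARCHAR(6) and NOT NULL definitions of the Target columns; add one UPDATE per business rule):

```sql
-- Clear any messages from a previous run, then accumulate one message per
-- failed check. CONCAT treats a NULL ErrorMessage as an empty string, so
-- a row failing several checks collects all its messages.
UPDATE #SourceTable SET ErrorMessage = NULL;

UPDATE #SourceTable
SET ErrorMessage = CONCAT(ErrorMessage, 'Product longer than 6 characters; ')
WHERE LEN(Product) > 6;

UPDATE #SourceTable
SET ErrorMessage = CONCAT(ErrorMessage, 'Qty is null; ')
WHERE Qty IS NULL;

-- Only rows that passed every check are loaded.
INSERT INTO #TargetTable (ID, Product, Qty)
SELECT ID, Product, Qty
FROM #SourceTable
WHERE ErrorMessage IS NULL;
```

Since every check is a plain set-based UPDATE, this stays fast even on large tables, but it only catches the error conditions you check for explicitly.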
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
- Proposed as answer by Naomi N Sunday, December 23, 2018 2:54 AM
Friday, December 21, 2018 10:27 PM -
You can do this using SSIS tasks or using T-SQL.
Doing this in T-SQL would be best, as it uses set-based logic, compared to SSIS, which works in batches.
In SSIS you can configure the error output within data flow tasks to capture the error code and description, along with column details, to your error destination.
For capturing business-logic related errors, you can use an Execute SQL Task, conditional logic inside the data flow, or even a Script Task to do the validations.
As an example, see how you can do this here:
https://prathy.com/2017/01/validating-data-using-ssis/
Doing it in T-SQL requires a procedure with separate UPDATE statements that check each condition and append the error details to the error description column.
You can also implement this using constraints and rules.
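For instance, a sketch of the constraint approach, assuming the sample tables from the question. Note this enforces validity rather than skipping bad rows: a violating row makes the whole multi-row INSERT fail, so it complements rather than replaces the error-capturing logic.

```sql
-- Illustrative: enforce the business rule declaratively on the target.
-- An unnamed constraint is used here because named constraints on #temp
-- tables can collide across sessions; on a permanent table you would
-- give it an explicit name.
ALTER TABLE #TargetTable ADD CHECK (LEN(Product) <= 6);
```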
Please Mark This As Answer if it solved your issue
Please Vote This As Helpful if it helps to solve your issue
Visakh
----------------------------
My Wiki User Page
My MSDN Page
My Personal Blog
My Facebook Page
Saturday, December 22, 2018 9:32 AM -
Hi Leo00,
As Erland said, 'Each statement is atomic, so if one row fails, all other rows are rolled back'. When you insert your rows together and one of them raises an error, the whole operation fails.
To meet your requirement, I suggest inserting your rows one by one. In the following script, you execute the procedure 'CatchErrorDemo' once per row. Each execution either inserts the row into the target table or updates the error message in the source table.
Please try it and see if it satisfies your requirement.
---drop table #SourceTable
---drop table #TargetTable
CREATE TABLE #SourceTable (ID INT, Product VARCHAR(100), Qty INT, ErrorMessage VARCHAR(2000))
CREATE TABLE #TargetTable (ID INT, Product VARCHAR(6), Qty INT NOT NULL)

INSERT INTO #SourceTable (ID, Product, Qty)
SELECT 1, 'Bike', 10 UNION ALL
SELECT 10, 'Computer', 8 UNION ALL
SELECT 18, 'Phone', 5 UNION ALL
SELECT 23, 'Tablet', 45 UNION ALL
SELECT 100, 'Car', NULL

SELECT * FROM #SourceTable
GO
CREATE PROCEDURE CatchErrorDemo @a INT
AS
BEGIN TRY
    INSERT INTO #TargetTable
    SELECT ID, Product, Qty
    FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rn FROM #SourceTable) t
    WHERE rn = @a
END TRY
BEGIN CATCH
    UPDATE a
    SET a.ErrorMessage = ERROR_MESSAGE()
    FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rn FROM #SourceTable) t
    JOIN #SourceTable a ON t.ID = a.ID
    WHERE t.rn = @a
END CATCH;
GO
EXEC CatchErrorDemo 1;
EXEC CatchErrorDemo 2;
EXEC CatchErrorDemo 3;
EXEC CatchErrorDemo 4;
EXEC CatchErrorDemo 5;

SELECT * FROM #SourceTable
/*
ID    Product    Qty   ErrorMessage
----- ---------- ----- ------------------------------------------------------------
1     Bike       10    NULL
10    Computer   8     String or binary data would be truncated.
18    Phone      5     NULL
23    Tablet     45    NULL
100   Car        NULL  Cannot insert the value NULL into column 'Qty', table 'tempdb.dbo.#TargetTable________________________________________________________________________________________________________00000000000D'; column does not allow nulls. INSERT fails.
*/
SELECT * FROM #TargetTable
/*
ID    Product  Qty
----- -------- -----
1     Bike     10
18    Phone    5
23    Tablet   45
*/
Hope it can help you.
Best Regards,
Rachel
MSDN Community Support
Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.
Monday, December 24, 2018 6:58 AM -
I came up with the query below and it is working. But as I have millions of rows to load into the TargetTable, can I modify this query to improve performance?
DECLARE @ID INT
DECLARE @Product VARCHAR(100)
DECLARE @Qty INT
DECLARE @ErrorMessage VARCHAR(2000)

DECLARE SourceCur CURSOR FOR
SELECT ID, Product, Qty FROM #SourceTable

OPEN SourceCur
FETCH NEXT FROM SourceCur INTO @ID, @Product, @Qty

BEGIN_OF_LOOP:
BEGIN TRY
    WHILE @@FETCH_STATUS = 0
    BEGIN
        INSERT INTO #TargetTable (ID, Product, Qty)
        VALUES (@ID, @Product, @Qty)

        FETCH NEXT FROM SourceCur INTO @ID, @Product, @Qty
    END
END TRY
BEGIN CATCH
    SELECT @ErrorMessage = ERROR_MESSAGE()

    UPDATE #SourceTable
    SET ErrorMessage = @ErrorMessage
    WHERE ID = @ID

    FETCH NEXT FROM SourceCur INTO @ID, @Product, @Qty
    GOTO BEGIN_OF_LOOP
END CATCH;

CLOSE SourceCur;
DEALLOCATE SourceCur;
- Edited by Leo00 Wednesday, December 26, 2018 3:01 PM
Wednesday, December 26, 2018 3:00 PM -
Unfortunately, for the general case you do need to do something like that to be able to catch individual errors.
There are a few things you can do to improve performance. The first is to make the cursor STATIC LOCAL, since the default cursor type is dynamic and not very efficient. I would not expect any dramatic performance improvement in this case, but at least it is simple.
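Applied to the cursor in your query, that change is just the declaration line:

```sql
-- STATIC materializes the result set once in tempdb instead of re-reading
-- the base table; LOCAL scopes the cursor to this batch or procedure
-- rather than the whole connection.
DECLARE SourceCur CURSOR STATIC LOCAL FOR
SELECT ID, Product, Qty FROM #SourceTable
```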
Other ways to speed things up are more complex. What you can do is attempt to insert a batch of rows at a time, and if this fails, fall back to a loop that runs one row at a time. Exactly how many rows there should be in a batch depends on how common you expect errors to be. If you expect one row in a thousand to have an error, and you have a batch size of a thousand, most of the batches will be wasted effort, so in that case a batch size of 50 or 100 is better.
The technique would be to add a batchno column to the source, and first populate this column so that there are 50 (or whatever) rows with batchno = 1, 50 rows with batchno = 2, etc. When a batch fails with an error, you trap that and renumber those rows with new batch numbers, now with a batch size of 1.
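The batching idea could be sketched roughly like this, assuming the sample tables from this thread; the batchno column, the @batchsize of 50, and the per-row fallback loop are all illustrative choices, not the only way to structure it:

```sql
-- Illustrative sketch: batch insert with per-row fallback on failure.
ALTER TABLE #SourceTable ADD batchno INT;

DECLARE @batchsize INT = 50, @batch INT = 1, @maxbatch INT;

-- Assign batch numbers: the first 50 rows get batchno = 1, the next 50
-- get batchno = 2, and so on.
UPDATE s
SET batchno = (rn - 1) / @batchsize + 1
FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY ID) AS rn FROM #SourceTable) s;

SELECT @maxbatch = MAX(batchno) FROM #SourceTable;

WHILE @batch <= @maxbatch
BEGIN
    BEGIN TRY
        INSERT INTO #TargetTable (ID, Product, Qty)
        SELECT ID, Product, Qty FROM #SourceTable WHERE batchno = @batch;
    END TRY
    BEGIN CATCH
        -- The whole batch was rolled back; retry its rows one by one so
        -- only the genuinely bad rows are skipped and get a message.
        DECLARE @ID INT;
        DECLARE RowCur CURSOR STATIC LOCAL FOR
            SELECT ID FROM #SourceTable WHERE batchno = @batch;
        OPEN RowCur;
        FETCH NEXT FROM RowCur INTO @ID;
        WHILE @@FETCH_STATUS = 0
        BEGIN
            BEGIN TRY
                INSERT INTO #TargetTable (ID, Product, Qty)
                SELECT ID, Product, Qty FROM #SourceTable WHERE ID = @ID;
            END TRY
            BEGIN CATCH
                UPDATE #SourceTable
                SET ErrorMessage = ERROR_MESSAGE()
                WHERE ID = @ID;
            END CATCH;
            FETCH NEXT FROM RowCur INTO @ID;
        END;
        CLOSE RowCur; DEALLOCATE RowCur;
    END CATCH;
    SET @batch = @batch + 1;
END;
```

With mostly clean data, almost every batch succeeds in a single set-based INSERT, and the slow row-by-row path runs only for the occasional batch that contains a bad row.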
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Wednesday, December 26, 2018 3:50 PM