Saturday, January 05, 2013 12:23 AM
We have about 20 source tables. There are columns in those source tables which may contain bad data and cause package to fail. I'm looking for some advice on how to capture bad rows on any table and redirect them to ONE generic table (because otherwise I'd have to add error dump tables for each source table and that gets out of hand very quickly,) and do this in a performant manner.
I'd like to do this with code, that is a custom component or script.
Saturday, January 05, 2013 1:12 AM
You probably can achieve it by same target table instead of creating a new table. For all the incoming rows, you will test data quality of each column, redirect them to separate data path for any issues and mark the record as "bad" by making "bad" flag as True and by also mentioning the reason in some another column e.g. input data is not an integer (say). By the end of data processing you will have set of good records having "bad" flag as False and set of bad/junk records having "bad" flag as True. For further processing you simply can take records having "bad" flag as False. For data cleansing at source you will also have explanations that you can code in your SSIS package easily and send back to the source owner. In short each table will have a flag to distinguish good/bad record and also the explanation why it is bad. This model works well when you have varchar/nvarchar datatype for all the columns in the target table.
If you want to implement this design using separate table, I'd say, combine all the bad records using UNION ALL and then put it in the table which holds all the bad data. It'd be good to specify the reason of the error as well. So basically, if for a row there is an issue in a particular column you will redirect it to the UNION ALL, similarly if some other record has issue with some different column that would be handled in different part of your data flow. At the last you will combine all the Junk rows and push them into your table.
Saturday, January 05, 2013 2:02 PMThanks for your reply. I see what you are saying, but unfortunate we cannot mix bad data with good in our table and we cannot change the schema either.
Sunday, January 06, 2013 5:36 AM
You can simply redirect the error rows and union them and dump it into one table.
Are bad rows means some datatype mismatch or something ?
Please clear bit more..?
Sunday, January 06, 2013 7:13 AM
try this link for similar example - http://sathyadb.blogspot.in/2012/09/sql-server-integration-services-error.html
Thanks & Regards, sathya
Sunday, January 06, 2013 7:25 PMModerator
you can use CONFIGURE ERROR OUTPUT in source to extract bad data rows and redirect them into ole db destination, then connect ole db destination to the specific table as you want.. you can put table name in that logging table with Derived Column Transform as well.
here you can see a sample of handling bad data rows in destination (you can implement same for source):
- Marked As Answer by Eileen ZhaoMicrosoft Contingent Staff, Moderator Friday, January 11, 2013 7:32 AM