Answered by:
SSIS VB Script Issue

Question
-
I have an SSIS package where I am importing records from a gate access database into another database. Both are SQL Server databases. I use data flow components to import the records and a VB Script component to perform other calculations on the way through.
My issue is I need to identify records that I want to flag as duplicates. e.g. A User might swipe their Access Card multiple times when trying to open a gate which generates multiple records in the gate access database, but because the time between each swipe is very small I want to only count the first record and flag the remaining records for the same User within the short time period as duplicates. So, as I process a record I need to check whether it is the same Access Card as the previous record I have processed, and if so, calculate the time between the 2 record date/time fields.
I have tried using Variables to store the Card No and Date/Time of the record being processed but there are limitations on when the Variables can be updated - only in the PostExecute which I believe is after ALL records have been processed.
Does anyone have a simple way of achieving this?
Regards Kevin Seerup
GoalMaker Software Solutions Pty Ltd
www.goalmaker.com.au
Answers
-
Hi Kevin,
You can try to use LAG() and DATEDIFF() functions in the source query to identify time deltas for the same CardNo. This way you won't need a Script Component in the SSIS Data Flow.
Check it out:
-- DDL and sample data population, start DECLARE @mockTbl TABLE ( ID INT IDENTITY(1,1) PRIMARY KEY , [CardNo] VARCHAR(30) , [DateTimeProcessed] DATETIME ); INSERT INTO @mockTbl (CardNo, DateTimeProcessed) VALUES ('100', '2019-09-12 09:00:00') ,('152', '2019-09-12 09:10:00') ,('152', '2019-09-12 09:10:10') ,('160', '2019-09-12 09:12:10'); -- DDL and sample data population, end ;WITH rs AS ( SELECT * , LAG(DateTimeProcessed, 1) OVER (PARTITION BY CardNo ORDER BY DateTimeProcessed) AS LagValue FROM @mockTbl ) SELECT * , IIF(DATEDIFF(second, LagValue, DateTimeProcessed) <= 10, 1, 0) AS [duplicate] FROM rs ORDER BY id, DateTimeProcessed;
Output:ID CardNo DateTimeProcessed LagValue duplicate 1 100 2019-09-12 09:00:00.000 NULL 0 2 152 2019-09-12 09:10:00.000 NULL 0 3 152 2019-09-12 09:10:10.000 2019-09-12 09:10:00.000 1 4 160 2019-09-12 09:12:10.000 NULL 0
- Edited by Yitzhak Khabinsky Thursday, September 12, 2019 5:05 AM
- Marked as answer by GoalMaker Thursday, September 12, 2019 5:24 AM
All replies
-
-
It isn't a complex issue. Records might look like the following:
As each record is read the Card No and Time are stored.
- Read Record 1 - first record, no duplicate.
- Read Record 2 - different Card No to Record 1 - no duplicate
- Read Record 3 - same Card No as Record 2, time is within 10 seconds of Record 2 - flag as duplicate
- Read Record 4 - different Card No to Record 3 - no duplicate
Regards Kevin Seerup
GoalMaker Software Solutions Pty Ltd
www.goalmaker.com.au- Edited by GoalMaker Thursday, September 12, 2019 4:15 AM
-
-
-
Hi Kevin,
You can try to use LAG() and DATEDIFF() functions in the source query to identify time deltas for the same CardNo. This way you won't need a Script Component in the SSIS Data Flow.
Check it out:
-- DDL and sample data population, start DECLARE @mockTbl TABLE ( ID INT IDENTITY(1,1) PRIMARY KEY , [CardNo] VARCHAR(30) , [DateTimeProcessed] DATETIME ); INSERT INTO @mockTbl (CardNo, DateTimeProcessed) VALUES ('100', '2019-09-12 09:00:00') ,('152', '2019-09-12 09:10:00') ,('152', '2019-09-12 09:10:10') ,('160', '2019-09-12 09:12:10'); -- DDL and sample data population, end ;WITH rs AS ( SELECT * , LAG(DateTimeProcessed, 1) OVER (PARTITION BY CardNo ORDER BY DateTimeProcessed) AS LagValue FROM @mockTbl ) SELECT * , IIF(DATEDIFF(second, LagValue, DateTimeProcessed) <= 10, 1, 0) AS [duplicate] FROM rs ORDER BY id, DateTimeProcessed;
Output:ID CardNo DateTimeProcessed LagValue duplicate 1 100 2019-09-12 09:00:00.000 NULL 0 2 152 2019-09-12 09:10:00.000 NULL 0 3 152 2019-09-12 09:10:10.000 2019-09-12 09:10:00.000 1 4 160 2019-09-12 09:12:10.000 NULL 0
- Edited by Yitzhak Khabinsky Thursday, September 12, 2019 5:05 AM
- Marked as answer by GoalMaker Thursday, September 12, 2019 5:24 AM
-