none
Script Component - Set Operation on Recordset to Output Flattened Data with Fewer Rows RRS feed

  • Question

  • In a Data Flow Task, I am using a third-party component to produce a Recordset which I am saving to an object variable.  The Recordset looks like this (there can be up to 3 rows per ID):

    ID    ColA1    ColB1    ColC1

    101   a1        b1        c1

    101   a2        b2        c2

    101   a3        b3        c3

    102   A1        B1        C1

    102   A2        B2        C2

    The data must be flattened to:

    ID    ColA1    ColB1    ColC1    ColA2    ColB2    ColC2    ColA3    ColB3    ColC3

    101   a1         b1        c1          a2        b2         c2         a3         b3         c3

    102   A1         B1        C1         A2        B2         C2

    I have a simple SQL SELECT statement to flatten the data (it's a set operation).  Can I use the Script Component inside a Data Flow Task to use this SELECT statement to process the incoming Recordset as a set operation (instead of row-by-row) in order to output the flattened Recordset?  From what I'm reading, it appears that the Asynchronous Script Component Transformation processes row-by-row.

    Thanks,

    JP

    Monday, December 2, 2019 11:05 PM

All replies

  • Hi JP,

    We could use OLE DB Command Transformation in data flow task.

    Please refer to Update Data Using OLEDB Command Transformation in SSIS .

    Best Regards,

    Mona


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Tuesday, December 3, 2019 2:26 AM
  • Hi, Mona.  I believe the OLE DB Command operates row-by-row also, though, unless I am mistaken?
    Tuesday, December 3, 2019 2:44 PM
  • Hi jkproctor,

    You are right on target that Mona's suggestion will not work for you.

    What is your real data source that "...a third-party component..." getting data from?

    Tuesday, December 3, 2019 3:02 PM
  • We are using CozyRoc's EDI Source component to parse through an EDI file and produce several outputs that then get merged back into a flat file.  One of the outputs needs to be flattened as described above before merging into the larger flat file.  I figured there would be a relatively easy way in SSIS to use the recordset output by EDI Source and run a SQL SELECT command against it to transform it from X rows and Y columns to (roughly) X/3 rows and 3Y columns.  The SQL is written and ready, but it's a single SELECT statement which runs as a set operation, aggregating the entire recordset at once, not operating on each row individually.  So far, I have not found an easy way to do this.  I could save the data into a table in a db and run the SQL when pulling the data back out as a source for SSIS, but I'd rather avoid involving the db and just handle everything in SSIS if possible.

    Thanks,

    JP

    Tuesday, December 3, 2019 3:48 PM
  • Hi JP,

    Thanks for sharing all the  details.

    SSIS has PIVOT Transformation out-of-the-box. It seems it could be useful for your scenario.

    Check it out here: Pivot Transformation in SSIS

    Tuesday, December 3, 2019 4:12 PM
  • I looked into the Pivot Transform, but unfortunately this isn't a traditional pivot.  The Pivot Transform is looking in a particular column (e.g. Year) for specific values (2015, 2016, 2017, 2018), and these values will become new columns.  In our data, the set of original columns will be repeated thrice, the first set of columns will be populated by the first instance of an ID, the second set of columns by the second instance (if it exists) of that ID, and the third set of columns by the third instance (if it exists).  The SQL which transforms the data is (UniqueID is another column in the data which is unique for every row):

    ;with CTE as (
    select ID
    ,RANK() OVER (PARTITION BY ID ORDER BY UniqueID) AS ColSeq
    ,ColA1
    ,ColB1
    ,ColC1
    from CRC)
    ,CTE2 as (
    select ID
    ,case when ColSeq = 1 then ColA1 end as ColA1
    ,case when ColSeq = 1 then ColB1 end as ColB1
    ,case when ColSeq = 1 then ColC1 end as ColC1
    ,case when ColSeq = 2 then ColA1 end as ColA2
    ,case when ColSeq = 2 then ColB1 end as ColB2
    ,case when ColSeq = 2 then ColC1 end as ColC2
    ,case when ColSeq = 3 then ColA1 end as ColA3
    ,case when ColSeq = 3 then ColB1 end as ColB3
    ,case when ColSeq = 3 then ColC1 end as ColC3
    from CTE)
    select ID
    ,max(ColA1) as ColA1
    ,max(ColB1) as ColB1
    ,max(ColC1) as ColC1
    ,max(ColA2) as ColA2
    ,max(ColB2) as ColB2
    ,max(ColC2) as ColC2
    ,max(ColA3) as ColA3
    ,max(ColB3) as ColB3
    ,max(ColC3) as ColC3
    from CTE2
    group by ID;

    Thanks,

    JP


    Tuesday, December 3, 2019 4:41 PM
  • Hi JP,

    >> "...I figured there would be a relatively easy way in SSIS to use the recordset output by EDI Source and run a SQL SELECT command against it to transform it from X rows and Y columns to (roughly) X/3 rows and 3Y columns.  The SQL is written and ready, but it's a single SELECT statement which runs as a set operation, aggregating the entire recordset at once, not operating on each row individually..."

    SSIS can execute a SELECT statement just for the data stored in the database. 

    IMHO, you would need to load the initial data into a staging table, and just after that apply the T-SQL with CTEs. I don't see any other way.

    Tuesday, December 3, 2019 6:16 PM
  • I was slowly coming to the same conclusion.  Thanks for replying so quickly!
    Tuesday, December 3, 2019 8:38 PM