none
Parsing data with SSIS package

    Question

  • I want to load data from a spreadsheet into a sql table.  One of the ten columns in the xls contains multiple bits of data. I want to parse the multiple bits into two columns in a table.  For example:

    - from the xls I want 2013-02-06  3:00 PM - 2013-02-06 3:00 PM

    - to end up as rows in two separate columns in a table 2013-02-06 3:00 PM  and 2013-02-06 3:00 PM

    How can I do this with an SSIS package?

    Thursday, March 28, 2013 9:10 PM

Answers

  • You can use a Derived Column transformation with two columns output.  The expressions in each is the following:

     LEFT( [Col] ,  FINDSTRING( [Col],"-" , 3) - 1 )
    TRIM(RIGHT( @[User::Variable] , LEN( @[User::Variable]) -  FINDSTRING( @[User::Variable],"-" , 3) )    )


    Russel Loski, MCT, MCSA SQL Server 2012, 2008, MCITP Business Intelligence Developer and Database Developer 2008 Twitter: @sqlmovers; blog: www.sqlmovers.com

    Thursday, March 28, 2013 10:24 PM
  • Thanks for all the help, Russ.  I have this bit working now. All the type casting was getting terribly confusing, so what I ended up doing was as follows:

    - in one derived column transformation:

    - ParsedStartTime = (DT_DBTIMESTAMP)SUBSTRING(Time,1,FINDSTRING(Time,"-",3) - 1)

    - ParsedEndTime = (DT_DBTIMESTAMP)RIGHT(Time,FINDSTRING(Time,"-",3) - 2)

    - CalculatedStartTime = DATEADD("mi",-((DT_I8)[Setup T]),(DT_DBTIMESTAMP)SUBSTRING(Time,1,FINDSTRING(Time,"-",3) - 1))

    • Marked as answer by STARgeek Friday, April 19, 2013 5:29 PM
    Friday, April 19, 2013 5:29 PM

All replies

  • You can use a Derived Column transformation with two columns output.  The expressions in each is the following:

     LEFT( [Col] ,  FINDSTRING( [Col],"-" , 3) - 1 )
    TRIM(RIGHT( @[User::Variable] , LEN( @[User::Variable]) -  FINDSTRING( @[User::Variable],"-" , 3) )    )


    Russel Loski, MCT, MCSA SQL Server 2012, 2008, MCITP Business Intelligence Developer and Database Developer 2008 Twitter: @sqlmovers; blog: www.sqlmovers.com

    Thursday, March 28, 2013 10:24 PM
  • Thanks, Russ, for the reply and please have mercy on a noob.  In the Derived Column Editor I have two columns:  1. ParsedStartTime and 2.ParsedEndTime for Derived Column Name; both have <add new> as Derived Column and both will use a column named Time from the xls sheet.  I am having trouble getting the expresion bit to work.  Can you help with how the expression for each column needs to look?
    Friday, April 12, 2013 4:34 PM
  • What is your expression so far?  And what are some examples of the string that you are getting from your Excel spreadsheet?

    Russel Loski, MCT, MCSA SQL Server 2012, 2008, MCITP Business Intelligence Developer and Database Developer 2008 Twitter: @sqlmovers; blog: www.sqlmovers.com

    Friday, April 12, 2013 4:36 PM
  • The expression is:  LEFT( [Time] ,  FINDSTRING( [Time],"-" , 3) - 1 )

    A mouse-over on the expression field tells me that "The function 'LEFT' was not recognized and "LEFT" is not a string fuction choice in the expression builder?

    The sample values in the xls column Time are: 

    2013-02-06  3:00 PM - 2013-02-06 3:00 PM and 2013-02-06 10:30 AM - 2013-02-06 11:30 AM

    Friday, April 12, 2013 4:59 PM
  • And I can get the editor to accept the expression

    RIGHT( [Time] ,  FINDSTRING( [Time],"-" , 3) + 2 )

    for the derived field ParsedEndTime

    Friday, April 12, 2013 5:05 PM
  • Which version of SSIS are you using?

    Russel Loski, MCT, MCSA SQL Server 2012, 2008, MCITP Business Intelligence Developer and Database Developer 2008 Twitter: @sqlmovers; blog: www.sqlmovers.com

    Friday, April 12, 2013 5:24 PM
  • I see the problem.  Microsoft added the LEFT function in SSIS 2012.

    Try:

    SUBSTRING( [Time] , 1,  FINDSTRING( [Time],"-" , 3) - 1)


    Russel Loski, MCT, MCSA SQL Server 2012, 2008, MCITP Business Intelligence Developer and Database Developer 2008 Twitter: @sqlmovers; blog: www.sqlmovers.com

    Friday, April 12, 2013 5:26 PM
  • We have a good bingo!

    Thanks, Russ.  When I grow up, I want to be just like you!

    Friday, April 12, 2013 5:39 PM
  • Hey, Russ.  This is my first kick at an SSIS package, so feel free to tell me when (not if) I get to be a pain in the arse.  I have a couple more bits that i cannot resolve from documentation:

    1.  How and where do I add a WHERE clause to filter the data coming from the XLS?

    2.  Is there a way to drop the reulting table and recreate it each time I run the package?

    2.  My derived columns show as type unicode string [DT_WSTR] - how and where can I convert them to DateTime so that I might further manipulate them later on (and outout them as part of an .XML file for uploading into another application)

     

    Friday, April 12, 2013 6:27 PM
  • Thanks for all the help, Russ.  I have this bit working now. All the type casting was getting terribly confusing, so what I ended up doing was as follows:

    - in one derived column transformation:

    - ParsedStartTime = (DT_DBTIMESTAMP)SUBSTRING(Time,1,FINDSTRING(Time,"-",3) - 1)

    - ParsedEndTime = (DT_DBTIMESTAMP)RIGHT(Time,FINDSTRING(Time,"-",3) - 2)

    - CalculatedStartTime = DATEADD("mi",-((DT_I8)[Setup T]),(DT_DBTIMESTAMP)SUBSTRING(Time,1,FINDSTRING(Time,"-",3) - 1))

    • Marked as answer by STARgeek Friday, April 19, 2013 5:29 PM
    Friday, April 19, 2013 5:29 PM