none
Importing Flat file source with mixed date format to OLE DB Destination RRS feed

  • Question

  • Hi everyone, I'm working with a small dataset for my project at school. In the flat file source (.csv file), there are 2 different date columns, but both have the same issue: mixed date format (dd-mm-yy and mm/dd/yyyy to be precise). It will look something like this:

    |Order date|

    5/25/2014 (mdy format)

    25-5-14 (dmy format)

    When I open it with Excel, the mm/dd/yyyy data cell has been set to date format, while the dd-mm-yy is just general cell format

    (but I guess this doesn't affect much)

    My goal is to import them to a SQL table using Visual Studio SSIS, with some extra columns like week, quarter, or day, month, year seperately.  But the problem is, Visual studio SSIS doesn't detect the date format right, leading to wrong data in SQL table:

    25-05-14 (which is supposed to be 25th May 2014), but it turns out to be:

    14/5/2025 (I don't know what going on!)

    I wonder if there are any method to get these two date format into one, or at least something to make the import process detect the date data right (I would also like to keep the source file unchanged, I'm only allowed to work with this on Visual Studio). Any helps would be apppreciated. Thank you!


    • Edited by tamtn98 Monday, December 2, 2019 2:40 PM Website lag during edit
    Monday, December 2, 2019 2:37 PM

All replies

  • Are there only those two date formats? If so than in SSIS when you import the CSV file leave the date field as varchar. Add a conditional split task and test for the two digit year. Send this down one pipe with the 4 digit year down the other. Add derived column tasks to each pipe and convert the varchar dates to date fields. Add a union all task.

    Add a calendar table to your database to use to get the other fields. I would not add these to this table but that is up to you. If you want to add the fields to this table use a lookup task in SSIS to the calendar table.


    Monday, December 2, 2019 3:12 PM
  • Hi tamtn98,

    You need to treat that *.csv file column value as a string data and convert it into ISO 8601 format:

    yyyy-mm-dd. Useful link: ISO 8601 Date and time format

    SSIS has Derived Column Transformation. That's where you need to use SSIS Expression language functions:

    The end result shall be as follows:

    5/25/2014 (mdy format) => 2014-05-25
    25-5-14 (dmy format)   => 2014-05-25


    Monday, December 2, 2019 3:23 PM
  • I got your idea, but I still don't understand much about the conditional split task. Specifically, how do I make it be able to tell which row has the dash/two-digit year format and which row has the other kind?
    It would be great if you're able to give me a tutorial to execute this more clearly, since I got very few experience in this.

    I will try all available methods you guys gave me and reply later. Thank you!

    Monday, December 2, 2019 4:05 PM
  • Quite complicated but could be helpful, I will try to read the tutorials and work with it, and I will reply later with the result. Thank you so much!
    Monday, December 2, 2019 4:09 PM
  • In the conditional split task check the length of the string and if it is 8 then you have a two digit year.
    Monday, December 2, 2019 4:14 PM
  • Hi tamtn98,

    There is no need for the SSIS Conditional Split Task.

    You can use the following methods for the conditional expression (… ? ... : ...):

    Just follow my earlier reply verbatim.

    Monday, December 2, 2019 4:15 PM
  • Hi tamtn98,

    Using Conditional Split and Derived Column may be easy to understand.

    Conditional Split:

    Derived Column1:

    Order date == "" ? NULL(DT_DBDATE) : (DT_DBDATE)(TOKEN(Order date,"/",3)

    + "-" + TOKEN( Order date ,"/",1) + "-" + TOKEN(Order date,"/",2))

    Derived Column2:

    [Order date] == "" ? NULL(DT_DBDATE) : (DT_DBDATE)("20"+ TOKEN([Order date],"-",3)

    + "-" + TOKEN([Order date],"-",2) + "-" + TOKEN([Order date],"-",1))

    Result:

    Best Regards,

    Lily


     


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Tuesday, December 3, 2019 5:27 AM
  • Hi Lily,

    I considered to use the LEN() function too, but its use has a flaw for the current data set. It could produce the same numeric value 8 for both formats.

    For example:

    5/5/2014 (mdy format) => 2014-05-05, LEN('5/5/2014') = 8
    25-12-14 (dmy format)   => 2014-12-25, , LEN('25-12-14') = 8

    Tuesday, December 3, 2019 5:58 AM
  • Hi Yitzhak,

    You are right, the LEN() function is not comprehensive, so I tried the following way by using FINDSTRING function:

     

    Best Regards,

    Lily


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    • Proposed as answer by Lily Lii Thursday, December 5, 2019 1:54 AM
    Tuesday, December 3, 2019 8:15 AM