none
SSIS - Flat File Connector Not Recognizing CRLF in Header RRS feed

  • Question

  • Good afternoon.  I could use another brain on the below issue:

    I need to import a file with 4 different data types in it. I'm working just on the second data type as I was able to get the first one to work.  
    The second data type is missing the first record after importing.

    I found that in the transition from the first data type, the header record is not being read correctly.  I use a conditional split to determine which records to import.  It checks the value in the 2nd and 3rd column.  If a record matches a the required condition, they are included.

    This doesn't work for the first record because the Header's line ending is not caught (CRLF). It is missed and the values in the 2nd and 3rd columns that should come from the record are from the header, as it overflowed into the first record.  
    The CRLF is caught for the second record.  But the data is kicked over for the first record.  I've tried several settings and combinations of settings,

    All of the other headers seem to be recognized. And they all seem to be there (in Notepad++), even the one header in question. I suspect the file, I've taken the CRLF out and added it back (in Notepad ++) in a new file, no change.  I don't see any other special characters in the file.

    The Set Up:
    SSIS on VS Enterprise 2017
    Flat File Connector:
     - Ragged-Right
     - Header row delimiter: CRLF
     - Header rows to skip: 1 ,but I've tried 0
     - Column names in the first data row: not checked 'cause there aren't
     - Code page: ANSI 1252
     - Unicode: not checked
     - Advanced tab:
       o All columns are spelled out according the the required file layout

       o Added the last column with a Column delimiter of CRLF
         I find that there is data in this column even though the Input/Output width is 0.

      o Header delimiter: CRLF


    Any ideas on what the issue might be?  

    Thanks,
    Rich 

     
    • Edited by RichS86 Friday, June 14, 2019 4:30 PM More information
    Friday, June 14, 2019 4:26 PM

Answers

All replies

  • Hi Rich,

    Could you please share some example data?

    Best Regards,

    Mona 


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Monday, June 17, 2019 7:34 AM
  • Thanks for your response.  Below is a description of what is happening and what should be happening.  “991” is the footer record from the previous data type.  I’m wanting to import all records that have a “02” in the first 2 positions and a “4” in the 3<sup>rd</sup>.

     

    What it should see and evaluate:

     

    991{CR}{LF}

    0140011082019{CR}{LF}

    02410108XXXX3 XXXXXXXXXJYOTXXXXX    -- 55chrs --       DXXXX  {CR}{LF}

     

     

    What it sees or is importing (header record and the first record on one line.  The first record is missed because the “02” and “4” are missed entirely:

     

    99110108XXXX3 XXXXXXXXXJYOTXXXXX    -- 55chrs --       DXXXX  {CR}{LF}

     

     

    What it’s missing:

     

    0140011082019

    024 

     

    (the “02” and “4” are missed along with other header data. The “02” and “4” are the fields that are used for the conditional split.)

    Monday, June 17, 2019 4:32 PM
  • Hi RichS86,

    • It is not clear what is the file structure from your data sample.
    • Is it possible to convert your file into XML format?
      All the problems you are facing will be gone.
    Monday, June 17, 2019 6:13 PM
  • Don't use "Ragged-Right" it is not such a file, use regular delimited

    Arthur

    MyBlog


    Twitter

    Monday, June 17, 2019 6:30 PM
    Moderator
  • Thanks, Yitzhak.  The file comes in from the outside as a text file.  I don't think it is in a format that I can convert it to XML. I would have to do it programmatically.  I'll look into it.
    Monday, June 17, 2019 9:38 PM
  • Thanks for your response Arthur.  The file has 4 different data types.  Each of the 4 has a different record length.  And there are no delimiters.  The format of the file is set by an outside authority and I can not change it.  50 different organizations are required to use the prescribed format so changing it is not an option.

    I've tried using a fixed width format, but other data types with wider record lengths caused an issue. 

    Should I pursue the fixed width solution further even though the file has different record lengths?  For this data flow, I'm only interested in the one record type.  I can deal with the others in a data flow later in the package.

    My issue with ragged-right seems to involve how I'm handling the last column which is where the CRLF for each row is located.  For the last column, the InputColumnWidth is 0 (since it is variable and uses the CRLF to set the length) and I'm setting the OutputColumnWidth to 3 even though the actual length of the data coming in is 1. Because I saw another post that recommended this.

    Sorry I don't have more information.  This issue doesn't really seem to be that complex; more like I'm missing a simple format item.

    Thanks.

    Monday, June 17, 2019 9:53 PM
  • Hi RichS86,

    "...The file has 4 different data types.  Each of the 4 has a different record length.  And there are no delimiters..."

    It is clear that your file is not a standard rectangular *.csv file with delimiters, separators, and line breaks.

    Please share your file sample and details about its structure so we can come up with some solution for your scenario.

    Tuesday, June 18, 2019 1:15 AM
  • In your post above

    991{CR}{LF}

    0140011082019{CR}{LF}

    CRLFs are the row delimiters that would be enough to set this file to be treated as comprised of one column.

    Next stage is to split the rows on the CRLFs using the Conditional Split as you attempted.

    Once the file is split into 4 files, process each separately.


    Arthur

    MyBlog


    Twitter

    Tuesday, June 18, 2019 1:58 AM
    Moderator
  • Thanks all for your assistance.  Below is an example of the records for this particular data type.  Some data has been obfuscated with "X"s.  
    The first row is a header. The last row is a footer. To the far left on the 2nd row the "02" is the record type ("02" = data record). The 3rd char is the data type "4", to distinguish it from the 3 other data types.  The next 10 chars are an ID value.  
    The person's name begins in position 24.  You can probably pick out the middle initial and the last name.  Everything else is just flags indicating certain statuses.  Each of these columns is consumed as a string "DT_STR".  There is no data conversion.
    I define the columns in the Flat File Connector in the Advanced tab, setting the name, InputColumnWidth and OutputColumnWidth of each.  The exception to this is the last column (U's and N's below).  It is one character in length the input is 0 and cannot be changed. It's greyed out because the connector will use the CRLF as the end.  All DataTypes are DT_STR.

    0140011082019  (<--- Header)
    02410108XXXXX XXXXXX705XXXXXXHKA                                              XXXTA                                 1008XXXX       WY17Y        032271000708031UUUUUU
    02410107XXXXX XXXXXX322XXXXXXU                                                XXXN                                  2001XXXX       WN   05141983065130500305034YNNNNN
    02410105XXXXX XXXXXX860XXXXXX                                                 XXXNG                                 2007XXXX       WY17Y        032011001202031UUUUUU
    02410108XXXXX XXXXXX961XXXXXXA                            E.                  XXXINE                                2001XXXX       WN   01041996065990500000034NNNNYN
    02410106XXXXX XXXXXX017XXXXXXLEY                          J.                  XXXLONG                               1007XXXX       WN07Y        032501001409031NNNNYN
    02410103XXXXX XXXXXX877XXXXXX                             A.                  XXXEY                                 2003XXXX       WN   03061990994991000000036NNNNYN
    02410107XXXXX XXXXXX342XXXXXX                                                 XXXPA                                 2004XXXX       WY   05241972994991000000036UUUUUU
    02410044XXXXX XXXXXX227XXXXXX                             L.                  XXXCHRIST                             1009XXXX       WN05N        043991000000031NNNNYN
    02410002XXXXX XXXXXX577XXXXXX                             R.                  XXXRK                                 2015XXXX       WN17Y        011991000000031NNNNYN
    02410003XXXXX XXXXXX654XXXXXX                             T.                  XXXDE                                 1004XXXX       WN17Y        043091001510031NNNNYN
    02410019XXXXX XXXXXX523XXXXXXRA                           K.                  XXX                                   1005XXXX       WN07N        043011001810031NNNNYN
    02410095XXXXX XXXXXX715XXXXXXLE                           L.                  XXXLEY                                1006XXXX       WN07N        043191001110031NNNNYN
    02410106XXXXX XXXXXX241XXXXXXY                            J.                  XXXDOON                               1012XXXX       WN18Y        021141000802531NNNNYN
    994   (<--- Footer)



    Thanks again for your time an patience.
    Rich
    • Edited by RichS86 Tuesday, June 18, 2019 3:12 PM
    Tuesday, June 18, 2019 3:10 PM
    • Marked as answer by RichS86 Tuesday, June 18, 2019 4:39 PM
    Tuesday, June 18, 2019 3:51 PM
    Moderator
  • I know.  There is nothing to it.  I wish all of our files where this straight forward.  I have it working now.  These are the only steps that I took:  Open the file connector, delete the Header row delimiter, left it blank, saved and closed.  Opened it back up, saw that it had defaulted to CRLF, I changed the header rows to skip to 1.  That's it.

    I've been doing ETL since DTS and there have been 1 (or maybe 2) cases where I'd get into something like this that had unexplained results.  One process I completely rewrote after having several team members proof it for me.  And it worked. 

    Not sure about this one, but it seems deleting the Header Row Delimiter in the connection caused the package to re-initialize this value.  Perhaps it somehow generated some malformed XML in the underlying code before.

    Yitzhak and Arthur, I appreciate your responses and will mark you both as Answered if it will let me.

     


    Tuesday, June 18, 2019 4:39 PM