none
Using OPENROWSET to scan through files

    질문

  • I have a whole directory of flat text files. The column delimiter is not a comma and I'm using SQL 2008 R2 in this instance. I want a quick way to look in the files and maybe even query max and min on columns to get an idea of what I am looking at. The files are fairly large, both in the number of columns and number of rows so Excel of CSVed isn't really practical. So I plumped for OPENROWSET using dynamic SQL. It worked with my example file but fails in practice. The file I am failing on has 314 columns (total row width is below 8000 but variable widths for each column).

    It works for my example so the principle is correct. So why the failure? Codepage issue as the delimiter in practice = ‡. Or is there a limit on the number of columns? 255 without XML, perhaps? The error I am getting is:

    Msg 4866, Level 16, State 8, Line 4

    The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.

    Msg 7301, Level 16, State 2, Line 4

    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".

    Code below:


    JCEH

    2018년 6월 12일 화요일 오전 10:40

모든 응답

  • Code is too long. What a PITA.


    JCEH

    2018년 6월 12일 화요일 오전 10:41
  • I know, here's the resultant bit of code that actually pull the contents of the file plus the result.

    --- Works!

    SELECT *

    FROM OPENROWSET (BULK 'D:\Temp\test.txt'

    ,FORMATFILE = 'D:\Temp\test.fmt'

    ,CODEPAGE = 'RAW',FIRSTROW=2

            ) AS t1;



    --- Results

    Column1 Column2 Column3

    a b c

    1 2 3

    And here's the contents of the format file generated by the script and used:

    9.0
    3
    1 SQLCHAR 0 0 "|" 1 [Column1] ""
    2 SQLCHAR 0 0 "|" 2 [Column2] Latin1_General_CI_AS
    3 SQLCHAR 0 0 "\r\n" 3 [Column3] ""


    JCEH

    2018년 6월 12일 화요일 오후 12:10
  • It works for my example so the principle is correct. So why the failure? Codepage issue as the delimiter in practice = ‡. Or is there a limit on the number of columns? 255 without XML, perhaps? The error I am getting is:

    Msg 4866, Level 16, State 8, Line 4

    The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.

    Msg 7301, Level 16, State 2, Line 4

    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".

    Code below:


    JCEH

    Hi JCEH,

    Per the error message above, Please try to replace the last ROWTERMINATOR with "0x0a" or other row terminators.

    Please see:

    “Column is too long” error with BULK INSERT

    Best Regards,

    Will


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    2018년 6월 13일 수요일 오전 8:55
  • Thanks, Will.

    I probably should mention I have already bulk inserted the file using \n as row terminator. Opening the file with Notepad++ I can see CR LF as the row terminator so \r\n should (!) be working.


    JCEH

    2018년 6월 13일 수요일 오전 11:15
  • Thanks, Will.

    I probably should mention I have already bulk inserted the file using \n as row terminator. Opening the file with Notepad++ I can see CR LF as the row terminator so \r\n should (!) be working.


    JCEH

    Hi JCEH,

    Per the error message above, it prompts us that the error is from the row1,column 1. So we could tell the issue is from the statement. Please try to remove the property "CODEPAGE","FIRSTROW" and see if it works.

    		SELECT *
    
    FROM OPENROWSET (BULK 'D:\Temp\test.txt'
    
    ,FORMATFILE = 'D:\Temp\test.fmt'
    
    --,CODEPAGE = 'RAW',FIRSTROW=2
    
            ) AS t1;

    Best Regards,

    Will


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    2018년 6월 15일 금요일 오전 7:27
  • You say in your first post that the delimiter is ‡, but then you have a format file with | as delimiter. How come?

    More importantly, if there are ‡ characters in the file, the files are Unicode files of some encoding and you need to change your command and format file to adapt to that. Exactly how depends on whether the encoding is UTF-8 or UTF-16.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    2018년 6월 15일 금요일 오전 8:29