[Forum FAQ] The field terminator is displayed incorrectly in an XML format file when the bcp utility is given a terminator with a character value greater than 127 or less than 32

  • Question

    In SQL Server 2008 R2 and SQL Server 2012, when you use the bcp utility to generate an XML format file and type a field terminator with a character value greater than 127 or less than 32 on the command line, an unexpected terminator is written to the XML format file. For example, run the following command to generate an XML format file that uses the double dagger symbol ‡ (Alt+0135) as the field terminator:
    bcp  <Databasename>.<schemaname>.<tablename>  format nul -c -x -f "<path>...xml" -T -S <Servername> -t‡

    The generated format file looks like this:

     
    Notice that TERMINATOR="ç" (lowercase c-cedilla) is written to the bcp XML format file instead of TERMINATOR="‡". The ç is not arbitrary: its character value is Alt+135, not Alt+0135. However, if you generate a non-XML format file instead, the ‡ character is written to the format file correctly.
    Command line:
    bcp  <Databasename>.<schemaname>.<tablename>  format nul -f <filename>.Fmt -n -T -S <Servername> -t‡

    A non-XML format file:

    Analysis

    When you create an XML document, type in data, put a few tags around it, make the tags well-formed, and then try to load it, you may get an unexpected error message from the Microsoft® XML Parser (MSXML). As in the situation above, the platform that receives your data stores it differently from the platform that sent it, which results in character encoding problems.
    For example, when you type Alt+135 (ç) on the command line as the field terminator, the XML format file shows the terminator as Alt+231 (τ). Standard text formats are built on standard character sets. All computers store text as numbers, but different systems can store the same text using different numbers. The following table shows how a range of bytes is stored, first on a typical computer running Microsoft Windows® using the default code page 1252.
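    The two substitutions described above (‡ becoming ç, and ç becoming τ) match what happens when a byte written under code page 1252 is read back under an OEM code page. The following Python sketch reproduces that mapping. It assumes the OEM code page is 437, which the post does not state; the helper name reinterpret is hypothetical and is not part of bcp or SQL Server. It illustrates the character arithmetic only and is not a description of how bcp works internally.

```python
import unicodedata


def reinterpret(text: str, written_as: str = "cp1252", read_as: str = "cp437") -> str:
    """Encode text in one code page, then decode the same bytes in another.

    Assumption: cp437 stands in for the console's OEM code page.
    """
    return text.encode(written_as).decode(read_as)


for typed in ("‡", "ç"):
    raw = typed.encode("cp1252")      # the single byte stored under code page 1252
    seen = reinterpret(typed)         # the same byte read under code page 437
    print(f"{unicodedata.name(typed)} -> byte {raw[0]} -> {unicodedata.name(seen)}")

# DOUBLE DAGGER -> byte 135 -> LATIN SMALL LETTER C WITH CEDILLA
# LATIN SMALL LETTER C WITH CEDILLA -> byte 231 -> GREEK SMALL LETTER TAU
```

    Byte 135 is ‡ in code page 1252 but ç in code page 437, and byte 231 is ç in code page 1252 but τ in code page 437, which is consistent with the Alt+0135 versus Alt+135 behavior noted in the question.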

    Workaround

    Given the character encoding issue in the XML format file described above, you can use a bcp command to create a non-XML format file instead: specify the format argument, use nul instead of a data-file path, and use .Fmt as the file name extension. Alternatively, when you create an XML format file, avoid using a character with a value greater than 127 or less than 32 as the field terminator.
    Related Links
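    Before generating an XML format file, a candidate terminator can be checked against the range given in this post. The sketch below is only an illustration: is_safe_terminator is a hypothetical helper, and the 32 to 127 bounds are taken directly from the description above rather than from the bcp documentation.

```python
def is_safe_terminator(terminator: str) -> bool:
    """Return True if every character value is within 32..127 inclusive.

    The bounds come from this post: values greater than 127 or less
    than 32 are the ones reported to be written incorrectly.
    """
    return len(terminator) > 0 and all(32 <= ord(ch) <= 127 for ch in terminator)


for candidate in ("|", "~|~", "‡"):
    verdict = "ok" if is_safe_terminator(candidate) else "avoid in XML format files"
    print(ascii(candidate), verdict)
```

    Terminators that fail the check can still be used with a non-XML format file, as described above.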
    http://msdn.microsoft.com/en-us/library/aa468560.aspx#xmlencod_topic2
    http://technet.microsoft.com/en-us/library/ms162802.aspx
    http://technet.microsoft.com/en-us/library/ms191516.aspx

    Applies to

    SQL Server 2012
    SQL Server 2008 R2



    Monday, April 28, 2014 4:18 AM