compare unicode string wrong?

  • Question

  • You can try:

    select N'Trần Văn Đang' as column1, N'Trần Văn Ðang' column2 into #Test;

    select * from #Test where column1=N'Trần Văn Đang'; -- returns 1 row

    select * from #Test where column2=N'Trần Văn Đang'; -- returns no rows

    I have found that it fails at the 'Đ' character.

    Can someone explain why, and show me how to compare these strings?

    Wednesday, August 1, 2012 11:04 AM

Answers

  • The comparison must fail because the two strings are not identical:

    SELECT  * ,
            CAST(column1 AS VARBINARY(MAX)) ,
            CAST(column2 AS VARBINARY(MAX))
    FROM    #Test

    Use an appropriate collation:

    SELECT  *
    FROM    #Test
    WHERE   column1 = column2 COLLATE Thai_CI_AI;
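
    Applied to the literal comparison from the question, the same idea might look like the sketch below. It assumes the #Test table from the question still exists in the session and that Thai_CI_AI treats the two characters as equal, as the query above implies. The second query uses Latin1_General_BIN2, a binary collation chosen here only for contrast: it matches only when the code points are identical.

```sql
-- Accent-insensitive comparison against the literal from the question.
SELECT  *
FROM    #Test
WHERE   column2 = N'Trần Văn Đang' COLLATE Thai_CI_AI;

-- Binary comparison for contrast: a row is returned only if
-- column1 and column2 contain exactly the same code points.
SELECT  *
FROM    #Test
WHERE   column1 = column2 COLLATE Latin1_General_BIN2;
```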
    

    • Marked as answer by ProgrammerVN Wednesday, August 1, 2012 12:31 PM
    • Unmarked as answer by ProgrammerVN Wednesday, August 1, 2012 12:31 PM
    • Marked as answer by ProgrammerVN Wednesday, August 1, 2012 4:32 PM
    Wednesday, August 1, 2012 11:15 AM

All replies

  • Hello,

    You can check this with the UNICODE function; they are different characters:

    SELECT UNICODE (N'Đ')  -- Returns 272
    SELECT UNICODE (N'Ð')  -- Returns 208


    Olaf Helper

    Wednesday, August 1, 2012 11:16 AM
  • It fails at the 'ầ' character:

    SELECT ASCII ('ầ')  -->  226

    SELECT ASCII ('ầ')  -->  63


    Wednesday, August 1, 2012 11:17 AM
  • The ầ in column 1 seems to be composed of an â and a ` (226 and 96), whereas the one in column 2 comes back as ? (63).

    If you look up the ASCII function in BOL, it includes the example below: a script that lists the ASCII value of every character in a given string. This can sometimes be quite handy in cases like this. You can also use it with the UNICODE function if you want to look at the Unicode code points instead.

    Here's the script you can find in BOL -

    -- Walk the string one character at a time and show each ASCII code.
    DECLARE @position int, @string nvarchar(150)
    -- Initialize the variables. The N prefix keeps the literal as Unicode.
    SET @position = 1
    SET @string = N'Trần Văn Đang'
    WHILE @position <= LEN(@string)
    BEGIN
        SELECT ASCII(SUBSTRING(@string, @position, 1)),
               CHAR(ASCII(SUBSTRING(@string, @position, 1)))
        SET @position = @position + 1
    END
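
    Following the suggestion above to swap in UNICODE, a minimal sketch could compare the two columns position by position. It assumes the #Test table from the question still exists in the session and holds a single row, and that the server is SQL Server 2008 or later (for DECLARE with an initial value). The variable names and column aliases are made up for illustration.

```sql
-- Sketch: show the Unicode code point of each character in both columns.
DECLARE @a nvarchar(150), @b nvarchar(150);
DECLARE @position int = 1;

-- Read the two values from the question's temp table (assumed single row).
SELECT @a = column1, @b = column2 FROM #Test;

WHILE @position <= LEN(@a) OR @position <= LEN(@b)
BEGIN
    SELECT @position                              AS position,
           SUBSTRING(@a, @position, 1)            AS char_a,
           UNICODE(SUBSTRING(@a, @position, 1))   AS code_a,
           SUBSTRING(@b, @position, 1)            AS char_b,
           UNICODE(SUBSTRING(@b, @position, 1))   AS code_b,
           -- Flag positions where the code points differ (or one string has ended).
           CASE WHEN UNICODE(SUBSTRING(@a, @position, 1))
                   = UNICODE(SUBSTRING(@b, @position, 1))
                THEN N''
                ELSE N'<-- differs'
           END                                    AS note;
    SET @position = @position + 1;
END
```

    Any position flagged in the note column is a place where the two strings hold different code points, even if the characters look the same on screen.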


    Steen Schlüter Persson (DK)


    Wednesday, August 1, 2012 11:39 AM