none
Ideal collationsfor a DB working with several-language inputs and also some binary-stored data

    Question

  • Hello,

    we are building a DB which is English by default but

    should be used in several local environments i.e. Chinese or Swedish customers should be able to enter data and view them with relevant characters in all levels like queries and views, expotred views etc.

    Not only unicode 16 and non-unicode characters but also mapping information should be stored, like route string for a photo or locations of a building map etc.. (minority of the count of the user funcionalities I think)

    Trying to find the appropriate collation to show & store all characters authentically and to require least collation coding on join and variable level I concluded the below: 

    The most covering and economical solution is if 

    I use collation Latin1_General100_CI_AS_KS_WS_SC as a default / server collation and would use 

    Latin1_General100_BIN2 only in the columns / procedures / statements / joins where this is relevant. 

    As I am a beginner and not 100% sure if I understand all the terminology (e.g. locale) and aspects which interfere, 

    As I am a beginner I would appreciate if you please confirm the correctness of my conclusion - and also add checkpoints / aspects I might have missed. 

    Thank you, regards: N. 

    Monday, November 04, 2013 8:45 PM

Answers

All replies

  • I use collation Latin1_General100_CI_AS_KS_WS_SC as a default / server collation and would use 

    Latin1_General100_BIN2 only in the columns / procedures / statements / joins where this is relevant. 

    Then just make the server collation Latin1_General100_BIN2 also.

    Collation is a column level property. If the same for the columns, you don't need to use the COLLATE clause in JOINs.

    Chinese letter storage example:

    http://www.sqlusa.com/bestpractices2005/chinesecharacters/

    For geography questions visit the SQL Server Spatial forum.


    Kalman Toth Database & OLAP Architect IPAD SELECT Query Video Tutorial 3.5 Hours
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012


    • Marked as answer by NunoGomez Tuesday, November 05, 2013 1:11 PM
    Tuesday, November 05, 2013 9:42 AM
    Moderator
  • Thank you for your prompt answer!

    Also if I may. It is nice to meet a Hungarian expert  - I am also Hungarian - I have happily used a few tutorials of yours in youtube too. 

    Would you please explain a bit more detailed why you propose the above. As this is so fundamental a setup I'd like to well understand why this is best.

    To me it seems to be a contradiction to use Latin1_General100_BIN2 as a server default collation, while, as I wrote it is the minority of the data and the functions which should store / treat binary data.

    Should I still make BIN2 server default? Why? Because if I collate the relevant columns Latin1_General100_CI_AS_KS_WS_SC,  and then

    the server default collation Latin1_General100_BIN2 and

    the all tables' level collations Latin1_General100_BIN2 and

    the column level collations Latin1_General100_CI_AS_KS_WS_SC 

    will resolve the conversion issues "automatically" i.e. without adding the COLLATE clauses where the 2 collations meet? 

    Thank you very much for your help. Sincerely: N.

    Tuesday, November 05, 2013 11:06 AM
  • >the minority of the data and the functions which should store / treat binary data.

    You need to think about the pros & cons of mixed collation database.

    The downside: you create collation confusion for the life of the database. Nothing more frustrating to database developers in T-SQL programming than mixed collations.

    The upside: if you really need mixed collation for some reason, SS supports it.

    There is a big difference between changing the collation of some columns (pretty invisible) or adding the COLLATE clause (visible) to some operations. The latter is pretty benign since it is part of the query.

    Collation article:

    http://www.sqlusa.com/bestpractices2005/collatedatabasedefault/


    Kalman Toth Database & OLAP Architect IPAD SELECT Query Video Tutorial 3.5 Hours
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012





    Tuesday, November 05, 2013 11:56 AM
    Moderator
  • Thank you very much!
    Tuesday, November 05, 2013 1:12 PM