none
Solving collate conflicts for reading data from more databases - SQL Server 2014-2019 RRS feed

  • Question

  • Hi,

    I'm implementing a data mart reading data from more databases.

    For some implemented joins it occurs an error about the collate of the joined columns.

    So, I'm searching a general solution to solve collate conflicts regardless of the specific collate.

    I'm trying to use the COLLATE DATABASE_DEFAULT clause with success. Is this the best general solution about the collate conflicts or does it exist a better solution?

    Thanks

    Monday, November 18, 2019 4:20 PM

Answers

  • Unfortunately, changing the column collations across the board in a database can be quite a labourous task. If none of the columns in questions are indexed or part of keys, it is not too tricky. But if you have columns that are part of indexes or primary keys, it starts to get difficult, since you need to drop all indexes to be able to change the collation. This also applies to foreign key and CHECK constraints.

    And depending what collation you change to and from, you may have problems that when you restore an indxe or a constraint, the constraint is no longer valid or there suddenly duplicate keys in the indexes.

    It may be easier to script the database, do a find/replace in the script create a new database and copy dat aa over. You will still get problem if data does not pass the rules with the new collation, but at least you don't have drop and recreate things en masse.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Tuesday, November 19, 2019 10:35 PM

All replies

  • COLLATE DATABASE_DEFAULT saves you from hardcoding a collation. On the other hand, if you want the comparisons to be, say, case-insensitive, it may be better to say which collation you want.

    No matter what, you may face performance issue, since casting the collation means that any index on the column can not used for seeks.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Monday, November 18, 2019 11:01 PM
  • Hi pscorca,

    To solve collate conflicts , use the COLLATE DATABASE_DEFAULT clause  might be a good way. However , as Erland said that 'you may face performance issue', Please check this article How column COLLATION can affect SQL Server query performance.

    Also , there a similar issue . Please check SQL SERVER – Cannot resolve collation conflict for equal to operation.

    Best Regards,

    Rachel 


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Tuesday, November 19, 2019 3:17 AM
  • Hi,

    thanks for your replies.

    Where I can act, I'd like to change the collation for databases and tables.

    For a database e.g. I can use ALTER DATABASE my_database COLLATE Latin1_General_CI_AS, but for changing the collation for the all database tables is there a t-sql script, please?

    Thanks

    Tuesday, November 19, 2019 10:53 AM
  • Unfortunately, changing the column collations across the board in a database can be quite a labourous task. If none of the columns in questions are indexed or part of keys, it is not too tricky. But if you have columns that are part of indexes or primary keys, it starts to get difficult, since you need to drop all indexes to be able to change the collation. This also applies to foreign key and CHECK constraints.

    And depending what collation you change to and from, you may have problems that when you restore an indxe or a constraint, the constraint is no longer valid or there suddenly duplicate keys in the indexes.

    It may be easier to script the database, do a find/replace in the script create a new database and copy dat aa over. You will still get problem if data does not pass the rules with the new collation, but at least you don't have drop and recreate things en masse.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Tuesday, November 19, 2019 10:35 PM
  • Hi, I've found this collation change script that manages primary Keys, foreign keys and indexes. I've used it to change the SQL_Latin1_General_CP1_CI_AS collation of my database into the Latin1_General_CI_AS one.

    After the collation changing operation I've restored the original length of the nchar columns present in the tables of my database: this operation has doubled the original length of these fields.

    Wednesday, November 20, 2019 11:05 AM
  • That particular collation change is one of the least dangerous to do, at least for n(var)char. In fact, I believe that for Unicode data types, these two collations are identical, and if SQL Server would recognise that and treat them as equal a that would remove a lot of pain and misery related to collations, as this is a common collation conflict.

    What are  you saying about the nchar? Do you mean that Hugo's script inflated nchar(10) to nchar(20)?


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Wednesday, November 20, 2019 11:57 AM
  • Hi Erland,

    before the collation changing I've nchar(6) columns, nchar(2) columns and nchar(1) columns; after the collation changing I've respectively nchar(12) columns, nchar(4) columns and nchar(2) columns.

    No issues about nvarchar columns.

    Wednesday, November 20, 2019 12:31 PM
  • I found the issue. This line in the script:

    ELSE CASE WHEN ty.name = N'nvarchar'

    should read:

    ELSE CASE WHEN ty.name IN (N'nvarchar', N'nchar')

    (This is line 139 in Hugo's original script and line 150 in the version from Walter Charrière in the comments.)

    I've added a comment to the blog about this.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Wednesday, November 20, 2019 8:54 PM
  • Ok, thanks.

    Bye

    Thursday, November 21, 2019 7:58 AM