MS SQL Server 2012
In SQL server, few specific data handling scenarios require Unicode source data to be moved to non-Unicode destination format like ASCII
format. During data conversion, it can be observed that the Unicode characters getting converted to ‘?’ instead of retaining its Unicode data meaning in the destination.
Unicode data in SQL Server or flat file source if ported to SQL server table in a VARCHAR column, using SSIS, BCP or any other mean will
show similar behavior.
MS SQL Server by default supports
Code Page 1252 (ISO Character Set - SQL_Latin1_General_CP1_CI_AS) but it differs in how it implements other code pages related to other ISO standards. Below query will give details on
all the supported code pages in SQL Server...
Little tricky part is SQL Server after 2005 to SQL Server 2008 R2 does not support code page 65001 (UTF-8 encoding) and making it difficult
to move data between UTF16 – UTF8 encoding without losing data. Given above using SQL Server versions mentioned above Due to this limitation, it is impossible to port Unicode data from NVARCHAR column to VARCHAR column without losing special characters.
Good news is SQL Server 2012 supports porting of data from UTF16 – UTF 8 encoding. i.e. it is possible to export Unicode data to file
with code page 65001 (Down convert) and also import back in to column with VARCHAR data type in SQL table (Up convert). This can be achieved by first exporting Unicode data to a file with code page 65001 (Down convert) and then import back in to VARXHAR column
in SQL server table(Up convert)
In SQL Server 2012 there is a support for code page 65001, so one can use import export wizard quickly to export data from SQL table to
non-Unicode format (also can save resulting SSIS package for further use) and import that back to SQL Server table in table with VARCHAR column. Only difference is the data in VARCHAR column will look like garbage because of native data character set conversion,
but will retain original value based on collation setting for the object.
To obtain the original Unicode value one can up convert this VARCHAR data back into NVARCHAR data using specific code page and importing
exported flat file back into Unicode column.
Use SQL import export wizard (?) to export Unicode data to UTF8, using code page 65001 to flat file destination
2. Create new table with similar schema but replace Unicode column data type with non-Unicode data column (i.e. separate table with VARCHAR column for NVARCHAR column data)
3. Import the exported flat file in to new table and it should be successful but original Data will look different like shown below based on the column collation.
4. If there is a need to up convert the data from UTF-8 to UTF -16 i.e. from VARCHAR to NVARCHAR back to original value , just need to export it back again to flat file and import
again in NARCHAR column and it will retain the original value
5. If data is being used through the web application then up conversion can be handled by client application using appropriate encoding.