For instance, most Web applications store Unicode data in the UTF-8 format. UTF-8 stores all 7-bit ASCII characters as is and then uses special control characters to store the rest of the Unicode set. SQL Server, on the other hand, uses UCS-2 or UTF-16 Unicode format, which mimics how the 32-bit Windows kernel itself stores information. (This way data doesn't have be to be converted back and forth to another format and performance is enhanced.) If UTF-16 data in the database is retrieved and misinterpreted as UTF-8, the result is usually gibberish. If this mangled data is reinserted into the database, the data will most likely be ruined.
To avoid such problems you must do two things.
1. Set the codepage to 65001 for any pages that contain data retrieved from the server. This will automatically convert any UCS-2/UTF-16 data to UTF-8 format when the page is rendered and sent to the client.
2. Also set the codepage to 65001 for any data sent from a Web page, since the Web server will also automatically convert UTF-8 data sent to SQL Server into UCS-2.
Note that a database won't explicitly support Unicode data unless
Requires Free Membership to View
About the author: Serdar Yegulalp is editor of the Windows Power Users Newsletter. Check it out for the latest advice and musings on the world of Windows network administrators -- and please share your thoughts as well!
More information from SearchSQLServer.com
This was first published in August 2005

Join the conversationComment
Share
Comments
Results
Contribute to the conversation