DATABASE MANAGEMENT AND ADMINISTRATION

Preserving Unicode data integrity


Serdar Yegulalp, Contributor
08.11.2005


Unicode is one of the most broadly accepted standards for storing text that mixes international character sets. SQL Server has supported it since version 7.0. However, if you are not careful about how data is sent to SQL Server, mixing Unicode's multiple encoding methods can lead to damaged or improperly stored data.

For instance, most Web applications handle Unicode data in the UTF-8 format. UTF-8 stores all 7-bit ASCII characters as is and encodes every other Unicode character as a sequence of two to four bytes. SQL Server, on the other hand, uses the UCS-2 or UTF-16 Unicode format, which mirrors how the 32-bit Windows kernel itself stores strings. (This way data doesn't have to be converted back and forth to another format, and performance is enhanced.) If UTF-16 data in the database is retrieved and misinterpreted as UTF-8, the result is usually gibberish. If this mangled data is then written back to the database, the original text will most likely be lost.
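The difference between the two encodings, and the kind of damage a misread causes, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample string and the function names are hypothetical, and no database or Web server is involved. Little-endian UTF-16 stands in for the server-side storage format.

```python
# Illustrative only: plain Python, no SQL Server or Web server involved.

SAMPLE = "Zürich 東京"  # hypothetical sample mixing ASCII, Latin and CJK text


def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


def utf16_length(ch: str) -> int:
    """Number of bytes little-endian UTF-16 needs for a single character."""
    return len(ch.encode("utf-16-le"))


def misread_utf16_as_utf8(text: str) -> str:
    """Encode text as UTF-16 (as stored), then wrongly decode it as UTF-8."""
    stored = text.encode("utf-16-le")
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8
    return stored.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for ch in ("A", "ü", "東"):
        print(repr(ch), "UTF-8 bytes:", utf8_length(ch),
              "UTF-16 bytes:", utf16_length(ch))
    print(repr(misread_utf16_as_utf8(SAMPLE)))
```

Even a plain ASCII letter does not survive the misread intact: it comes back followed by a NUL character, because UTF-16 stores it in two bytes. Accented and CJK characters are replaced outright, so writing the result back to the database cannot restore the original text.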

To avoid such problems, you must do two things.

1. Set the codepage to 65001 (the Windows code page identifier for UTF-8) for any pages that display data retrieved from the server. The UCS-2/UTF-16 data is then automatically converted to UTF-8 when the page is rendered and sent to the client.

2. Also set the codepage to 65001 for any page that accepts data sent from the client. The Web server then automatically converts the incoming UTF-8 data into UCS-2 before it is sent to SQL Server.
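The two conversions described in the steps above amount to decoding with one encoding and re-encoding with the other. The following Python sketch shows the idea only; it is not the Web server's actual implementation, and the function names are hypothetical. Little-endian UTF-16 is assumed for the database side.

```python
# A minimal sketch of the two conversions, assuming UTF-8 on the page side
# and little-endian UTF-16 on the database side. Function names are
# hypothetical and chosen for illustration.


def page_to_database(utf8_payload: bytes) -> bytes:
    """Convert bytes submitted by a UTF-8 page into UTF-16 for storage."""
    text = utf8_payload.decode("utf-8")
    return text.encode("utf-16-le")


def database_to_page(utf16_payload: bytes) -> bytes:
    """Convert UTF-16 bytes read from storage into UTF-8 for the page."""
    text = utf16_payload.decode("utf-16-le")
    return text.encode("utf-8")


if __name__ == "__main__":
    submitted = "Zürich 東京".encode("utf-8")
    stored = page_to_database(submitted)
    # The round trip is lossless when both sides agree on the encodings.
    print(database_to_page(stored) == submitted)
```

As long as each direction uses the matching pair of encodings, the round trip returns exactly the bytes that were submitted.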

Note that a database column will not hold Unicode data unless it is declared with a Unicode data type (nchar, nvarchar or ntext). If you use text instead of ntext, for instance, any Unicode characters sent to the database are converted to the column's non-Unicode code page (typically a Latin-1 variant such as ISO-8859-1), and characters that code page cannot represent are lost. Data stored this way should not be intermixed with true Unicode data, or it may be damaged.
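The loss that occurs when Unicode text is forced into a single-byte code page can be imitated in Python. This is a rough analogy rather than SQL Server's own conversion routine: the function name is hypothetical, and ISO-8859-1 (Python's "latin-1" codec) is assumed as the column's code page.

```python
# Rough analogy only: imitates storing text in a non-Unicode column by
# round-tripping it through a single-byte code page. The function name and
# the default code page are assumptions made for illustration.


def store_in_narrow_column(text: str, codepage: str = "latin-1") -> str:
    """Return the text as it would read back after single-byte storage."""
    # Characters the code page cannot represent are replaced with "?"
    stored = text.encode(codepage, errors="replace")
    return stored.decode(codepage)


if __name__ == "__main__":
    print(store_in_narrow_column("café"))
    print(store_in_narrow_column("Zürich 東京"))
```

Characters that exist in the code page, such as é and ü, read back unchanged, while the two Japanese characters each come back as a question mark and cannot be recovered.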

About the author: Serdar Yegulalp is editor of the Windows Power Users Newsletter. Check it out for the latest advice and musings on the world of Windows network administrators -- and please share your thoughts as well!



All Rights Reserved, Copyright 2005 - 2009, TechTarget