Tutorial

Character string data types

Data Types
Data types specify the type of information (such as number, string, picture, date) that can be stored in a column or a variable.
SQL Server recognizes 28 system-defined data types. Apart from these data types, you can create user-defined data types in T-SQL and in .NET to fulfill specific needs.

The following are the categories of system-defined data types:

  • Character strings
  • Unicode character strings


    Figure 3-1 Grouping objects using a schema in the Object Browser.

  • Date and time
  • Approximate numeric
  • Exact numeric
  • Integer numbers
  • Monetary
  • Binary
  • Special

    NOTE
    In some cases, you can use different identifiers to refer to a data type in T-SQL code. For example, the char data type can be referenced as character, and varchar can be referenced as character varying. Some of these synonyms are based on ANSI SQL-92 standard requirements.

    Figure 3-1 Grouping objects using a schema in the Object Browser

    Character Strings
    Character data types store character strings. The four different character types vary in length and storage characteristics:

  • char
  • varchar
  • varchar(max)
    • Requires Free Membership to View


    • text

      The char data type is used to store strings of fixed size. As noted earlier, the maximum size of this data type is 8,000 characters, which is a significant increase over the 255-character limit in early versions. When a variable or a table column is assigned with a string that is shorter than its nominal size, it is padded with trailing spaces to fill the specified field length.
      The varchar data type stores strings of variable size up to 8,000 characters long. When a character value whose length is less than the nominal size is assigned to the column or variable, SQL Server does not add trailing spaces to it, but records it as is. varchar data types occupy two additional bytes in order to record the length of the string.

      NOTE
      Maintenance of this information requires some additional computation during I/O operation, but that time is usually countered by savings in the space required. A record using such columns occupies less space, and more records fit into a single page. Therefore, SQL Server reads more records when accessing data, and it is more likely that a single page contains the information that the user is looking for.

      The text data type is used to store huge amounts of data. One field can store up to 2GB (231 – 1 bytes) of information. Only a 16-byte pointer to this data is stored in the table. Therefore, additional processing overhead is involved with the use of text columns. There are special functions for processing text values.

      NOTE
      You should avoid using this data type because Microsoft plans to remove it in future versions of SQL Server. You should use the new varchar(max) data type instead.

      Varchar(max) looks and operates like the varchar data type, but it is by internal structure and functionality actually much more like the text data type.
      It is also designed to store large strings (up to 2GB). The string is stored in special data pages, not in the data page with the rest of the record. Its biggest advantage is that most string operating functions that work with varchar also work with varchar(max); however, there are a few new functions for additional functionality. The following command creates a table with four fields using different character string data types:

      Create table Contacts(ContactId char(8),
                            Name varchar(50),
                            Note text,
                            Resume varchar(max))

      Character constants are delimited from the rest of the Transact-SQL code with quotes. For example, the following statement inserts contact information:

      insert into Contacts (ContactId, Name, Note, Resume)
      values ('CO-92-81', 'Tom Jones', 'Tom@trigon.com', 'N/a')

      Unicode Character Strings
      Microsoft SQL Server 2005 has four character data types for storing Unicode data—using non-ASCII character sets. They are equivalent to the char, varchar, varchar(max), and text data types and are called

    • nchar
    • nvarchar
    • nvarchar(max)
    • ntext

      The main difference between these new data types and the older character data types is that the new data types can hold Unicode characters, which occupy 2 bytes per character. Therefore, the maximum string length that they can store is half that of the corresponding older data types (4,000 for nchar and nvarchar).
      The following statement creates the same table as the previous example but uses Unicode data types:

      Create table Contacts_2(ContactId nchar(8),
                              Name nvarchar(50),
                              Note ntext,
                              Resume nvarchar(max))
      go

      Unicode character constants are also delimited with quotes but are prefixed with N':
      insert into Contacts_2 (ContactId, Name, Note, Resume)
      values (N'CO-92-81', N'Tom Jones', N'Tom@trigonblue.com', N'N/a')

      This N' prefix might look a little odd, but you will get used to it. Microsoft documentation is full of samples with Unicode constants. It was some time before I discovered the reason Microsoft uses N' as a prefix. It stands for "National." In fact, acceptable alternative identifiers for these data types are
    • National char
    • National char varying
    • National char varying(max)
    • National text

      TIP
      Typically, it is not a problem if you omit the N' prefix on constants. SQL Server automatically converts the string to its Unicode equivalent. Naturally, it is better to insert it whenever you are dealing with Unicode columns or variables, but it is not a big problem. The CPU will just have to perform a couple of extra cycles to make the conversion.
      However, there are cases in which it becomes a problem. When your string constant is part of a query criterion, then the presence of the N' prefix might significantly affect execution of the query. If the column is defined as a non-Unicode string and the criterion is specified with the N' prefix, SQL Server converts every row of the table to compare it with the Unicode constant. As a result, the query performs a table scan instead of using an index.


      Basic Transact-SQL Programming Constructs

       Home: Introduction
       Tip 1: T-SQL identifiers
       Tip 2: Database object qualifiers
       Tip3: Character string data types
       Tip 4: Date, time and number data types
       Tip 5: Special data types-Part 1
       Tip 6: Special data types-Part 2
       Tip 7: Local variables
       Tip 8: Global variables
       Tip 9: Table variables
       Tip 10: Flow control statements
       Tip 11: Blocks, looping and conditional statements
       Tip 12: Unconditional and scheduled execution
       Tip 13: Transact SQL cursors
       Tip 14: Cursor related statements
       Tip 15: Problems and justified use of cursors

      The previous tip is from "15 tips in 15 minutes: Basic Transact-SQL Programming Constructs," excerpted from Chapter 3, of the book "Microsoft SQL Server 2005 Stored Procedure Programming in T-SQL & .NET" by Dejan Sunderic, courtesy of McGraw-Hill Publishing.


      This was first published in February 2007

    There are Comments. Add yours.

     
    TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

    REGISTER or login:

    Forgot Password?
    By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
    Sort by: OldestNewest

    Forgot Password?

    No problem! Submit your e-mail address below. We'll send you an email containing your password.

    Your password has been sent to: