Tip

Using full-text search for symbols in SQL Server

Michelle Gutzait, Contributor

Problem:

I have a table with a full-text-search index defined on it. I would like to search an exact SQL expression including symbols such as ! " @, but the FTS is ignoring these symbols by default. For example, if I search for the exact phrase: "SQL Server!" using CONTAINSTABLE, I get all expressions containing "SQL Server" only – the exclamation point is ignored in the search.

How can I work around this issue?

Explanation:

FTS differentiates between a symbol defined as a wildcard and the other symbols. Here are a few symbols used in FTS as wildcards:

Expression Syntax Description
Any Character . Matches any single character except a line break.
Zero or more * Matches zero or more occurrences of the preceding expression, making all possible matches.
One or more + Matches at least one occurrence of the preceding expression.
Beginning of line ^ Anchors the match string to the beginning of a line
End of line $ Anchors the match string to the end of a line.
Beginning of word <</td> Matches only when a word begins at this point in the text.

    Requires Free Membership to View

End of word > Matches only when a word ends at this point in the text.
Line break \n Matches a platform-independent line break. In a Replace expression, inserts a line break.
Any one character in the set [ ] Matches any one of the characters within the [ ]. To specify a range of characters, list the starting and ending character separated by a dash (-), as in [a-z].
Any one character not in the set [^...] Matches any character not in the set of characters following the ^.
Or | Matches either the expression before or the one after the OR symbol (|). Mostly used within a group. For example, (sponge|mud) bath matches "sponge bath" and "mud bath."
Escape \ Matches the character that follows the backslash (\) as a literal. This allows you to find the characters used in regular expression notation, such as { and ^. For example, ^ Searches for the ^ character.
Tagged expression {} Matches text tagged with the enclosed expression.
C/C++ Identifier :i Matches the expression ([a-zA-Z_$][a-zA-Z0-9_$]*).
Quoted string :q Matches the expression (("[^"]*")|('[^']*')).
Space or Tab :b Matches either space or tab characters.
Integer :z Matches the expression ([0-9]+).

The problem is that full-text Search in SQL Server does not index symbols, such as ^, +, ! and @, and therefore these symbols are ignored when part of an FTS query. The wildcard symbols can be treated as regular symbols if the escape character "\" appears before.

Example #1: Wildcards

The following query uses the escape character "\" to indicate that the following wildcard symbol should be treated as a simple character ("+"):

SELECT ID, Title
FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo.FTS_Table, (title),
'"New World \+"', 10) AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]

Returns:

ID Title
2 Welcome to The Brave New World !
234 New World Modernisms (New World Studies)
54354 It is a New World +
324 Conquest of the New World, 1492-1640
27432 A Brave New World
6753 Brave New World
12333 Brave New World & Brave New World Revisited
5657 The End of the New World

Example #2: Non-wildcard symbols

The following query returns no results:

SELECT ID, Title
FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo.FTS_Table, (title),
'"@"', 10) AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]

This is because FTS does not index the "@" character, while the following query:

select * from dbo.FTS_Table where Title like '%@%'

Returns:

ID Title
3455 michelle@hotmail.com
654 Face many questions (Issues @ Work)
23423 Book: Coping with Technology @ Work.
231 Meet @ my place
60302 One Night @ a Call Center.
534 Special Fitness program @ www.SpecialFitness.com

That's because the LIKE clause searches the character in the string without using the FTS index.

Workaround

It is possible to add a WHERE clause with LIKE as follows:

SELECT ID, Title
FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo.FTS_Table, (title),
'"New World!"', 10) AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]
WHERE title LIKE '%New World !%'

Results:

ID Title
2 Welcome to The Brave New World !

Performance

When looking at the performance implications of the WHERE clause, it shows no difference between the two methods:


Figure 1: Query execution plan shows no performance difference using the WHERE clause in FTS.

Statistics IO:

(10 row(s) affected)
Table 'FTS_Table'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

(0 row(s) affected)
Table 'FTS_Table'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Profiler:


Figure 3: SQL Profiler shows no performance impact when using the WHERE clause in FTS. (Click on image for enlarged view.)

Ranking symbols

There are cases when you must return all phrases containing the exact phrase -- no matter if there are symbols or not -- but you can rank the exact phrase containing the symbol higher. For that matter, a CASE can be added to the query and the results can be ordered by it:

SELECT ID, Title,
CASE WHEN title like '%New World !%' then 1
ELSE 2
END AS ord

FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo. FTS_Table, (title), '"New World !"', 10)
AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]
ORDER BY 3

Results:

ID Title ord
2 Welcome to The Brave New World ! 1
234 New World Modernisms (New World Studies) 2
54354 It is a New World + 2
12333 Brave New World & Brave New World Revisited 2
27432 A Brave New World 2
5657 The End of the New World 2
324 Conquest of the New World, 1492-1640 2
6753 Brave New World 2

If the column "ord" should not be returned, a Common Table Expression (CTE) can be used:

WITH fts_example (id, title,ord) AS
(SELECT ID, Title, CASE WHEN title like '%New World !%' then 1
ELSE 2
END
FROM dbo. FTS_Table
INNER JOIN CONTAINSTABLE(dbo. FTS_Table, (title), '"New World!"', 10)
AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY])
select ID, Title from fts_example
order by ord

In comparing the last two queries, there's no difference in the execution plan,

Related SQL Server tips:

 statistics IO or SQL Profiler.

If you have to query exact phrases including symbols using full-text Search in SQL Server, you need to add your own logic, such as a WHERE clause with LIKE containing your symbol, a CASE in the SELECT list, etc. Performance is not dramatically decreased -- and it's often not decreased at all.

ABOUT THE AUTHOR
Michelle Gutzait works as a senior database consultant for ITERGY International Inc., an IT consulting firm specializing in the design, implementation, security and support of Microsoft products in the enterprise. Gutzait has been involved in IT for 20 years as a developer, business analyst and database consultant. For the last 10 years, she has worked exclusively with SQL Server. Her skills include SQL Server infrastructure design, database design, performance tuning, security, high availability, VLDBs, replication, T-SQL/packages coding, and more.

This was first published in August 2008

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

Disclaimer: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.