Using full-text search for symbols in SQL Server

When querying exact SQL phrases that include symbols, you need to add your own logic with SQL Server's full-text search feature, as it does not index non-letters. Options include incorporating a WHERE clause with LIKE to contain your symbol and also using a CASE in the SELECT list to rank the symbol and non-symbol results. In this tip, SQL Server expert Michelle Gutzait shares steps to successfully search for exact phrases that contain symbols.

Problem:

I have a table with a full-text-search index defined on it. I would like to search an exact SQL expression including symbols such as ! " @, but the FTS is ignoring these symbols by default. For example, if I search for the exact phrase: "SQL Server!" using CONTAINSTABLE, I get all expressions containing "SQL Server" only – the exclamation point is ignored in the search.

How can I work around this issue?

Explanation:

FTS differentiates between a symbol defined as a wildcard and the other symbols. Here are a few symbols used in FTS as wildcards:

Expression Syntax Description
Any Character . Matches any single character except a line break.
Zero or more * Matches zero or more occurrences of the preceding expression, making all possible matches.
One or more + Matches at least one occurrence of the preceding expression.
Beginning of line ^ Anchors the match string to the beginning of a line
End of line $ Anchors the match string to the end of a line.
Beginning of word <</td> Matches only when a word begins at this point in the text.
End of word > Matches only when a word ends at this point in the text.
Line break \n Matches a platform-independent line break. In a Replace expression, inserts a line break.
Any one character in the set [ ] Matches any one of the characters within the [ ]. To specify a range of characters, list the starting and ending character separated by a dash (-), as in [a-z].
Any one character not in the set [^...] Matches any character not in the set of characters following the ^.
Or | Matches either the expression before or the one after the OR symbol (|). Mostly used within a group. For example, (sponge|mud) bath matches "sponge bath" and "mud bath."
Escape \ Matches the character that follows the backslash (\) as a literal. This allows you to find the characters used in regular expression notation, such as { and ^. For example, ^ Searches for the ^ character.
Tagged expression {} Matches text tagged with the enclosed expression.
C/C++ Identifier :i Matches the expression ([a-zA-Z_$][a-zA-Z0-9_$]*).
Quoted string :q Matches the expression (("[^"]*")|('[^']*')).
Space or Tab :b Matches either space or tab characters.
Integer :z Matches the expression ([0-9]+).

The problem is that full-text Search in SQL Server does not index symbols, such as ^, +, ! and @, and therefore these symbols are ignored when part of an FTS query. The wildcard symbols can be treated as regular symbols if the escape character "\" appears before.

Example #1: Wildcards

The following query uses the escape character "\" to indicate that the following wildcard symbol should be treated as a simple character ("+"):

SELECT ID, Title
FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo.FTS_Table, (title),
'"New World \+"', 10) AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]

Returns:

ID Title
2 Welcome to The Brave New World !
234 New World Modernisms (New World Studies)
54354 It is a New World +
324 Conquest of the New World, 1492-1640
27432 A Brave New World
6753 Brave New World
12333 Brave New World & Brave New World Revisited
5657 The End of the New World

Example #2: Non-wildcard symbols

The following query returns no results:

SELECT ID, Title
FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo.FTS_Table, (title),
'"@"', 10) AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]

This is because FTS does not index the "@" character, while the following query:

select * from dbo.FTS_Table where Title like '%@%'

Returns:

ID Title
3455 michelle@hotmail.com
654 Face many questions (Issues @ Work)
23423 Book: Coping with Technology @ Work.
231 Meet @ my place
60302 One Night @ a Call Center.
534 Special Fitness program @ www.SpecialFitness.com

That's because the LIKE clause searches the character in the string without using the FTS index.

Workaround

It is possible to add a WHERE clause with LIKE as follows:

SELECT ID, Title
FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo.FTS_Table, (title),
'"New World!"', 10) AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]
WHERE title LIKE '%New World !%'

Results:

ID Title
2 Welcome to The Brave New World !

Performance

When looking at the performance implications of the WHERE clause, it shows no difference between the two methods:

Query execution plan shows no performance difference using the WHERE clause in FTS.
Figure 1: Query execution plan shows no performance difference using the WHERE clause in FTS.

Statistics IO:

(10 row(s) affected)
Table 'FTS_Table'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

(0 row(s) affected)
Table 'FTS_Table'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Profiler:

SQL Profiler shows no performance impact when using the WHERE clause in FTS
Figure 3: SQL Profiler shows no performance impact when using the WHERE clause in FTS. (Click on image for enlarged view.)

Ranking symbols

There are cases when you must return all phrases containing the exact phrase -- no matter if there are symbols or not -- but you can rank the exact phrase containing the symbol higher. For that matter, a CASE can be added to the query and the results can be ordered by it:

SELECT ID, Title,
CASE WHEN title like '%New World !%' then 1
ELSE 2
END AS ord

FROM dbo.FTS_Table
INNER JOIN CONTAINSTABLE(dbo. FTS_Table, (title), '"New World !"', 10)
AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY]
ORDER BY 3

Results:

ID Title ord
2 Welcome to The Brave New World ! 1
234 New World Modernisms (New World Studies) 2
54354 It is a New World + 2
12333 Brave New World & Brave New World Revisited 2
27432 A Brave New World 2
5657 The End of the New World 2
324 Conquest of the New World, 1492-1640 2
6753 Brave New World 2

If the column "ord" should not be returned, a Common Table Expression (CTE) can be used:

WITH fts_example (id, title,ord) AS
(SELECT ID, Title, CASE WHEN title like '%New World !%' then 1
ELSE 2
END
FROM dbo. FTS_Table
INNER JOIN CONTAINSTABLE(dbo. FTS_Table, (title), '"New World!"', 10)
AS KEY_TBL ON FTS_Table.ID = KEY_TBL.[KEY])
select ID, Title from fts_example
order by ord

In comparing the last two queries, there's no difference in the execution plan, statistics IO or SQL Profiler.

If you have to query exact phrases including symbols using full-text Search in SQL Server, you need to add your own logic, such as a WHERE clause with LIKE containing your symbol, a CASE in the SELECT list, etc. Performance is not dramatically decreased -- and it's often not decreased at all.

ABOUT THE AUTHOR
Michelle Gutzait works as a senior database consultant for ITERGY International Inc., an IT consulting firm specializing in the design, implementation, security and support of Microsoft products in the enterprise. Gutzait has been involved in IT for 20 years as a developer, business analyst and database consultant. For the last 10 years, she has worked exclusively with SQL Server. Her skills include SQL Server infrastructure design, database design, performance tuning, security, high availability, VLDBs, replication, T-SQL/packages coding, and more.

This was first published in August 2008

Dig deeper on Microsoft SQL Server Performance Monitoring and Tuning

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close