Home > SQL Server Tips > Database Management and Administration > Improving SQL Server full-text search performance
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATABASE MANAGEMENT AND ADMINISTRATION

Improving SQL Server full-text search performance


Matthew Schroeder, Contributor
05.07.2009
Rating: -3.33- (out of 5)


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   


With any text search, performance is the most important factor that should be addressed. What approach is going to run fastest for the business requirements and data content?

SQL Server full-text performance can be improved by doing the following best practices:

  • Reorganize weekly and rebuild the catalogs on at least a monthly basis (optimal configurations will vary by load). Failure to do so will lead to fragmentation of the full-text index and make indexes far less efficient.
  • The full-text key should be a small column. It's best to pick an int type, since that does not require a join to the DocID mapping table, thus improving performance. It's also faster if the key is a clustered index.
  • Using the TOP_N_BY_RANK argument increases query performance since it only processes the top n number of records.
  • Locate the catalogs on separate disks. I would also recommend 15K RPM drives.
  • The first parameter determines the execution plan, so you might want to use OPTIMIZE FOR to pass in an optimal value to get the correct execution plan.
  • I would recommend against specifying the specific index unless it's a last resort, since it generally causes problems as indexes change.
  • Ensure that sp_configure 'max full-text crawl range' is set to the number of CPUs that you have available.
  • Update statistics and place a second index on a timestamp column to improve incremental loads of the full-text population.
  • Rather then using multiple CONTAINS predicates, try to combine them into one CONTAINS predicate containing multiple columns.

Figure 1: Sample long running execution plan (Click to enlarge)

Typically, this type of execution plan will occur when the parameter passed in has a low degree of selectivity. In other words, it's usually easier to do a table scan since the query will probably return many rows compared to the total size of the table.

The cost of this approach is (table size 2,116,400 rows, 4.1gb data)
CPU: 3,885
Reads: 506,448
Duration: 95,221
Rows: 124, 784
Now let's say that we want to force a seek. We could tell the query to optimize for something that will return a smaller result set. For example, let's use the following query, which takes a query (performance) with a large result set and optimizes it for "test", which has a smaller result set. This causes the large query to use seeks as seen in the plan below:

declare @N nvarchar(30) set @N = N'*performance*'

select * from LargeProperty_test where contains(largeProp_value, @N) OPTION (OPTIMIZE FOR (@N = '*test*'))

Figure 2: Sample long running "OPTIMIZE FOR" execution plan (Click to enlarge)

Cost
CPU: 3,011
Reads: 1,112,473
Duration: 80,128
Rows: 124, 784

Figure 3: Sample selective running execution plan (Click to enlarge)

Cost
CPU: 500
Reads: 96,669
Duration: 1,477
Rows: 21,120

Comparison to Like Predicate Performance
CPU: 94,490
Reads: 670,105
Duration: 78,068
Rows: 21,120

This test compared a table size of 2,116,400 rows consisting of 4.1GB of data. The performance difference would narrow if the number of rows decreased or if the volume of text in the column was reduced. The performance difference would also drastically increase as the volume of text and/or the number of rows increased.

Hybrid approaches

In most scenarios, the CLR stored procedure approach will actually degrade your performance over these solutions rather than improve it, so approach this methodology with caution -- especially since scope creep will probably become a problem quickly.

Shredding the text columns by word into a clustered index that has a 100% fill-factor will drastically increase your speed, though at the expense of flexibility. This would be better suited as an optional feature or compliment to Microsoft's full-text search.

Caching search results from full-text search would do more than just provide you with the flexibility of the Microsoft tool. Although initial searches would be much slower, any subsequent searches on the same keywords would be returned instantly since the results would be cached in another table optionally with the keywords as the clustered index -- a huge benefit if keywords tend to be repeated often.

I would not solely rely on SQL Server full-text search unless longer response times are acceptable. I would instead recommend a hybrid approach utilizing full-text search and another option for busy systems.

For fast text searching, I would definitely recommend 15K RPM drives separated into different RAID configurations for all components, including catalogs, TempDB, dump, logs, data, non-clustered indexes, etc. If this proves to be too slow along with a hybrid solution, then you might need to look at solid state servers rather than utilizing hard drives.

ABOUT THE AUTHOR:   
Matthew Schroeder is a senior software engineer who works on SQL Server database systems ranging in size from 2 GB to 3+ TB, with between 2k and 40+k trans/sec. Schroeder currently works for the gaming vendor IGT providing services to gaming companies. He also works as an independent consultant, specializing in SQL Server, Oracle and .NET for industries such as gaming, automotive, e-commerce, entertainment, banking and nonprofit. Schroeder specializes in OLTP/OLAP DBMS systems as well as highly scalable processing systems written in .NET. He is a Microsoft certified MCITP, Database Developer, has a master's degree in computer science and more than 12 years of experience in SQL Server/Oracle. He can be reached at cyberstrike@aggressivecoding.com.

Rate this Tip
To rate tips, you must be a member of SearchSQLServer.com.
Register now to start rating these tips. Log in if you are already a member.




Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   



RELATED CONTENT
Microsoft SQL Server Performance Monitoring and Tuning
SQL Profiler: A network trace for SQL Server
SQL sprawl: Why is SQL Server Express installed everywhere?
Top 10 SQL Server tips of 2009
Top 5 SQL Server DBA tasks that are a waste of time
SQL Server Mailbag: Asymmetric encryption, log shipping issues
Using traces in SQL Server Profiler
SQL Server Mailbag: CALs, witnesses and unwanted changes
SQL Server Mailbag: Data restoration and DB property management
Working with IntelliSense in SQL Server 2008 Management Studio
SQL Server Mailbag: Stored procedures, triggers and SSRS reports

Database Management and Administration
Database encryption in SQL Server 2008: Improvements you finally need
Common oversights with SQL Server security audits
Top 5 SQL Server DBA tasks that are a waste of time
Password cracking tools for SQL Server
Using traces in SQL Server Profiler
Meet compliance requirements with improved database security practices
Hardening the network and OS for SQL Server security
Securing the server and database in SQL Server
How SQL Server 2008 components impact SharePoint implementations
Troubleshooting Distributed Transaction Coordinator errors in SQL Server

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
contiguity  (SearchSQLServer.com)
contiguous  (SearchSQLServer.com)
drilldown  (SearchSQLServer.com)
hashing  (SearchSQLServer.com)
hybrid online analytical processing  (SearchSQLServer.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary

DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



SQL Server Development - .NET, C#, T-SQL, Visual Basic
HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2010, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts