Home > SQL Server Tips > Database Management and Administration > Replacing non-alphanumeric characters in strings using T-SQL
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATABASE MANAGEMENT AND ADMINISTRATION

Replacing non-alphanumeric characters in strings using T-SQL


Harinatha Reddy Gorla
06.11.2002
Rating: --- (out of 5)


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   


As a follow-up to my previous two articles (Complex grouping of strings in T-SQL and Working with delimited strings in T-SQL), here are two important methods for replacing non-alphanumeric characters in strings.

Method #1: Using a table of ASCII code values

This method removes all characters other than [a to z] , [A to Z] and [0 to 9]. In other words, it removes all symbols ("#?&/() and so on. You can do this by creating a table with all bad ASCII code values and cycling through a string, replacing the bad codes with an empty string.

--Create table with all bad codes
DECLARE @mycode INT
CREATE TABLE #badcodes(badcode INT)
SET @mycode=33
WHILE @mycode<=255
BEGIN
 IF (@mycode BETWEEN 33 AND 47) OR (@mycode BETWEEN 58 AND 64)
 OR (@mycode BETWEEN 58 AND 64) OR (@mycode BETWEEN 91 AND 96)
 OR (@mycode BETWEEN 123 AND 255)
 INSERT INTO #BADCODES VALUES(@mycode)
 SET @mycode=@mycode+1
END
GO
 
--strip out the characters
DECLARE @test VARCHAR(100)
SET @test='se+*ar%c&h^da#tab~se'
 
UPDATE #badcodes
SET @test=REPLACE(@test,CHAR(badcode),'')
 
PRINT @test

Method #2: Using specified patterns

For more info, check out SQL Server books online for help on "PATINDEX" and "STUFF".

DECLARE @pos INT
DECLARE @myString VARCHAR(20)
SET @myString='se+*ar%c&h^da#tab~se'
SET @pos = PATINDEX('%[^a-z0-9_]%',@myString)
-- sets @pos to the position in @myString of the first character
-- not (^) equal to a letter (a-z), a digit (0-9) or an underscore
-- or sets @pos to 0 if no such character exists in @myString (in which
-- case you are done)
 
WHILE @pos > 0 
BEGIN 
  SET @myString = STUFF(@myString,@pos,1,'')
  -- overwrites the non-alphanumeric character at position @pos with ''
  SET @pos = PATINDEX('%[^a-z0-9_]%',@myString)
  -- looks for the next non-alphanumeric character
END
 
SELECT @myString

An example: how to replace non-alphanumeric characters from a column

Let's create a test table first:

--////////////////////////////////
Create table tblTest(col1 varchar(20),col2 varchar(20))
insert into tblTest values('one$','00#7')
insert into tblTest values('tw*o','%420')
insert into tblTest values('thre@e','11&1')
 
--select * from tblTest
 
col1                 col2                 
-------------------- -------------------- 
one$                 00#7
tw*o                 %420
thre@e               11&1
 
--////////////////////////////////

Add one new column and insert the values by appending existing columns.

 
--////////////////////////////////
ALTER TABLE tblTest ADD Newcol varchar(40)
update tblTest set Newcol=col1+col2
--select * from tblTest
 
col1                 col2                 Newcol                                   
-------------------- -------------------- ---------------
one$                 00#7                 one$00#7
tw*o                 %420                 tw*o%420
thre@e               11&1                 thre@e11&1
 
--////////////////////////////////
 

Here's the solution. You can use one of the above two methods to remove non-alphanumeric characters from a table column. I have choosen Mehtod #2 to solve this example problem. You have to add row numbers to the table to loop through the table and update.

 
-- Add row number to each row
SELECT IDENTITY(int,1,1) AS ID, *
 INTO #tblAlpha
 FROM tblTest
 
--select * from #tblAlpha
--update new column 
DECLARE @MaxRank  int 
DECLARE @Counter  int 
declare @pos int
Declare @s varchar(100)
 
Set nocount on
SELECT @MaxRank = Max(ID) From #tblAlpha 
SET @Counter = 1
WHILE @Counter <= @MaxRank 
BEGIN 
 select @s=newcol from #tblAlpha where id=@Counter
 set @pos = patindex('%[^a-z0-9_]%',@s)
 while @pos > 0 
 begin
   set @s = stuff(@s,@pos,1,'')
   set @pos = patindex('%[^a-z0-9_]%',@s)
 end
 update #tblAlpha set newcol=@s where id=@Counter
 
SET @Counter = @Counter +1 
END 
 
truncate table tbltest
insert into tbltest select col1,col2,newcol from #tblAlpha
drop table #tblAlpha
 
select * from tbltest

Reader Feedback

Israel P. writes: This took over 24 hrs to run on my machine so it doesn't work for my purposes. I found this code much more effective ( it took only 35 mins to run):

while @@rowcount > 0
        update  user_list_original
        set     fname = replace(fname, substring(fname, patindex('%[^a-zA-Z ]%', fname), 1), '')
        where   patindex('%[^a-zA-Z ]%', fname) <> 0

Is there anything missing by using this? I have a table with millions of records that I need to 'clean'. Any pointers for cleaning data (emails, addresses, etc.)?

For More Information

  • What do you think about this tip? E-mail the Editor at tdichiara@techtarget.com with your feedback.
  • The Best SQL Server Web Links: tips, tutorials, scripts, and more.
  • Have an SQL Server tip to offer your fellow DBA's and developers? The best tips submitted will receive a cool prize--submit your tip today!
  • Ask your technical SQL Server questions--or help out your peers by answering them--in our live discussion forums.
  • Check out our Ask the Experts feature: Our SQL, Database Design, Oracle, SQL Server, DB2, metadata, and data warehousing gurus are waiting to answer your toughest questions.

Rate this Tip
To rate tips, you must be a member of SearchSQLServer.com.
Register now to start rating these tips. Log in if you are already a member.




Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   



RELATED CONTENT
SQL/Transact SQL (T-SQL)
Using DELETE and TRUNCATE TABLE statements to delete data in SQL Server
SQL language crash course (just enough to be dangerous)
Working with IntelliSense in SQL Server 2008 Management Studio
SQL Server Mailbag: Stored procedures, triggers and SSRS reports
Working with sparse columns in SQL Server 2008
Determining the source of full transaction logs in SQL Server
New GROUP BY option provides better data control in SQL Server 2008
Using the OPENROWSET function in SQL Server
Loading data files with SQL Server's BULK INSERT statement
Importing and exporting bulk data with SQL Server's bcp utility
SQL/Transact SQL (T-SQL) Research

Database Management and Administration
Password cracking tools for SQL Server
Using traces in SQL Server Profiler
Meet compliance requirements with improved database security practices
Hardening the network and OS for SQL Server security
Securing the server and database in SQL Server
How SQL Server 2008 components impact SharePoint implementations
Troubleshooting Distributed Transaction Coordinator errors in SQL Server
Achieving high availability and disaster recovery with SharePoint databases
Clearing the Windows page file and its effect on server performance
Deploying a SQL Server virtual appliance for Microsoft Hyper-V

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
ACID  (SearchSQLServer.com)
Collaboration Data Objects  (SearchSQLServer.com)
commit  (SearchSQLServer.com)
container  (SearchSQLServer.com)
DAO  (SearchSQLServer.com)
fetch  (SearchSQLServer.com)
OLE DB  (SearchSQLServer.com)
query  (SearchSQLServer.com)
SQL  (SearchSQLServer.com)
T-SQL  (SearchSQLServer.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary

DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



SQL Server Development - .NET, C#, T-SQL, Visual Basic
HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts