The fact that you have duplicate rows indicates you have no primary or unique key which is also an indication of
bad database design. The easiest way to solve this problem is to add a unique key to your table, either temporarily or permanently, so that individual rows can be identified.
So if for example you have a table with the following structure
CREATE TABLE tblDuplicates ( col1 INT, col2 CHAR(2), col3 MONEY )And you INSERT the following duplicate rows within your database:
INSERT tblDuplicates VALUES(1,'AB',123) INSERT tblDuplicates VALUES(1,'AB',123) INSERT tblDuplicates VALUES(1,'AB',123) INSERT tblDuplicates VALUES(2,'CD',456) INSERT tblDuplicates VALUES(2,'CD',456) INSERT tblDuplicates VALUES(3,'EF',789) INSERT tblDuplicates VALUES(4,'GH',912)When we SELECT from this table we can see more clearly that we have a problem.
SELECT * FROM tblDuplicates col1 col2 col3 ----------- ---- --------------------- 1 AB 123.0000 1 AB 123.0000 1 AB 123.0000 2 CD 456.0000 2 CD 456.0000 3 EF 789.0000 4 GH 912.0000However now if we add a column to our table that contains a unique value for every row we will be able to identify each row individually. The easiest way to do this is with an IDENTITY column.
ALTER TABLE tblDuplicates ADD colUnique INT IDENTITY(1,1)Now when we SELECT from our table we can see that the new column in each row, including the duplicate rows, has a unique value.
SELECT * FROM tblDuplicates col1 col2 col3 colUnique ----------- ---- --------------------- ----------- 1 AB 123.0000 1 1 AB 123.0000 2 1 AB 123.0000 3 2 CD 456.0000 4 2 CD 456.0000 5 3 EF 789.0000 6 4 GH 912.0000 7Now we can identify each row individually we can do something about removing the duplicated rows. The following DELETE statement removes all rows that have columns that match exactly with another row, except the row that has the minimum colUnqiue value (thereby always leaving one row remaining after the delete for each set of duplicated rows).
DELETE tblDuplicates FROM tblDuplicates td1 WHERE colUnique<>( SELECT MIN(colUnique) FROM tblDuplicates td2 WHERE td1.Col1 = td2.Col1 AND td1.Col2 = td2.Col2 AND td1.Col3 = td2.Col3 )We can now remove our temporary unique column:
ALTER TABLE tblDuplicates DROP COLUMN colUniqueAnd again we SELECT from our table to retrieve all new set of non-duplicated rows.
SELECT * FROM tblDuplicates col1 col2 col3 ----------- ---- --------------------- 1 AB 123.0000 2 CD 456.0000 3 EF 789.0000 4 GH 912.0000However as our table doesn't enforce uniqueness this situation is likely to happen again. You best bet is to rework you database design to include a permanent primary key on the table so you are always able to individually identify rows. And always, always make sure you have a good backup before running this type of operation on your tables as it is easy to make a mistake.
Dig deeper on Microsoft SQL Server Installation
Have a question for an expert?
Please add a title for your question
Get answers from a TechTarget expert on whatever's puzzling you.