[MySQL]Remove spaces and delete dupes

Associate
Joined
21 May 2003
Posts
1,365
I've got a database table with a couple of hundred thousand records. The table has a unique field, "tel1" which holds telephone numbers (as a varchar).

Some of the numbers are formatted with spaces, i.e. "0845 000 0000"

and some aren't i.e. "08450000000".

I'd like to remove all the spaces, and any duplicate records.

I've tried
Code:
UPDATE table SET tel1 = replace(tel1, " ", "");

but that gives an error if there is a duplicate record (because tel1 is unique).

Any way around this?
 
Soldato
Joined
7 Jun 2005
Posts
3,035
Location
SE London
You'll have to remove the duplicates first, by doing something like this:

DELETE FROM table A where ID=(SELECT ID from table B where B.id <> a.id AND B.tel1 = A.tel1)

then your statement to remove the spaces will work.
 
Associate
OP
Joined
21 May 2003
Posts
1,365
The problem is they aren't dupes until the phone number has had the spaces removed... so I can't match them like that unfortunately :(
 
Soldato
Joined
7 Jun 2005
Posts
3,035
Location
SE London
Oh yeah, ofcourse! Try this then:

DELETE FROM table A where ID=(SELECT ID from table B where B.id <> a.id AND Replace(B.tel1, " ", "") = Replace(A.tel1, " ", ""))
 
Associate
OP
Joined
21 May 2003
Posts
1,365
I made a slight adjustment (one of the tel1's will be the space-less version?) but I get an error when trying to run:

Code:
mysql> DELETE FROM test A WHERE id = (SELECT id FROM test B WHERE B.id <> A.id AND Replace(B.tel1, " ", "") = A.tel1);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE id = (SELECT id FROM test B WHERE B.id <> A.id AND Replace(B.tel1, " ", ""' at line 1
 
Caporegime
Joined
18 Oct 2002
Posts
29,490
Location
Back in East London
Could also try:

Code:
DELETE FROM `table1` WHERE REPLACE(' ', '', `table1`.`tel1`) IN (SELECT `table1`.`tel1` FROM `table1`);

Or try the ANY operator instead of IN, both are untested on my part though.

Then run the first query you posted to remove the spaces in the cells that are not dupes.
 
Last edited:
Associate
OP
Joined
21 May 2003
Posts
1,365
mysql> DELETE FROM `test` WHERE Replace(' ', '', `test`.`tel1`) IN (SELECT `test`.`tel1` FROM `test`);
ERROR 1093 (HY000): You can't specify target table 'test' for update in FROM clause


Apologies if i'm not much use here, my SQL experience is pretty much limited to inserts and updates.
 
Last edited:
Associate
OP
Joined
21 May 2003
Posts
1,365
Ok,

SELECT `id`, `tel1` FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

Gives the correct records i.e. the dupes.

but

DELETE FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

gives a syntax error.
 
Associate
Joined
16 Jan 2003
Posts
1,913
Can't be bothered to sort your syntax our but if original select gets what rows you need then I would just do a select into and then a delete (presuming this is a one off).
SELECT `id`, `tel1` INTO 'DupRows' FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

Then do DELETE FROM test WHERE id IN (SELECT id from DupRows)
 
Back
Top Bottom