1. This site uses cookies. By continuing to use this site, you are agreeing to our use of cookies. Learn More.

[MySQL]Remove spaces and delete dupes

Discussion in 'HTML, Graphics & Programming' started by LazyManc, 24 Jan 2006.

  1. LazyManc

    Wise Guy

    Joined: 21 May 2003

    Posts: 1,365

    I've got a database table with a couple of hundred thousand records. The table has a unique field, "tel1" which holds telephone numbers (as a varchar).

    Some of the numbers are formatted with spaces, i.e. "0845 000 0000"

    and some aren't i.e. "08450000000".

    I'd like to remove all the spaces, and any duplicate records.

    I've tried
    Code:
    UPDATE table SET tel1 = replace(tel1, " ", "");
    but that gives an error if there is a duplicate record (because tel1 is unique).

    Any way around this?
     
  2. Dr_Evil

    Mobster

    Joined: 7 Jun 2005

    Posts: 3,036

    Location: SE London

    You'll have to remove the duplicates first, by doing something like this:

    DELETE FROM table A where ID=(SELECT ID from table B where B.id <> a.id AND B.tel1 = A.tel1)

    then your statement to remove the spaces will work.
     
  3. LazyManc

    Wise Guy

    Joined: 21 May 2003

    Posts: 1,365

    The problem is they aren't dupes until the phone number has had the spaces removed... so I can't match them like that unfortunately :(
     
  4. Dr_Evil

    Mobster

    Joined: 7 Jun 2005

    Posts: 3,036

    Location: SE London

    Oh yeah, ofcourse! Try this then:

    DELETE FROM table A where ID=(SELECT ID from table B where B.id <> a.id AND Replace(B.tel1, " ", "") = Replace(A.tel1, " ", ""))
     
  5. LazyManc

    Wise Guy

    Joined: 21 May 2003

    Posts: 1,365

    I made a slight adjustment (one of the tel1's will be the space-less version?) but I get an error when trying to run:

    Code:
    mysql> DELETE FROM test A WHERE id = (SELECT id FROM test B WHERE B.id <> A.id AND Replace(B.tel1, " ", "") = A.tel1);
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE id = (SELECT id FROM test B WHERE B.id <> A.id AND Replace(B.tel1, " ", ""' at line 1
    
     
  6. Dj_Jestar

    Caporegime

    Joined: 18 Oct 2002

    Posts: 28,820

    Location: Back in East London

    Could also try:

    Code:
    DELETE FROM `table1` WHERE REPLACE(' ', '', `table1`.`tel1`) IN (SELECT `table1`.`tel1` FROM `table1`);
    Or try the ANY operator instead of IN, both are untested on my part though.

    Then run the first query you posted to remove the spaces in the cells that are not dupes.
     
    Last edited: 24 Jan 2006
  7. LazyManc

    Wise Guy

    Joined: 21 May 2003

    Posts: 1,365

    mysql> DELETE FROM `test` WHERE Replace(' ', '', `test`.`tel1`) IN (SELECT `test`.`tel1` FROM `test`);
    ERROR 1093 (HY000): You can't specify target table 'test' for update in FROM clause


    Apologies if i'm not much use here, my SQL experience is pretty much limited to inserts and updates.
     
    Last edited: 24 Jan 2006
  8. LazyManc

    Wise Guy

    Joined: 21 May 2003

    Posts: 1,365

    Ok,

    SELECT `id`, `tel1` FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

    Gives the correct records i.e. the dupes.

    but

    DELETE FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

    gives a syntax error.
     
  9. oneilldo

    Wise Guy

    Joined: 16 Jan 2003

    Posts: 1,789

    Can't be bothered to sort your syntax our but if original select gets what rows you need then I would just do a select into and then a delete (presuming this is a one off).
    SELECT `id`, `tel1` INTO 'DupRows' FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

    Then do DELETE FROM test WHERE id IN (SELECT id from DupRows)