December 12th, 2006

Remove duplicate entries / rows a mySQL database table

This is an extremely quick and painless way to remove duplicate rows (tuples) from a MySQL database table. The best part of it is that it requires no programming or PHP coding whatsoever, it can all be done with three manual SQL queries! Note: this only works on MySQL 3.23 and above. But I have a hard time imagining anyone running an older version than that anyways!

So, there are 3 steps, and therefore 3 SQL statements:

Step 1: Move the non duplicates (unique tuples) into a temporary table

CREATE TABLE new_table AS
SELECT * FROM old_table WHERE 1 GROUP BY [COLUMN TO remove duplicates BY];

Step 2: delete delete the old table
We no longer need the table with all the duplicate entries, so drop it!

DROP TABLE old_table;

Step 3: rename the new_table to the name of the old_table

RENAME TABLE new_table TO old_table;

And of course, don't forget to fix your buggy code to stop inserting duplicates!

17 Responses to ' Remove duplicate entries / rows a mySQL database table '

Subscribe to comments with RSS or TrackBack to ' Remove duplicate entries / rows a mySQL database table '.

  1. Carpool Guy said,

    on June 5th, 2007 at 8:56 pm

    just use "insert IGNORE into …." to prevent duplicate entries, or add an "ON DUPLICATE KEY UPDATE …" at the end of your mysql query (http://dev.mysql.com/doc/refman/5.0/en/insert.html).

  2. rudix said,

    on October 6th, 2007 at 4:04 am

    thank you dude , this helps me a lot, there was a problem with autoindex and my site was 12 h with 1 record, but this is the best solution I found thanks again

  3. Bijay Rungta said,

    on October 10th, 2007 at 1:21 pm

    Hey Thanks for sharing man..
    Was looking for a solution like this .and it was so quick workaround I hadn't expected this…

  4. Louis said,

    on October 12th, 2007 at 10:32 am

    I want to add my thanks also. I too have been trying to figure out how to do this for a while and was embarrassed by how simple it is. Cheers!

  5. tom said,

    on November 5th, 2007 at 12:32 pm

    Worked like a charm and cleaned my db in a matter of minutes.

    Thanks for posting it.

    Tom

  6. amit sankhala said,

    on December 18th, 2007 at 2:01 am

    Hi,
    Is it possible to find duplicate enteries in one query only?

    Let me know if any one knows this.


  7. on December 26th, 2007 at 1:09 pm

    Be careful with this approach, as dropping the new table will not have any of the keys or settings the old table had, so don't be too quick dropping it.

    Also, what's the point of 'WHERE 1' when you can just omit it?

    And instead of just fixing buggy code, put a UNIQUE index on the column that gets duplicated.

  8. Lucas Vasconcelos said,

    on January 26th, 2008 at 11:41 am

    Hey, thanks a lot for your tip.

    I use then with Mysql 4.1 and i don't need use WHERE clause…

    []'s

  9. majorxp said,

    on January 30th, 2008 at 1:53 am

    This is potentially problematic for many, many reasons. This includes indexes, table properties, and no control of which dupes you delete/keep.

    I would first use UNIQUE as an index as suggested above. I would also use this query to see how many duplicates I have too (before trying to fix it).

    >> SELECT id, count(*) AS n FROM table GROUP BY id HAVING n>1

    I would suggest using that to selectivly delete records manually.

  10. vijaykumar said,

    on February 21st, 2008 at 11:05 am

    i have a table with 4 columns having duplicate entries.i want to remove that entries when rows of four columns should exactly match bcoz if any one differs it not duplicate,can anyone send query for this?

  11. Dylan said,

    on May 6th, 2008 at 3:30 pm

    Using a correlated subquery you can select the duplicated rows and then manually delete the duplicates.

    Example code follows. In this instance the 'real' primary key should be ClientID, Engine, ScheduledKeywordJobsID and they are the group by in the subquery.

    select *from ClientVisibility where ScheduledKeywordJobsID in ( SELECT `ScheduledKeywordJobsID` FROM `ClientVisibility` WHERE 1 group by ClientID , Engine , `ScheduledKeywordJobsID` having count(*)>1 ) ORDER BY `ClientVisibility`.`ScheduledKeywordJobsID`, Engine,ClientID ASC

  12. Lone said,

    on June 17th, 2008 at 3:55 pm

    Nice one mate.

    I searched in google, and the first entry was yours. and it worked like a charm. Top spot well earned.

  13. palm_homes said,

    on July 16th, 2008 at 6:05 pm

    beautiful 3 line SQL code that saved me lots of time . THANK YOU

  14. kapil said,

    on August 14th, 2008 at 11:25 am

    Thank you dude , thankx a lot i was looking for same kind of simple and unique way to solve my problem.

  15. bardh7 said,

    on September 16th, 2008 at 1:21 pm

    this could also help

    before :

    id | name
    ————-
    1 | test |
    2 | test |
    3 | php |
    4 | echo |

    after:

    id | name|
    ————
    2 | test |
    3 | php |
    4 | echo |

  16. S.Manikandan Siva said,

    on September 24th, 2008 at 6:30 am

    Thanks a lot dude… It helps me to remove my burden.. Thanks again,
    S Manikandan Siva
    Mumbai-21.

  17. paulw said,

    on October 3rd, 2008 at 10:35 pm

    thanks for the info. after searching and trying, i'm adding my 2 cents.

    another q&d way to remove dup recs… here goes:

    ALTER IGNORE TABLE dups ADD UNIQUE INDEX dupidx (dup_fld1 [, dup_fldx]);
    ALTER TABLE dups DROP INDEX dupidx

    advantages:
    - simple script
    - maintains uniqueid
    - maintains table config (ie indexes)

    disadvantages:
    - arbitrary deletion of dups

    hope this helps someone

Leave a reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

*
To prove you're a person (not a spam script), type the security word shown in the picture.
Anti-Spam Image