delete from `xxx` where id in (select * from (SELECT min(id) FROM `xxx` group by mail having count(mail)>1) as a); 这是百度文章里 提到的最多的的一条语句。 240W 的数据处理了 3 个多小时才删除了 19w 数据。
方法二
create table tmp_xxx select min(id) as id,mail,password from xxx group by mail,password; 同样的 240w 数据去重不到两分钟就搞定了。( 160w spend 56s )
这个方法二,语句,我的 table 是 xs,为什么以下代码会提示错误
create xs tmp_xxx select min(id) as id,mail,password from xs group by mail,password;
@petelin 如果要按照楼主第一种的需求来删除 那就是 DELETE T FROM MYTABLE T LEFT JOIN (SELECT MIN(A.ID) AS ID FROM MYTABLE A GROUP BY A.MAIL HAVING COUNT(A.MAIL) > 1) TT ON T.ID = TT.ID WHERE TT.ID IS NOT NULL 两种需求,一种是只保留一条不重复数据,另一种是删除重复数据中的一条,两种都并没有删除非重复的数据