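I have a table with a unique id field and an email field. The emails are duplicated. I want to keep only one row for each duplicated email address, and it should be the one with the latest id (the last inserted record).
How can I do this?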
Suppose your table test contains the following data:
select id, email
from test;
ID EMAIL
---------------------- --------------------
1 aaa
2 bbb
3 ccc
4 bbb
5 ddd
6 eee
7 aaa
8 aaa
9 eee
So, we need to find all repeated emails and delete every row for them except the one with the latest id.
In this case, aaa, bbb and eee are repeated, so we want to delete IDs 1, 7, 2 and 6.
To do this, we first need to find all the repeated emails:
select email
from test
group by email
having count(*) > 1;
EMAIL
--------------------
aaa
bbb
eee
Then, from this result set, we need to find the latest id for each of these repeated emails:
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email;
LASTID EMAIL
---------------------- --------------------
8 aaa
4 bbb
9 eee
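The two SELECT steps above can be checked without a MySQL server. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the in-memory database and the Python variable names are illustrative and not part of the original answer. Both queries use only standard SQL, so they run unchanged.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, email TEXT)")
rows = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]
conn.executemany("INSERT INTO test (id, email) VALUES (?, ?)", rows)

# Step 1: emails that occur more than once.
duplicated = [r[0] for r in conn.execute(
    "SELECT email FROM test GROUP BY email HAVING COUNT(*) > 1 ORDER BY email")]

# Step 2: the latest id for each of those repeated emails.
last_ids = conn.execute("""
    SELECT MAX(id) AS lastId, email
    FROM test
    WHERE email IN (SELECT email FROM test GROUP BY email HAVING COUNT(*) > 1)
    GROUP BY email
    ORDER BY email
""").fetchall()

print(duplicated)  # ['aaa', 'bbb', 'eee']
print(last_ids)    # [(8, 'aaa'), (4, 'bbb'), (9, 'eee')]
```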
Finally, we can delete every row whose id is less than the LASTID for its email. So the solution is:
delete test
from test
inner join (
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email
) duplic on duplic.email = test.email
where test.id < duplic.lastId;
I don't have MySQL installed on this machine right now, but it should work.
The delete above works, but I found a more optimized version:
delete test
from test
inner join (
select max(id) as lastId, email
from test
group by email
having count(*) > 1) duplic on duplic.email = test.email
where test.id < duplic.lastId;
You can see that it deleted the oldest duplicates, i.e. ids 1, 7, 2 and 6:
select * from test;
+----+-------+
| id | email |
+----+-------+
| 3 | ccc |
| 4 | bbb |
| 5 | ddd |
| 8 | aaa |
| 9 | eee |
+----+-------+
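The result above can be reproduced with a small sketch, again assuming Python's sqlite3 module in place of MySQL. SQLite does not support MySQL's multi-table DELETE ... JOIN syntax, so the same condition (id lower than the latest id for the same email) is expressed here as a correlated subquery. This is a swapped-in equivalent for illustration, not the statement from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, email TEXT)")
rows = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]
conn.executemany("INSERT INTO test (id, email) VALUES (?, ?)", rows)

# Delete every row that is older than the latest row with the same email.
# (Correlated-subquery form; the MySQL answer uses DELETE ... INNER JOIN instead.)
conn.execute("""
    DELETE FROM test
    WHERE id < (SELECT MAX(t2.id) FROM test AS t2 WHERE t2.email = test.email)
""")

remaining = [r[0] for r in conn.execute("SELECT id FROM test ORDER BY id")]
deleted = sorted({r[0] for r in rows} - set(remaining))

print(remaining)  # [3, 4, 5, 8, 9]
print(deleted)    # [1, 2, 6, 7]
```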
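Another version is the delete provided by Rene Limon:
delete from test
where id not in (
select * from (
select max(id)
from test
group by email) as keep_ids) -- the derived table avoids MySQL error 1093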
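Try this approach. Note that it keeps the row with the lowest id for each email; flip the comparison to t1.id < t2.id to keep the highest id instead: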
DELETE t1 FROM test t1, test t2
WHERE t1.id > t2.id AND t1.email = t2.email
The correct way is:
DELETE FROM `tablename`
WHERE `id` NOT IN (
SELECT * FROM (
SELECT MAX(`id`) FROM `tablename`
GROUP BY `name`
) AS keep_ids -- MySQL requires an alias on every derived table
)
DELETE
FROM
`tbl_job_title`
WHERE id NOT IN
(SELECT
*
FROM
(SELECT
MAX(id)
FROM
`tbl_job_title`
GROUP BY NAME) tbl)
Revised and working version!!! Thanks @Gaurav
If you want to keep the row with the lowest id value:
DELETE n1 FROM yourTableName n1, yourTableName n2 WHERE n1.id > n2.id AND n1.email = n2.email
If you want to keep the row with the highest id value:
DELETE n1 FROM yourTableName n1, yourTableName n2 WHERE n1.id < n2.id AND n1.email = n2.email
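The effect of flipping the comparison can be seen on the sample data from the first answer. This is a rough sketch assuming Python's sqlite3 module; because SQLite lacks MySQL's multi-table DELETE, the self-join is rewritten with EXISTS, and the helper name remaining_ids is hypothetical.

```python
import sqlite3

ROWS = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]

def remaining_ids(comparison):
    """Delete each row n1 that has a same-email row n2 with n1.id <comparison> n2.id."""
    if comparison not in ("<", ">"):
        raise ValueError("comparison must be '<' or '>'")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO t (id, email) VALUES (?, ?)", ROWS)
    # EXISTS form of: DELETE n1 FROM t n1, t n2 WHERE n1.id <cmp> n2.id AND n1.email = n2.email
    conn.execute(f"""
        DELETE FROM t
        WHERE EXISTS (SELECT 1 FROM t AS n2
                      WHERE n2.email = t.email AND t.id {comparison} n2.id)
    """)
    return [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]

keep_lowest = remaining_ids(">")   # rows with a lower-id twin are deleted
keep_highest = remaining_ids("<")  # rows with a higher-id twin are deleted

print(keep_lowest)   # [1, 2, 3, 5, 6]
print(keep_highest)  # [3, 4, 5, 8, 9]
```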
Or this query may also help:
DELETE FROM `yourTableName`
WHERE id NOT IN (
SELECT * FROM (
SELECT MAX(id) FROM yourTableName
GROUP BY email
) AS keep_ids -- MySQL requires an alias on every derived table
)
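Personally, I had trouble with the two top-voted answers. It's not the cleanest solution, but you can use a temporary table to avoid all the problems MySQL has with deleting through a join on the same table.
-- Collect the ids to keep: the lowest id per email (use MAX(id) to keep the latest instead).
CREATE TEMPORARY TABLE deleteRows
SELECT MIN(id) as id FROM myTable GROUP BY myTable.email;
DELETE FROM myTable
WHERE id NOT IN (SELECT id FROM deleteRows);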
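The temporary-table idea can be sketched as follows, assuming Python's sqlite3 module as a stand-in for MySQL and the sample data from the first answer. The table name keep_ids is hypothetical, and MAX(id) is used so that the latest row survives, as the question asks. SQLite needs the AS keyword in CREATE TABLE ... AS SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO my_table (id, email) VALUES (?, ?)",
    [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
     (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")],
)

# Materialise the ids to keep first, so the DELETE never reads the table it modifies.
conn.execute("""
    CREATE TEMPORARY TABLE keep_ids AS
    SELECT MAX(id) AS id FROM my_table GROUP BY email
""")
conn.execute("DELETE FROM my_table WHERE id NOT IN (SELECT id FROM keep_ids)")
conn.execute("DROP TABLE keep_ids")

kept = [r[0] for r in conn.execute("SELECT id FROM my_table ORDER BY id")]
print(kept)  # [3, 4, 5, 8, 9]
```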
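I must say the optimized version is a sweet, elegant piece of code, and it works like a charm even when the comparison is done on a DATETIME column. This is what I used in my script, where I look up the latest contract end date for each EmployeeID: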
DELETE CurrentContractData
FROM CurrentContractData
INNER JOIN (
SELECT
-- Only the grouping column and the aggregate are selected, so the query also runs under ONLY_FULL_GROUP_BY.
EmployeeID,
max(PeriodofPerformanceEndDate) as lastDate
FROM CurrentContractData
GROUP BY EmployeeID
HAVING COUNT(*) > 1) Duplicate on Duplicate.EmployeeID = CurrentContractData.EmployeeID
WHERE CurrentContractData.PeriodofPerformanceEndDate < Duplicate.lastDate;
Thank you very much!
DELIMITER //
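CREATE FUNCTION findColumnNames(tableName VARCHAR(255))
RETURNS TEXT
READS SQL DATA
BEGIN
SET @colNames = "";
-- Restrict the lookup to the current schema so a same-named table in another schema is not picked up.
SELECT GROUP_CONCAT(COLUMN_NAME) FROM INFORMATION_SCHEMA.columns
WHERE TABLE_NAME = tableName AND TABLE_SCHEMA = DATABASE()
GROUP BY TABLE_NAME INTO @colNames;
RETURN @colNames;
END //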
DELIMITER ;
DELIMITER //
CREATE PROCEDURE deleteDuplicateRecords (IN tableName VARCHAR(255))
BEGIN
SET @colNames = findColumnNames(tableName);
SET @addIDStmt = CONCAT("ALTER TABLE ",tableName," ADD COLUMN id INT AUTO_INCREMENT KEY;");
SET @deleteDupsStmt = CONCAT("DELETE FROM ",tableName," WHERE id NOT IN
( SELECT * FROM ",
" (SELECT min(id) FROM ",tableName," group by ",findColumnNames(tableName),") AS tmpTable);");
set @dropIDStmt = CONCAT("ALTER TABLE ",tableName," DROP COLUMN id");
PREPARE addIDStmt FROM @addIDStmt;
EXECUTE addIDStmt;
PREPARE deleteDupsStmt FROM @deleteDupsStmt;
EXECUTE deleteDupsStmt;
PREPARE dropIDStmt FROM @dropIDStmt;
EXECUTE dropIDstmt;
END //
DELIMITER ;
I created a nice stored procedure that deletes all duplicate records from a table without requiring an existing unique id on that table.
CALL deleteDuplicateRecords("yourTableName");
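For comparison, here is a rough sketch of the same idea (deduplicating on every column of a table that has no id) assuming Python's sqlite3 module rather than MySQL. It swaps in a different technique: instead of temporarily adding an AUTO_INCREMENT column, it relies on SQLite's implicit rowid. The function name delete_duplicate_records and the contacts table are hypothetical, and the table name is interpolated into the SQL, so it must come from a trusted source.

```python
import sqlite3

def delete_duplicate_records(conn, table_name):
    """Keep the first-inserted copy of each fully identical row in table_name."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{table_name}")')]
    col_list = ", ".join(f'"{c}"' for c in cols)
    # Rows identical in every column form one group; keep the lowest rowid of each group.
    conn.execute(
        f'DELETE FROM "{table_name}" WHERE rowid NOT IN '
        f'(SELECT MIN(rowid) FROM "{table_name}" GROUP BY {col_list})'
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO contacts (name, email) VALUES (?, ?)",
    [("ann", "a@x"), ("ann", "a@x"), ("bob", "b@x"), ("ann", "c@x"), ("bob", "b@x")],
)

delete_duplicate_records(conn, "contacts")
deduped = conn.execute("SELECT name, email FROM contacts ORDER BY name, email").fetchall()
print(deduped)  # [('ann', 'a@x'), ('ann', 'c@x'), ('bob', 'b@x')]
```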
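I wanted to delete duplicate records based on multiple columns in a table, and this approach worked for me.
Step 1 - Get the max id (the row to keep) from each group of duplicate records
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) > 1) a
Step 2 - Get the ids of the records that occur only once
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) = 1) b
Step 3 - Delete every row whose id is not returned by either of the two queries above
DELETE FROM `table_name`
WHERE
id NOT IN (paste step 1 query) -- keeps the newest row of each duplicate group
and
id NOT IN (paste step 2 query) -- keeps the records that have no duplicates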
Final query:
DELETE FROM `table_name`
WHERE id NOT IN (
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) > 1) a
)
and id not in (
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) = 1) b
);
With this query, only the duplicate records are deleted.
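The multi-column approach can be tried on a small data set. The sketch below assumes Python's sqlite3 module in place of MySQL, and the six sample rows are made up for illustration. It also runs a single NOT IN over all groups, which appears to give the same result here because every group has either exactly one row or more than one.

```python
import sqlite3

COLS = "travel_intimation_id, approved_by, approval_type, approval_status"
# Hypothetical sample rows: ids 1-2 and 4-6 are duplicate groups, id 3 is unique.
ROWS = [
    (1, 10, "amy", "L1", "ok"),
    (2, 10, "amy", "L1", "ok"),
    (3, 10, "bob", "L1", "ok"),
    (4, 11, "amy", "L2", "pending"),
    (5, 11, "amy", "L2", "pending"),
    (6, 11, "amy", "L2", "pending"),
]

def survivors(delete_sql):
    """Load the sample rows into a fresh table, run delete_sql, return the remaining ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE table_name (id INTEGER PRIMARY KEY, {COLS})")
    conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?, ?)", ROWS)
    conn.execute(delete_sql)
    return [r[0] for r in conn.execute("SELECT id FROM table_name ORDER BY id")]

# The two-condition form from the answer.
two_step = survivors(f"""
    DELETE FROM table_name
    WHERE id NOT IN (SELECT MAX(id) FROM table_name GROUP BY {COLS} HAVING COUNT(*) > 1)
      AND id NOT IN (SELECT MAX(id) FROM table_name GROUP BY {COLS} HAVING COUNT(*) = 1)
""")

# A single NOT IN over every group.
single = survivors(f"""
    DELETE FROM table_name
    WHERE id NOT IN (SELECT MAX(id) FROM table_name GROUP BY {COLS})
""")

print(two_step)  # [2, 3, 6]
print(single)    # [2, 3, 6]
```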
Try the following solution (based on a comment on @Jose Rui Santos's answer):
-- Turn safe update mode off; otherwise MySQL rejects the DELETE with:
-- "You are using safe update mode and you tried to update a table without a WHERE that uses a KEY column"
SET SQL_SAFE_UPDATES = 0;
-- Delete the duplicate rows based on the field_with_duplicate_values
-- Keep the unique rows with the highest id
DELETE FROM table_to_deduplicate
WHERE id NOT IN (
SELECT * FROM (
-- Select the highest id grouped by the field_with_duplicate_values
SELECT MAX(id)
FROM table_to_deduplicate
GROUP BY field_with_duplicate_values
)
-- The extra subquery and its alias are needed; otherwise MySQL reports:
-- "You can't specify target table 'table_to_deduplicate' for update in FROM clause"
AS table_sub
);
-- Turn safe update mode back on
SET SQL_SAFE_UPDATES = 1;