
Write a SQL query to find all duplicate emails in a table named Person.
Problem
+----+---------+
| Id | Email |
+----+---------+
| 1 | [email protected] |
| 2 | [email protected] |
| 3 | [email protected] |
+----+---------+
Write a SQL query to find all duplicate emails in a table named Person.
For example, your query should return the following for the above table:
+---------+
| Email |
+---------+
| [email protected] |
+---------+
Solution
Solution 1: Using WHERE and a temporary table
SELECT Email FROM
(
SELECT Email, count(Email) AS cnt
FROM Person
GROUP BY Email
) AS Statisitcs
WHERE cnt > 1
Solution 2: Using HAVING
SELECT Email
FROM Person
GROUP BY EMAIL
HAVING count(Email) > 1
Related Knowledge
HAVING
Let’s look at the doc for HAVING:
HAVINGspecifies a search condition for a group or an aggregate. HAVING can be used only with theSELECTstatement.HAVINGis typically used with aGROUP BYclause. WhenGROUP BYis not used, there is an implicit single, aggregated group.
The syntax for HAVING is as followed:
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value;
Conclusion
Duplicated emails existed more than one time. To count the times each email exists, we can use GROUP BY to count the apperance of every email.
While we could use a temporary table to do the selecting job, a more common way to add a condition to groups or aggregates is to use the HAVING clause, which is much simpler and more efficient. So Solution 2 is better.
Reference
1. SELECT - HAVING (Transact-SQL)
2. SQL HAVING Clause(Chinese)
3. Duplicate Emails Solution




近期评论