[leetcode/db] 182 duplicate emails

Write a SQL query to find all duplicate emails in a table named Person.

Problem

+----+---------+
| Id | Email   |
+----+---------+
| 1  | [email protected] |
| 2  | [email protected] |
| 3  | [email protected] |
+----+---------+

Write a SQL query to find all duplicate emails in a table named Person.
For example, your query should return the following for the above table:

+---------+
| Email   |
+---------+
| [email protected] |
+---------+

Solution

Solution 1: Using WHERE and a temporary table

SELECT Email FROM
(
SELECT Email, count(Email) AS cnt
FROM Person
GROUP BY Email
) AS Statisitcs
WHERE cnt > 1

Solution 2: Using HAVING

SELECT Email
FROM Person
GROUP BY EMAIL
HAVING count(Email) > 1

HAVING

Let’s look at the doc for HAVING:

HAVING specifies a search condition for a group or an aggregate. HAVING can be used only with the SELECTstatement. HAVING is typically used with a GROUP BY clause. When GROUP BY is not used, there is an implicit single, aggregated group.

The syntax for HAVING is as followed:

SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value;

Conclusion

Duplicated emails existed more than one time. To count the times each email exists, we can use GROUP BY to count the apperance of every email.
While we could use a temporary table to do the selecting job, a more common way to add a condition to groups or aggregates is to use the HAVING clause, which is much simpler and more efficient. So Solution 2 is better.

Reference

1. SELECT - HAVING (Transact-SQL)
2. SQL HAVING Clause(Chinese)
3. Duplicate Emails Solution