top k frequent element

Leetcode 347. Top K Frequent Elements

Given a non-empty array of integers, return the k most frequent elements.

For example,
Given [1,1,1,2,2,3] and k = 2, return [1,2].

Note:

  • You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
  • Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.

  • Assume n element
  • m distinct element
  • f is largest frequency

Method 1. sort Entry:

Time O(n + mlogm + klogm)

Space O(m + k)

1
2
3
4
5
6
7
8
9
10
11
12
13
  public List<Integer> (int[] nums, int k) {
List<Integer> result = new ArrayList<>();
Map<Integer,Integer> map = new HashMap<>();
for(int num : nums){
map.put(num,map.getOrDefault(num,0)+1);
}
PriorityQueue<Map.Entry<Integer, Integer>> heap = new PriorityQueue<>((o1,o2)->o2.getValue() - o1.getValue());
for(Map.Entry<Integer,Integer> entry : map.entrySet())
heap.offer(entry);
while(k-- > 0)
result.add(heap.poll().getKey());
return result;
}

Method 2. bucket sort:

Time O(n + k) = O(n)

Space O(n + k)

Optimize: if find the max frequency f, Space complexity is O(f + k)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
public List<Integer> (int[] nums, int k) {
List<Integer> result = new ArrayList<>();
Map<Integer,Integer> map = new HashMap<>();
for(int num : nums){
map.put(num,map.getOrDefault(num,0)+1);
}
List<Integer>[] buckets = new List[nums.length+1];
for(Integer key : map.keySet()){
int count = map.get(key);
if(buckets[count] == null)
buckets[count] = new ArrayList<>();
buckets[count].add(key);
}
for(int i=buckets.length-1; i>=0 && k>0; --i){
if(buckets[i] != null)
for(int val : buckets[i]){
result.add(val);
k--;
}
}
return result;
}

Method 3. build heap in O(m)

Time O(n + m + klogk)

Space O(m + k)

Build map -> build heap from entry collection -> pop top k

Method 4. min heap size k

Time O(n + k + (m-k)logk + klogk)

Space O(m + k)

Build map -> build heap size k in O(k) -> iterate over the rest -> pop k result

Method 3. tree map:

Time O(n + mlogm + klogm )

Space O(m + k)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
  public List<Integer> (int[] nums, int k) {
List<Integer> result = new ArrayList<>();
Map<Integer,Integer> map = new HashMap<>();
for(int num : nums){
map.put(num,map.getOrDefault(num,0)+1);
}
TreeMap<Integer,List<Integer>> treeMap = new TreeMap<>();
for(int key : map.keySet()){
int freq = map.get(key);
List<Integer> list = treeMap.get(freq);
if(list == null){
list = new ArrayList<>();
treeMap.put(freq,list);
}
list.add(key);
}
while(result.size() < k){
Map.Entry<Integer,List<Integer>> entry = treeMap.pollLastEntry();
result.addAll(entry.getValue());
}
return result;
}

Leetcode 692. Top K Frequent Words

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:

1
2
3
4
Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words.
Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:

1
2
3
4
Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
with the number of occurrence being 4, 3, 2 and 1 respectively.

  • Assume n element
  • m distinct element

Method 1. sort entry using heap:

Time O(n + mlogm + klogm )

Space O(m)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
 public List<String> (String[] words, int k) {
List<String> result = new ArrayList<>();
Map<String,Integer> map = new HashMap<>();
for(String word : words)
map.put(word,map.getOrDefault(word,0)+1);
PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(new Comparator<Map.Entry<String, Integer>>(){

public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
if(o1.getValue() != o2.getValue())
return o2.getValue() - o1.getValue();
return o1.getKey().compareTo(o2.getKey());
}
});
for(Map.Entry<String,Integer> entry : map.entrySet())
heap.offer(entry);
while(k-- > 0)
result.add(heap.poll().getKey());
return result;
}

Method 2. better heap (use min heap instead of max heap):

Time O(n + mlogk + klogk)

Space O(m + k)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
 public List<String> (String[] words, int k) {
List<String> result = new ArrayList<>();
Map<String,Integer> map = new HashMap<>();
for(String word : words)
map.put(word,map.getOrDefault(word,0)+1);
PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(new Comparator<Map.Entry<String, Integer>>(){

public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
if(o1.getValue() != o2.getValue())
return o1.getValue() - o2.getValue();
return o2.getKey().compareTo(o1.getKey());
}
});
for(Map.Entry<String,Integer> entry : map.entrySet()){
heap.offer(entry);
if(heap.size() > k)
heap.poll();
}
while(!heap.isEmpty())
result.add(heap.poll().getKey());
Collections.reverse(result);
return result;
}

Method 3. bucket sort:

Time O(n + m + nlogn) worst

Space O(n + m)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
public List<String> topKFrequent(String[] words, int k) {
List<String> result = new ArrayList<>();
Map<String,Integer> map = new HashMap<>();
for(String word : words)
map.put(word,map.getOrDefault(word,0)+1);
List<String>[] buckets = new List[words.length+1];
for(String key : map.keySet()){
int freq = map.get(key);
if(buckets[freq] == null)
buckets[freq] = new ArrayList<>();
buckets[freq].add(key);
}
for(int i=buckets.length-1; i>=0 && result.size()<k; --i){
List<String> list = buckets[i];
if(list != null){
Collections.sort(list);
for(int j=0; j<list.size() && result.size()<k; ++j)
result.add(list.get(j));
}
}
return result;
}