Algorithm Notes: LeetCode #811 Subdomain Visit Count

Problem


A website domain like “discuss.leetcode.com” consists of various subdomains. At the top level, we have “com”, at the next level, we have “leetcode.com”, and at the lowest level, “discuss.leetcode.com”. When we visit a domain like “discuss.leetcode.com”, we will also visit the parent domains “leetcode.com” and “com” implicitly.

Now, define a “count-paired domain” as a count (representing the number of visits this domain received), followed by a space, followed by the address. An example of a count-paired domain might be “9001 discuss.leetcode.com”.

We are given a list cpdomains of count-paired domains. We would like a list of count-paired domains (in the same format as the input, and in any order) that explicitly counts the number of visits to each subdomain.

Example 1:
Input:
[“9001 discuss.leetcode.com”]
Output:
[“9001 discuss.leetcode.com”, “9001 leetcode.com”, “9001 com”]
Explanation:
We only have one website domain: “discuss.leetcode.com”. As discussed above, the subdomains “leetcode.com” and “com” will also be visited. So they will all be visited 9001 times.

Example 2:
Input:
[“900 google.mail.com”, “50 yahoo.com”, “1 intel.mail.com”, “5 wiki.org”]
Output:
[“901 mail.com”,“50 yahoo.com”,“900 google.mail.com”,“5 wiki.org”,“5 org”,“1 intel.mail.com”,“951 com”]
Explanation:
We will visit “google.mail.com” 900 times, “yahoo.com” 50 times, “intel.mail.com” once and “wiki.org” 5 times. For the subdomains, we will visit “mail.com” 900 + 1 = 901 times, “com” 900 + 50 + 1 = 951 times, and “org” 5 times.

Notes:

  • The length of cpdomains will not exceed 100.
  • The length of each domain name will not exceed 100.
  • Each address will have either 1 or 2 “.” characters.
  • The input count in any count-paired domain will not exceed 10000.
  • The answer output can be returned in any order.

Solution


Basic idea

Use a hash table to map each subdomain to its visit count. Traverse each string in the given list and split it into the visit count and the domain. Then enumerate all subdomains of that domain (the domain itself plus each of its parent domains) and add the visit count to the corresponding entries in the hash table.
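As a minimal sketch of the "get all subdomains" step, the helper below splits a domain on "." and joins every suffix of the resulting labels. The name `suffixes` is made up for illustration and is not part of the original solution.

```python
def suffixes(domain):
    """Return the domain followed by each of its parent domains."""
    labels = domain.split(".")
    # Joining labels[i:] for each i yields one subdomain per level.
    return [".".join(labels[i:]) for i in range(len(labels))]


print(suffixes("discuss.leetcode.com"))
# ['discuss.leetcode.com', 'leetcode.com', 'com']
```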

Python implementation

class Solution:
    def subdomainVisits(self, cpdomains):
        """
        :type cpdomains: List[str]
        :rtype: List[str]
        """
        lookup = {}  # maps each subdomain to its total visit count
        for s in cpdomains:
            num, domain = s.split(' ')
            sub = domain.split('.')
            # every suffix of the dot-separated labels is a visited subdomain
            for i in range(len(sub)):
                subdomain = '.'.join(sub[i:])
                lookup[subdomain] = lookup.get(subdomain, 0) + int(num)
        return [str(n) + " " + d for d, n in lookup.items()]

Java implementation

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Solution {
    public List<String> subdomainVisits(String[] cpdomains) {
        // maps each subdomain to its total visit count
        Map<String, Integer> lookup = new HashMap<>();
        for (String cpdomain : cpdomains) {
            String[] temp = cpdomain.split(" ");
            // split() takes a regex, so the dot must be escaped
            String[] sub = temp[1].split("\\.");
            for (int i = 0; i < sub.length; i++) {
                String subdomain = String.join(".", Arrays.copyOfRange(sub, i, sub.length));
                int newNum = lookup.getOrDefault(subdomain, 0) + Integer.parseInt(temp[0]);
                lookup.put(subdomain, newNum);
            }
        }
        List<String> ans = new ArrayList<>();
        for (String domain : lookup.keySet()) {
            ans.add(Integer.toString(lookup.get(domain)) + " " + domain);
        }
        return ans;
    }
}

Time complexity analysis

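O(N), where N is the length of the given list. We traverse the list once, and each address has at most 2 ‘.’ characters, so it yields at most 3 subdomains. Since each domain name is also at most 100 characters long, the work per entry is bounded by a constant.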
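To make the bounded per-entry work concrete, here is a hedged sketch of the same single pass that slices the address at each "." instead of splitting and re-joining. The function name `subdomain_visits` and the use of `collections.Counter` are illustrative choices, not the code from the implementations above.

```python
from collections import Counter


def subdomain_visits(cpdomains):
    """Tally visits for every subdomain in one pass over the input."""
    counts = Counter()
    for entry in cpdomains:
        num, domain = entry.split(" ")
        visits = int(num)
        counts[domain] += visits
        # Each "." starts a parent domain; an address has at most 2 of them.
        for i, ch in enumerate(domain):
            if ch == ".":
                counts[domain[i + 1:]] += visits
    return ["{} {}".format(n, d) for d, n in counts.items()]


example = ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
print(sorted(subdomain_visits(example)))
```

The inner loop touches each character of the address once, so the cost per entry depends only on the address length, not on N.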

Space complexity analysis

O(N), where N is the length of the given list. The hash table stores one entry per distinct subdomain. Each address contributes at most 3 subdomains, so the table holds at most 3N entries.
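The bound can be checked on Example 2 with a small sketch that collects only the keys the hash table would hold. The helper name `distinct_subdomains` is hypothetical and exists only for this illustration.

```python
def distinct_subdomains(cpdomains):
    """Collect the keys the hash table would hold for this input."""
    keys = set()
    for entry in cpdomains:
        domain = entry.split(" ")[1]
        labels = domain.split(".")
        for i in range(len(labels)):
            keys.add(".".join(labels[i:]))
    return keys


example = ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
keys = distinct_subdomains(example)
# Shared parents such as "com" are stored once, so the table stays within 3 * N.
print(len(keys), "<=", 3 * len(example))
```

For Example 2 this gives 7 keys against a bound of 12, because "com" and "mail.com" are shared by several addresses.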

