
KMP is hard to approach during interview
easier approach for O(m+n) time
assume we want to find if string t = “cde” is substring of string s = “abcde”
if we can reduce time to get each substirng oif length n of source string s to O(1)
the remaining time complexity is O(n) * O(1) = O(n)
hash function:
31 here is magic number
use ASCII code of char * 31^(length of s - index of char)
since the value is very large(long string)
mod it (like 10^6)
each time get a new substring
current_value -= int(first char) 31^(length of s - index of char)
if current_value < 0:
current_value += 10^6
current_value += int(new char) 31^(length of s - index of char)
def (self, source, target):
# error case
if source is None or target is None:
return -1
m = len(source)
n = len(target)
if n == 0:
return 0
# set base value and a mod value
# mod value might be prime number to reduce conflict of hash
base = 33
mod = 10000007
highest_mult = 1
# power for current first char code
for i in range(0, n-1):
highest_mult = highest_mult * base % mod
# calculate hash value for target string for matching
hash_target = 0
for i in range(0, n):
hash_target = (hash_target * base + ord(target[i]) - ord('a')) % mod
if hash_target < 0:
hash_target += mod
# calculate hash value for substring in source
value = 0
for i in range(0, m):
# when current substring length >= target length, remove first
if i >= n:
value = (value - highest_mult * (ord(source[i - n]) - ord('a'))) % mod
# add the new char code
value = (value * base + ord(source[i]) - ord('a')) % mod
if value < 0:
value += mod
# compare by char code, if match, means two substring is same
if i >= n - 1 and value == hash_target:
return i - n + 1
return -1




近期评论