These are three problem about Word Break. Dynamic Programming or other fancy tricks can applied to solve and optimize these problems.
Description
Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
Note:
The same word in the dictionary may be reused multiple times in the segmentation.
You may assume the dictionary does not contain duplicate words.
Example
1 2 3
Input: s = "leetcode", wordDict = ["leet", "code"] Output: true Explanation: Return true because "leetcode" can be segmented as "leet code".
Solution
Actually, DFS or BFS can be applied to solve the problem. So I wrote a DFS version recursively.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
class : defwordBreak(self, s, wordDict): """ :type s: str :type wordDict: List[str] :rtype: bool """ if len(s)==0: returnTrue for item in wordDict: if s.startswith(item): if self.wordBreak(s[len(item):], wordDict): returnTrue returnFalse
However, there is a obvious loophole. That is you can prune your search tree by memorizing the specific position which has been visited.
So, I got the following solution:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
class : defwordBreak(self, s, wordDict): start = 0 stack = [start] visited = set() while stack: start = stack.pop() if start in visited: continue visited.add(start) for word in wordDict: if s[start:].startswith(word): x = len(word) if x == len(s[start:]): returnTrue stack.append(start + x) returnFalse
Furthermore, how to return all the possible sentences?
Just like:
1 2 3 4 5 6 7 8
Input: s = "catsanddog" wordDict = ["cat", "cats", "and", "sand", "dog"] Output: [ "cats and dog", "cat sand dog" ]
It is just a extension about the former problem. However, It is not that easy to implement.
class : defwordBreak(self, s, wordDict): """ :type s: str :type wordDict: List[str] :rtype: List[str] """ # maxlen is the max word length used to minimize the window to scan wordd, maxlen = {}, 0 for i, word in enumerate(wordDict): wordd[word] = i maxlen = max(maxlen, len(word)) n = len(s) # dp[i] stores the indices of the last words that s[:i] ends with dp = {i: [] for i in range(n+1)}
# scan through s to get dp[i] for i in range(n+1): for j in range(max(i-maxlen, 0), i): if s[j:i] in wordd: dp[i].append(wordd[s[j:i]]) memo = {} # Recursively built the result from the tail of s, we can save a lot of space&time comparing to bottom-up # by not needing to calculate the temp results that are not in the paths of the final result defbuildResult(i): if i == 0: return [''] else: if i in memo: return memo[i] ret = [] for idx in dp[i]: ret += [tmp + ' '*int(bool(tmp)) + wordDict[idx] for tmp in buildResult(i - len(wordDict[idx]))] memo[i] = ret return ret
return buildResult(n)
Okey. What’s next?
Think about that Given a list of words (without duplicates), please write a program that returns all concatenated words in the given list of words. A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.
Explanation: "catsdogcats" can be concatenated by "cats", "dog" and "cats"; "dogcatsdog" can be concatenated by "dog", "cats" and "dog"; "ratcatdogcat" can be concatenated by "rat", "cat", "dog" and "cat".
If you just use the aforementioned algorithms, it will be very time-consuming. Because the word dict could be very large and Only a little fraction of them can bu concatenated.
I saw some posts, and some brillion ideas are very provoking.
class : deffindAllConcatenatedWordsInADict(self, words): """ :type words: List[str] :rtype: List[str] """ defbrk(arr, st, wordDict, m): if st in m: return m[st] if st == len(arr): m[st] = True returnTrue for en in range(st+1, len(arr) + 1): if arr[st:en] in wordDict and brk(arr, en, wordDict, m): m[en] = True returnTrue m[st] = False returnFalse wordDict = set(words) res = [] for word in words: wordDict.remove(word) if brk(word, 0, wordDict, {}) and len(word): res.append(word) wordDict.add(word) return res
近期评论