Skip to content

Latest commit

 

History

History
246 lines (210 loc) · 6.67 KB

File metadata and controls

246 lines (210 loc) · 6.67 KB

Key Insights

To break down the word, we try to slice each position, then we will have two substrings s[0:i] and s[i:] for each slice:

// All possible slices:
s =  0, 1, 2, 3, 4
   | 0, 1, 2, 3, 4
     0| 1, 2, 3, 4
     0, 1| 2, 3, 4
     0, 1, 2| 3, 4
     0, 1, 2, 3| 4
     0, 1, 2, 3, 4| 

// We can iterate for each slice:
     0, 1, 2, 3, 4
s = [0.....i.....n)
  i, ............n  // ""     + s[i:n), base case
     i, .........n  // s[0:0] + s[0:]
     0, i, ......n  // s[0:1] + s[1:]
     0, .. i, .. n  // s[0:2] + s[2:]
     0, ..... i, n  // ...
     0           i  // s[0:] + "", original string

For example, for word "car", we can slice:

// All possible slices:
s = "c, a, r"
  |, c, a, r  // base case
     c| a, r
     c, a| r
     c, a, r| // original string

"car" = 
    "" + "car"
    "c" + "ar"
    "ca" + "r"
    "car" + "" // original string

Then we can break down the problem into smaller subproblems:

// For word "cars" and dictionary ["car", "ca", "rs"]
f(car) =
    f() + "car" in dict ||
    f(c) + "ar" in dict ||
    f(ca) + "r" in dict

// Subproblems:
f(ca) =
    f(c) + "a" in dict ||
    f() + "ca" in dict

f(c) =
    f() + "c" in dict

f() = true // Base case, empty string is always true.

Top-Down

  • State: canBreak(i) = can break prefix s[0:i)?
  • 針對原字串,嘗試所有分割,分割成 f(0..i) AND s[i:]
  • f(0..i) 就代表更小的字串,就是把原問題分割成更小的子問題,每個子問題都做一樣的事情(嘗試所有分割)

Bottom-Up

  • State: dp[i] = can break prefix s[0:i)? 代表長度為 i 字串的結果
  • 原題目求的是 dp[n],我們需要先求 dp[n - 1],而 dp[n - 1]又由 dp[n - 2] 而得,以此類推,代表我們要先從 dp[0] 開始往上算。
  • 所以外層需要先遍歷可能的 i,代表字串長度從小到大的結果。
  • 針對每個 i 我們就需要遍歷每個 j 代表在當前長度為 i 的字串每個可能不同的切割位置。

Tips:

  • Stick to end exclusive everywhere.

Top-Down DP

The top-down definition is f(s) = can the prefix s[0:i) be segmented? + s[i:] exists in dictionary?

If we can build s[0:i) from dictionary recursively and s[i:] exists in dictionary, then we can build s[0:i) + s[i:] from dictionary.

// Original problem:
f('leetcode') =
    (f('') && 'leetcode' in dict) || // substring(0 until 0) + substring(0)
    (f('l') && 'eetcode' in dict) ||
    (f('le') && 'etcode' in dict) ||
    (f('lee') && 'tcode' in dict) ||
    (f('leet') && 'code' in dict) ||
    (f('leetc') && 'ode' in dict) ||
    (f('leetco') && 'de' in dict) ||
    (f('leetcod') && 'e' in dict) ||

// Subproblems:
f('leetcod') = ...
f('leetco') = ...
...

// Base case:
f('') = true
fun wordBreak(s: String, wordDict: List<String>): Boolean {
    val dict = wordDict.toSet()
    val memo = HashMap<String, Boolean>()
    return canBreak(s, dict, memo)
}

private fun topDown(s: String, dict: Set<String>, memo: HashMap<String, Boolean>): Boolean {
    // Base case: empty string can always be segmented
    if (s.isEmpty()) return true 
    if (s in memo) return memo[s]!!
    
    val n = s.length
    // Try every possible split
    for (i in 0 until n) {
        val prefix = s.substring(0, i)  // 0 until i
        val suffix = s.substring(i)     // i until n
        if (topDown(prefix, dict, memo) && suffix in dict) {
            memo[s] = true
            return true
        }
    }
    memo[s] = false
    return false
}
  • Time Complexity: O(n^2) where n is the length of string.
    • There are O(n) substrings we have to try.
    • For each substring, we try O(n) splits.
  • Space Complexity: O(n) for memoization for each substring.
    • The memoization map stores O(n) entries (for each prefix).
    • Each recursive call uses O(n) stack space in the worst case (for the recursion depth).

Bottom-Up DP

In bottom-up approach, We build the solution from smaller substrings to the entire string.

Let's define dp[i] to be the state if we can word break the substring s[0:i), then the state transition will be:

// s = [0.....j.....i)
//      |-----|-----|
//     [0 ... j] 
//           [j ... i)
//     s[0:j] + s[j:i)

// The state of s[0:i] = s[0:j] + s[j:i) for all possible j
dp[i] = 
    dp[j]       // The state of s[0:j], 
    && s[j:i)   // The remaining substring, check if it in the dictionary

// We iterate j from 0 to i to build the dp[i]
dp[i] = 
    dp[0] && s[0:i) in dict ||
    dp[1] && s[1:i) in dict ||
    dp[2] && s[2:i) in dict ||
    ...
    dp[i - 1] && s[i - 1:i) in dict

To build dp[n], we must build dp[n - 1] first, then dp[n - 2], and so on until dp[0], so:

  • We iterate i from 0 .. len to build the dp[i] table from bottom up.
  • For each i, we have to slice each position, so we iterate j from 0 until i to check if dp[j] == true (can break) and s[j:i) in dictionary.

For example, "cars" and ["car", "ca", "rs"]:

s = 0, 1, 2, 3
       c  a  r
    i
// i = 0, empty string
dp[0] = true // Base case, empty string is in dictionary

// i = 1, dp[c]
s = 0, 1, 2, 3, 4
       c  a  r
       i
    j
dp[1] = dp[0] && s[0:1) in dict

// i = 2, dp[ca]
s = 0, 1, 2, 3, 4
       c  a  r
          i
    |--j
dp[2] = dp[0] && s[0:2) in dict ||
        dp[1] && s[1:2) in dict

s = 0, 1, 2, 3, 4
       c  a  r
             i
    |-----j
dp[3] = dp[0] && s[0:3) in dict ||
        dp[1] && s[1:3) in dict ||
        dp[2] && s[2:3) in dict

// i = n, dp[car]
s = 0, 1, 2, 3, 4
       c  a  r
                i
    |--------j
dp[n] = ... // The final result
fun wordBreak(s: String, wordDict: List<String>): Boolean {
    val dict = wordDict.toSet()
    // We define dp size as s.length + 1 because the base case is empty string, that is s[0:0] = "".
    val dp = BooleanArray(s.length + 1)
    // Base case is empty string, which is true.
    dp[0] = true
    // Try every possible split, from smaller length to original length.
    for (i in 1..s.length) {
        // dp[i] = dp[0:j] + s[j:i] in dict
        // dp[i] = prefix of subproblem + suffix in dict
        for (j in 0..i) {
            if (dp[j] && dict.contains(s.substring(j, i))) {
                dp[i] = true
            }
        }
    }
    return dp[s.length]
}
  • Time Complexity: O(n^2) where n is the length of string.
  • Space Complexity: O(n) for dp table.

Please note that we can't iterate to build substring start from i, because the dp[i] represent the state of s[0:i).

c a r s
-------
c
c a
c a r
c a r s
  a
  a r
  a r s
    r
    r s
      s