📖 8 min read
Last updated on

Word Break — Coding Interview Walkthrough


Hero Image for Word Break

You’re in a coding interview. The interviewer gives you a string and a dictionary and asks if the string can be segmented into dictionary words. You write the recursive solution in three minutes. It handles the examples perfectly. Then the interviewer asks: “What happens with s = 'aaaaaab'?” Your solution hangs. You’ve just demonstrated that you can write correct code that’s exponentially slow, and now you need to explain why and fix it on the spot.

This walkthrough covers how to avoid that trap: why the naive recursion explodes, how to recognize the overlapping subproblem structure, and how to convert it to a clean O(n^2) DP solution that interviewers want to see.


The Problem

Given a string s and a list of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words. The same word may be reused multiple times.

(LeetCode #139)

Example 1

Input s = "leetcode", wordDict = ["leet", "code"]

Output true

Example 2

Input s = "applepenapple", wordDict = ["apple", "pen"]

Output true

Example 3

Input s = "catsandog", wordDict = ["cats", "dog", "sand", "an", "cat"]

Output false

In Example 1, "leetcode" splits into "leet" + "code". In Example 2, "applepenapple" splits into "apple" + "pen" + "apple", reusing the same word. Example 3 is the instructive one: many partial matches lead deep into the string before hitting dead ends.

Already comfortable with the solution? Practice it in a mock interview →


What Interviewers Are Actually Testing

Word Break tests whether you can recognize a recursive problem with overlapping subproblems and convert it into an efficient solution.

SkillWhat Interviewers Observe
Subproblem identificationCan you define dp[i] clearly and explain what it represents?
Recurrence reasoningDo you correctly express how dp[i] depends on earlier states?
Memoization vs tabulationCan you explain both approaches and their tradeoffs?
Optimization awarenessDo you convert wordDict to a set for O(1) lookups?
Backtracking recognitionDo you recognize the naive recursive approach and explain why it’s too slow?

Step 1: Clarifying Questions

1. Can the same word be used multiple times? Yes, the problem explicitly says so. Confirming this prevents you from writing a solution that marks words as “consumed.”

2. Can wordDict contain duplicates? Potentially, but converting to a set handles this while giving O(1) lookup.

3. Is s guaranteed to be non-empty? Yes (at least length 1 on LeetCode). If empty, it would be vacuously segmentable.

4. Are the words in wordDict guaranteed to be non-empty? Yes. Empty strings would trivially match everywhere.

5. What are the size constraints? s up to 300 characters, wordDict up to 1,000 words each up to 20 characters. O(n^2) is acceptable.


Step 2: A First (Non-Optimal) Idea

The natural first approach is pure recursion: try every prefix, and if it matches a dictionary word, recursively check whether the rest is breakable.

def word_break_naive(s: str, wordDict: list[str]) -> bool:
    word_set = set(wordDict)

    def can_break(start: int) -> bool:
        if start == len(s):
            return True
        for end in range(start + 1, len(s) + 1):
            if s[start:end] in word_set and can_break(end):
                return True
        return False

    return can_break(0)

This is correct but runs in O(2^n) time in the worst case. Consider s = "aaaaab" with wordDict = ["a", "aa", "aaa"]. Every prefix matches, so the recursion branches at every position, and since "b" never matches, every branch fails. The same subproblem can_break(i) gets recomputed exponentially many times.


Step 3: The Key Insight

Whether the suffix starting at index i is breakable depends only on i, not on the path taken to get there. That makes can_break(i) a pure function of its input: a perfect candidate for memoization. Since there are only n + 1 possible values of i, there are at most n + 1 unique subproblems, each computed once.

Reframing as bottom-up DP: define dp[i] as True if the substring s[0:i] can be segmented using the dictionary.

  • dp[0] = True: the empty prefix is always breakable (base case).
  • For each i from 1 to n: dp[i] is True if there exists some j < i such that dp[j] is True and s[j:i] is in the dictionary.

In plain English: position i is reachable if there’s some earlier reachable position j from which a single dictionary word gets you to i.

DP boolean array for 'leetcode' with dict=['leet','code']: T at indices 0, 4, 8 (highlighted). dp[8]=True means the string can be segmented. ---

Step 4: Optimal Strategy

  1. Convert wordDict to a set for O(1) lookup.
  2. Initialize a boolean DP array dp of length n + 1, all False. Set dp[0] = True.
  3. For each index i from 1 to n:
    • For each index j from 0 to i - 1:
      • If dp[j] is True and s[j:i] is in the word set, set dp[i] = True and break.
  4. Return dp[n].

Python Implementation

def word_break(s: str, wordDict: list[str]) -> bool:
    word_set = set(wordDict)  # O(1) lookup instead of O(m) list scan
    n = len(s)

    # dp[i] = True means s[0:i] can be segmented using wordDict
    dp = [False] * (n + 1)
    dp[0] = True  # Empty prefix is always valid

    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break  # No need to check other j values for this i

    return dp[n]

Let’s trace s = "leetcode", wordDict = ["leet", "code"]:

isubstring checkedkey checkdp[i]
0""base caseTrue
1-3"l", "le", "lee"no matchesFalse
4"leet"dp[0]=True, "leet" in setTrue
5-7"leetc", "leetco", "leetcod"no word completes from dp[4]False
8"leetcode"dp[4]=True, "code" in setTrue

dp[8] = True. Return True.

Memoized Recursive Version

If you prefer top-down thinking:

def word_break_memo(s: str, wordDict: list[str]) -> bool:
    word_set = set(wordDict)
    memo = {}

    def can_break(start: int) -> bool:
        if start == len(s):
            return True
        if start in memo:
            return memo[start]

        for end in range(start + 1, len(s) + 1):
            if s[start:end] in word_set and can_break(end):
                memo[start] = True
                return True

        memo[start] = False
        return False

    return can_break(0)

Both versions are O(n^2 x m) time and O(n) space.


Time & Space Complexity

AspectComplexityExplanation
TimeO(n^2 x m)n^2 substrings checked, each hash comparison is O(m) for word length m
Space (DP array)O(n)One boolean per position in s
Space (word set)O(W x m)W words of average length m stored in the set

In practice, you can tighten the inner loop: for each position i, only check substrings of length up to max_word_len. This reduces the constant factor meaningfully.


Common Interview Mistakes

  1. Using a list for wordDict lookups instead of a set. s[j:i] in wordDict where wordDict is a list costs O(W) per lookup. Converting to a set drops this to O(1) amortized.

  2. Initializing dp[0] = False. dp[0] represents the empty prefix, which can always be segmented. Setting it to False means no position is ever reachable, and the entire table stays False.

  3. Off-by-one in the DP indices. dp has length n + 1. dp[i] represents s[0:i]. The answer is dp[n], not dp[n-1].

  4. Forgetting the break after setting dp[i] = True. Not breaking doesn’t produce wrong answers, but it wastes work and signals you haven’t thought about the inner loop’s exit condition.

  5. Proposing pure recursion without flagging its time complexity. The recursive approach is a natural starting point, but you should immediately explain why it’s O(2^n) and pivot to DP. Presenting naive recursion as your final answer is a red flag.

  6. Not validating the recurrence on a small example. The index arithmetic (dp[j], s[j:i], dp[n]) is easy to mess up. Tracing through "leet" + "code" takes 90 seconds and catches every off-by-one.


What a Strong Candidate Sounds Like

“My first instinct is a recursive solution: try every prefix, and if it’s in the dictionary, recurse on the suffix. That’s correct but O(2^n) because we recompute the same suffix breakability over and over. The fix is to recognize that whether position i is reachable depends only on i, not on the path we took. So I’ll use a DP array where dp[i] means ‘s[0:i] is segmentable’. The recurrence is: dp[i] is True if there’s any j < i where dp[j] is True and s[j:i] is a dictionary word. Base case is dp[0] = True. I’ll convert the dictionary to a set first for O(1) lookups. Let me trace through the first example to verify the indices before I code.”


Example Interview Dialogue

Interviewer: Why is the naive recursive solution exponential?

Candidate: Because it recomputes the same subproblems repeatedly. Take s = "aaab" with dictionary ["a", "aa"]. When we try "a" as the first word, we recurse on "aab". When we try "aa", we recurse on "ab". Later, through different paths, we’ll recurse on "aab" and "ab" again, multiple times each. The number of redundant calls grows exponentially. Memoizing collapses this to O(n^2).

Interviewer: Can you optimize it further using the maximum word length?

Candidate: Yes. Instead of checking all j from 0 to i-1, I only need to check j values where i - j is at most the length of the longest word. If no word is longer than 10 characters, then for each i I only check the last 10 positions. This doesn’t improve the asymptotic worst case, but it’s a meaningful constant-factor speedup.

Interviewer: How would you solve Word Break II?

Candidate: Word Break II asks for all valid segmentations, not just whether one exists. I’d run the DP first to confirm breakability and identify reachable positions, then use backtracking from position n backwards, following only reachable transitions. The time complexity is O(n^2 + output size), where output size can be exponential, but the DP pass ensures we don’t waste time on dead-end paths.


Word Break sits at the intersection of string problems and dynamic programming. These problems all share the “look back at previous states” recurrence:

  • Word Break II (LeetCode #140). Returns all valid sentences. Extends to backtracking over DP results.
  • Coin Change (LeetCode #322). Structurally identical DP: can you reach a target using unlimited reuse of values?
  • Longest Increasing Subsequence (LeetCode #300). Classic 1D DP with a similar “look back” recurrence.
  • Concatenated Words (LeetCode #472). Applies Word Break logic to check if each word can be built from other words.
  • Palindrome Partitioning (LeetCode #131). Another string segmentation problem combining DP with backtracking.

Practice This in a Mock Interview

Reading this explanation gives you the recurrence, the base case, and the index arithmetic. But DP problems have a particular failure mode under pressure: the logic seems clear until you’re staring at a blank whiteboard, and suddenly you can’t remember whether the answer is dp[n] or dp[n-1], or whether the inner loop should go from 0 to i or 0 to i-1. These aren’t conceptual gaps: they’re execution gaps that only close through repetition.

The best way to build confidence with DP is to practice deriving the recurrence out loud, tracing through a small example by hand, and then coding it, in that order, every time.

Start a mock interview for Word Break on Intervu.

Further reading:

Practice Like It's the Real Interview

Get instant feedback on your approach, communication, and code — powered by AI.

Start a Mock Interview →