📖 9 min read
Last updated on

Time Based Key-Value Store — Coding Interview Walkthrough


You’re in a coding interview. The interviewer says: “Design a key-value store that supports timestamps. Values are always set in increasing timestamp order. For a get, return the value with the largest timestamp less than or equal to the query timestamp.” You’ve seen Time Based Key-Value Store before. You know it’s binary search under the hood. But now you need to design the data structure on a whiteboard, explain why binary search applies here, handle the floor-query indexing correctly, and talk through every edge case while someone watches.

This walkthrough covers the full interview arc: from data structure choice, through clarifying questions, to the binary search floor pattern that makes retrieval O(log n). We’ll break down exactly how a strong candidate thinks through this problem and what mistakes to avoid.


The Problem

Implement a TimeMap class with two operations:

  • set(key: str, value: str, timestamp: int) stores the key-value pair at the given timestamp.
  • get(key: str, timestamp: int) -> str returns the value with the largest timestamp ≤ the given timestamp, or "" if none exists.

The problem guarantees that set is always called with strictly increasing timestamp values for a given key. (LeetCode #981)

Example

timeMap.set("foo", "bar", 1)
timeMap.get("foo", 1)   # returns "bar"
timeMap.get("foo", 3)   # returns "bar" (largest ts ≤ 3 is 1)
timeMap.set("foo", "bar2", 4)
timeMap.get("foo", 4)   # returns "bar2"
timeMap.get("foo", 5)   # returns "bar2"

The key observation: get("foo", 3) doesn’t find an exact match at timestamp 3, so it returns the value from the most recent timestamp before 3, which is timestamp 1. This “floor query” behavior is the core challenge.

Already comfortable with the solution? Practice it in a mock interview →


What Interviewers Are Actually Testing

Time Based Key-Value Store combines data structure design with binary search. Interviewers use it to see whether you can select the right data structures, recognize that sorted input enables binary search, and handle the floor-query indexing without off-by-one errors.

SkillWhat Interviewers Observe
Data structure designDo you choose a dict of sorted lists? Can you justify it?
Binary search applicationDo you recognize that sorted timestamps enable O(log n) retrieval?
Floor query mechanicsDo you correctly return the value at bisect_right - 1?
Edge case handlingKey not found? Timestamp smaller than all stored values?
API design clarityDo you structure the class cleanly with clear method signatures?

Step 1: Clarifying Questions

Before writing any code, ask questions that demonstrate you understand the problem’s structure:

  1. Are timestamps for a given key always strictly increasing? Yes. This is the critical constraint: it means each key’s value list is already sorted by timestamp without any extra work. Confirming this tells the interviewer you’ve identified the property that makes binary search possible.

  2. Can different keys share the same timestamp? Yes. Each key has its own independent list of timestamps. This confirms you need a per-key data structure, not a single global sorted list.

  3. What should get return if the key doesn’t exist? Return "". This is a simple edge case, but mentioning it shows you’re thinking about boundary conditions before coding.

  4. What should get return if all stored timestamps are greater than the query timestamp? Return "". This is the floor-query edge case: bisect_right returns 0, meaning no valid timestamp exists at or before the query.

  5. Can set be called with the same timestamp twice for the same key? No, timestamps are strictly increasing. This simplifies the problem since we don’t need to handle overwrites.


Step 2: A First (Non-Optimal) Idea

The simplest approach: store each key’s values in a list and, for every get, scan the list linearly to find the largest timestamp ≤ the query.

class TimeMap:
    def __init__(self):
        self.store = {}

    def set(self, key, value, timestamp):
        if key not in self.store:
            self.store[key] = []
        self.store[key].append((timestamp, value))

    def get(self, key, timestamp):
        if key not in self.store:
            return ""
        result = ""
        for ts, val in self.store[key]:
            if ts <= timestamp:
                result = val  # Keep updating; last valid is the largest
            else:
                break  # Since sorted, no need to check further
        return result

This works because timestamps are sorted, so we can break early. But the worst case is still O(n) per get when the query timestamp is larger than all stored timestamps.

Approachset Timeget TimeSpace
Linear scanO(1)O(n)O(n)
Binary searchO(1)O(log n)O(n)

For a key with thousands of entries, the difference between O(n) and O(log n) per query is significant. The sorted-input guarantee is a strong hint that binary search should replace the linear scan.


Step 3: The Key Insight

Since set is always called with increasing timestamps, each key’s value list is already sorted. get is a floor query: find the rightmost timestamp ≤ the query. This is exactly bisect_right(timestamps, ts) - 1.

Here’s why bisect_right and not bisect_left:

  • bisect_right(timestamps, ts) returns the insertion point after any existing entry equal to ts.
  • Subtracting 1 gives the index of the last entry with timestamp ≤ ts.
  • If ts matches an existing timestamp exactly, bisect_right places us after the match, so - 1 lands on the exact match.
  • If ts falls between two timestamps, bisect_right gives the same result as bisect_left, and - 1 gives the largest timestamp below ts.

The verbal explanation matters: “I need the largest timestamp that doesn’t exceed my query. bisect_right gives the insertion point after any exact match, so subtracting one always lands me on the correct floor value.”

Sorted timestamps [t=1, t=3, t=5, t=9] with binary search finding the latest value ≤ t=6. Result: v3 highlighted. ---

Step 4: Optimal Strategy

  1. Use a defaultdict(list) mapping each key to a list of (timestamp, value) tuples. Since set always uses increasing timestamps, each list stays sorted automatically.
  2. For set, simply append the (timestamp, value) pair. O(1).
  3. For get, extract the timestamps from the list and run bisect_right(timestamps, query_ts).
  4. Subtract 1 from the result. If the index is negative (i.e., bisect_right returned 0), all stored timestamps exceed the query, so return "".
  5. Otherwise, return the value at that index.

One optimization: instead of extracting timestamps into a separate list on every get call (which costs O(n)), maintain two parallel lists per key, one for timestamps and one for values. This keeps get at a clean O(log n) with no allocation overhead.


Python Implementation

from collections import defaultdict
from bisect import bisect_right

class TimeMap:
    def __init__(self):
        # Parallel lists: timestamps and values for each key
        self.timestamps = defaultdict(list)
        self.values = defaultdict(list)

    def set(self, key: str, value: str, timestamp: int) -> None:
        self.timestamps[key].append(timestamp)
        self.values[key].append(value)

    def get(self, key: str, timestamp: int) -> str:
        ts_list = self.timestamps[key]
        if not ts_list:
            return ""

        # bisect_right gives insertion point after any exact match
        idx = bisect_right(ts_list, timestamp) - 1

        if idx < 0:
            return ""  # All stored timestamps exceed query
        return self.values[key][idx]

Why this version is interview-friendly:

  • Parallel lists avoid per-query allocation. The draft version built timestamps = [t for t, _ in pairs] inside every get call, which is O(n) memory allocation per query. Maintaining self.timestamps and self.values separately keeps get at pure O(log n).
  • defaultdict(list) avoids key-existence checks. Accessing a missing key returns an empty list, so if not ts_list handles the “key not found” case cleanly.
  • bisect_right - 1 is the standard floor pattern. Naming this explicitly in an interview shows familiarity with the binary search library.

Time & Space Complexity

OperationTimeSpace
setO(1) amortizedO(1) per call
getO(log n)O(1)
Overall spaceO(total set calls)

n = number of set calls for a given key. set does a list append (O(1) amortized). get runs binary search on the timestamps list (O(log n)) with no additional allocation.

A common follow-up: “What if we needed to support delete?” Removing from the middle of a list is O(n). You’d need a different structure, like a sorted container or a balanced BST, to keep all operations logarithmic. Mentioning this tradeoff shows design maturity.


Common Interview Mistakes

  1. Linear scan instead of binary search. Iterating through all timestamps for a key makes get O(n). The sorted-input constraint is a direct signal to use binary search. Missing this tells the interviewer you didn’t read the constraints carefully.

  2. Rebuilding the timestamps list on every get call. Writing timestamps = [t for t, _ in pairs] inside get allocates O(n) memory per query. The binary search itself is O(log n), but the list comprehension dominates. Maintain timestamps in a separate persistent list.

  3. Off-by-one: using bisect_left instead of bisect_right. Both can work, but the adjustment differs. With bisect_right, subtract 1 to get the floor. With bisect_left, you’d need to check whether the element at the returned index matches the query before deciding to subtract. bisect_right - 1 is cleaner and less error-prone.

  4. Not handling idx < 0. When the query timestamp is smaller than all stored timestamps, bisect_right returns 0, and 0 - 1 = -1. In Python, list[-1] returns the last element instead of raising an error, silently returning the wrong value. Always check idx < 0 explicitly.

  5. Using a single tuple list instead of parallel lists. Storing (timestamp, value) tuples is conceptually clean, but bisect_right operates on a simple list of comparable values. You’d need to extract timestamps into a separate list for each query (O(n)) or use a custom key function. Parallel lists solve this elegantly.


What a Strong Candidate Sounds Like

“I’ll use two parallel lists per key: one for timestamps and one for values. Since set guarantees strictly increasing timestamps, the timestamp list stays sorted without any extra work. For get, I run bisect_right on the timestamp list to find where the query timestamp would be inserted, then subtract 1 to get the largest timestamp at or below the query. If the index goes negative, no valid entry exists and I return an empty string. set is O(1), get is O(log n).”

This works because the candidate states the data structure choice, explains why it stays sorted, identifies the operation as a floor query, gives the exact algorithm, handles the edge case, and states the complexity, all in under 30 seconds.


Example Interview Dialogue

Interviewer: Design a key-value store where each key can have values at different timestamps. set always uses increasing timestamps. get should return the value at the largest timestamp at or before the query.

Candidate: First, a couple of clarifying questions. Are timestamps strictly increasing per key, or globally? And what should I return if no timestamp is valid for the query?

Interviewer: Strictly increasing per key. Return an empty string if nothing matches.

Candidate: Good. Since timestamps per key are always increasing, each key’s list is naturally sorted. That means get is a floor query, and I can use binary search. I’ll store two parallel lists per key: timestamps and values. For set, I just append to both. For get, I run bisect_right on the timestamps list, subtract 1, and check if the index is valid.

Interviewer: Why bisect_right and not bisect_left?

Candidate: bisect_right places the insertion point after any exact match. So subtracting 1 always lands on the correct floor value, whether it’s an exact match or the largest timestamp below the query. With bisect_left, the insertion point is before an exact match, so I’d need an extra check to handle the exact-match case. bisect_right - 1 is one line with no branching.

Interviewer: What’s the complexity?

Candidate: set is O(1) amortized for the list append. get is O(log n) where n is the number of entries for that key. Space is O(total entries) across all keys.

Interviewer: What if timestamps weren’t guaranteed to be increasing?

Candidate: Then I’d need to maintain a sorted data structure explicitly. I could use bisect.insort for O(n) insertion, or switch to a balanced BST if I needed O(log n) for both set and get. Python’s standard library doesn’t have a built-in balanced BST, but the sortedcontainers package provides SortedList which would work.


Time Based Key-Value Store combines data structure design with the binary search floor pattern. These problems build on the same foundations:

  • Binary Search (LeetCode #704). The foundational binary search problem. Master this first.
  • Search in Rotated Sorted Array (LeetCode #33). Binary search when the sorted property is broken by rotation.
  • First Bad Version (LeetCode #278). Binary search on a boolean predicate, a close cousin of the floor query pattern.
  • Design HashMap (LeetCode #706). A simpler design problem that tests the same class-based API design skills.
  • Snapshot Array (LeetCode #1146). Another timestamp-aware data structure that uses binary search for historical lookups.

Practice This in a Mock Interview

Time Based Key-Value Store looks deceptively simple. The class structure is straightforward, and binary search is a well-known technique. The difficulty is in the details: choosing the right bisect variant, avoiding the per-query allocation trap, and explaining the floor-query pattern clearly under pressure. These are exactly the kinds of issues that surface when you’re implementing live with someone watching, not when you’re reading a solution at your desk.

Start a mock interview for Time Based Key-Value Store on Intervu.

Further reading:

Practice Like It's the Real Interview

Get instant feedback on your approach, communication, and code — powered by AI.

Start a Mock Interview →