PATTERNS.md tells you which technique a problem wants. This page is one level lower: once you know the technique, which container do you put the data in so the operations you need are fast? Picking the wrong structure is what turns a clean O(n) idea into a TLE.
The whole skill comes down to three questions. Ask them in order:
- What operation must be fast? (lookup? min/max? insert at ends? sorted order?)
- Do I need ordering? (insertion order, sorted order, or none?)
- Do I need uniqueness or key→value mapping?
Your answers point at one structure almost every time. The tables below are just that logic, pre-computed.
The fastest way to choose: match the wording of the problem to a structure.
| The problem mentions… | Reach for | Why |
|---|---|---|
| "unique", "seen before", "contains duplicate" | Set (set) |
O(1) membership test |
| "count", "frequency", "most common", "anagram" | Hash map / Counter (Counter) |
O(1) tally per element |
| "key → value", "look up by id", "cache" | Hash map (dict) |
O(1) keyed access |
| "top K", "K largest/smallest", "median of a stream" | Heap (heapq) |
O(log n) to keep the best K |
| "next greater", "nearest smaller", "valid parentheses" | Stack | LIFO matches the access order |
| "sliding window max/min", "first negative in window" | Monotonic Deque (deque) |
O(1) push/pop at both ends |
| "process in arrival order", "level by level", "BFS" | Queue (deque) |
FIFO |
| "shortest path", "connected", "network", "grid" | Graph (adjacency list = dict[node] → list) |
models relationships |
| "starts with", "prefix", "autocomplete", "dictionary" | Trie | O(L) prefix lookup |
| "merge groups", "are these connected", "count islands by union" | Union-Find (DSU) | near-O(1) grouping |
| "keep sorted", "floor/ceil", "kth smallest with inserts" | Balanced BST / sorted list (SortedList) |
O(log n) ordered ops |
| "range sum/min with updates" | Fenwick / Segment Tree | O(log n) query + update |
| "LRU cache", "most-recently-used" | Hash map + Doubly Linked List | O(1) move-to-front |
The same idea, indexed by capability instead of wording.
| I need fast… | Best structure | Operation cost |
|---|---|---|
| Lookup / insert / delete by key | Hash map (dict) |
O(1) average |
| Membership test | Hash set (set) |
O(1) average |
Index access (a[i]) |
Dynamic array (list) |
O(1) |
| Insert / remove at both ends | deque |
O(1) |
| Insert / remove at one end (LIFO/FIFO) | Stack / Queue | O(1) |
| Repeated min or max retrieval | Heap (heapq) |
O(log n) push/pop, O(1) peek |
| Elements always in sorted order | Balanced BST / SortedList |
O(log n) |
| Prefix matching on strings | Trie | O(L) |
| Grouping / connectivity queries | Union-Find | ~O(1) amortized |
| Range queries with updates | Segment Tree / Fenwick | O(log n) |
Rule of thumb: if you find yourself scanning the whole collection (
for x in data) inside another loop just to find one element, you almost certainly want a hash map, a heap, or a sorted structure instead.
Since this repo is Python 3, here's the concrete mapping from "I want X" to the standard-library type.
| You want | Use | Note |
|---|---|---|
| Hash map | dict |
also collections.defaultdict for auto-default values |
| Hash set | set |
frozenset if you need it hashable |
| Frequency count | collections.Counter |
.most_common(k) is handy |
| Stack | list (append / pop) |
already O(1) |
| Queue / Deque | collections.deque |
popleft() is O(1) (a list is O(n)) |
| Min-heap | heapq |
for a max-heap, push -value |
| Sorted container | sortedcontainers.SortedList |
3rd-party; or bisect on a list for mostly-static data |
| LRU ordering | collections.OrderedDict |
.move_to_end(); or functools.lru_cache for memoization |
| Tuple as key | dict[(r, c)] |
great for grids and visited-states |
Keep this in your head — it's what the choice ultimately rests on.
| Structure | Access | Search | Insert | Delete |
|---|---|---|---|---|
Array / list |
O(1) | O(n) | O(1)* end / O(n) middle | O(n) |
| Hash map / set | — | O(1) avg | O(1) avg | O(1) avg |
Stack / Queue (deque) |
O(n) middle | O(n) | O(1) ends | O(1) ends |
Heap (heapq) |
O(1) peek | O(n) | O(log n) | O(log n) pop-min |
Balanced BST / SortedList |
O(log n) | O(log n) | O(log n) | O(log n) |
| Trie | — | O(L) | O(L) | O(L) |
| Union-Find | — | ~O(1) | ~O(1) | — |
| Linked List | O(n) | O(n) | O(1) at node | O(1) at node |
* amortized
Some classic problems aren't one structure — they're two working together. These pairings come up constantly:
- LRU Cache = Hash map (for O(1) lookup) + Doubly Linked List (for O(1) reordering).
- Median of a data stream = a max-heap for the lower half + a min-heap for the upper half (the "two heaps" trick).
- Dijkstra / weighted shortest path = Graph (adjacency list) + a min-heap of
(distance, node). - Sliding-window maximum = a monotonic deque that drops elements that can never be the max.
- Counting islands / friend circles = Grid traversal + Union-Find, or just BFS/DFS with a visited set.
- Detect a cycle in a linked list = the list + two pointers (slow/fast), no extra space.
Need key → value lookup? → Hash map
Just membership / uniqueness? → Hash set
Counting things? → Counter
Repeated min/max or top-K? → Heap
LIFO / FIFO / both ends? → Stack / Queue / Deque
Must stay in sorted order? → Balanced BST / SortedList
Prefixes / autocomplete? → Trie
Connectivity / grouping? → Union-Find
Relationships / paths? → Graph
Range queries with updates? → Segment Tree / Fenwick
Pair this with PATTERNS.md: patterns tell you the plan, data structures tell you the tools.