Commit e0b5393

travisjneuman and claude committed
refactor: split 7 long concept docs into sub-pages for readability
Each original file (functools-and-itertools, testing-strategies, modern-python-tooling, debugging-methodology, git-basics, collections-deep-dive, generators-and-iterators) is now a table of contents linking two sub-pages. Sub-pages use {slug}-part1/part2 naming, include prev/next navigation, and stay under 200 lines. Modality hubs, Practice, and Further Reading sections preserved in parent files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 631bf31 commit e0b5393

21 files changed: +2128 -1926 lines changed
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
# Collections Deep Dive, Part 1: defaultdict, Counter, OrderedDict

[← Back to Overview](./collections-deep-dive.md) · [Part 2: deque, namedtuple, ChainMap →](./collections-deep-dive-part2.md)

---

Python's `collections` module provides specialized container types that go beyond the built-in `list`, `dict`, and `set`. This part covers three `dict` subclasses: `Counter`, `defaultdict`, and `OrderedDict`.

## Why This Matters

Every program needs to store and organize data. The built-in types handle most cases, but they have gaps. Need to count how often each word appears? `Counter`. Need a dict that automatically handles missing keys? `defaultdict`. Learning these tools saves you from writing (and debugging) boilerplate code.

## `Counter`: count things

The most intuitive way to count occurrences:

```python
from collections import Counter

# Count letters in a string:
letter_counts = Counter("mississippi")
print(letter_counts)
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

# Count words in a list:
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
word_counts = Counter(words)
print(word_counts)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1})

# Most common items:
print(word_counts.most_common(2))
# [('apple', 3), ('banana', 2)]
```

Counter supports math operations:

```python
a = Counter("aabbcc")
b = Counter("aabbd")

a + b   # Counter({'a': 4, 'b': 4, 'c': 2, 'd': 1})
a - b   # Counter({'c': 2}); only positive counts are kept
a & b   # Counter({'a': 2, 'b': 2}); minimum of each count
a | b   # Counter({'a': 2, 'b': 2, 'c': 2, 'd': 1}); maximum of each count
```
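
Beyond the operators above, a few other `Counter` behaviors are worth a quick sketch: a missing key reads as 0 without being stored, `update()` adds counts in place, and `subtract()` (unlike `-`) keeps zero and negative counts. The `inventory` data is made up for illustration.

```python
from collections import Counter

inventory = Counter(apples=3, pears=1)

# Reading a missing key returns 0 and does not insert the key.
missing = inventory["kiwis"]
has_kiwis = "kiwis" in inventory

# update() adds counts in place (it does not replace them, unlike dict.update).
inventory.update(["apples", "pears", "pears"])

# subtract() keeps zero and negative counts; the - operator would drop them.
inventory.subtract({"apples": 5})

# elements() repeats each key by its count and skips counts below 1.
remaining = sorted(inventory.elements())

print(missing, has_kiwis)   # 0 False
print(inventory["apples"])  # -1
print(remaining)            # ['pears', 'pears', 'pears']
```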

## `defaultdict`: dicts with automatic defaults

Indexing a missing key on a `defaultdict` (`d[key]`) does not raise `KeyError`. Instead, it calls the default factory, stores the result under that key, and returns it:

```python
from collections import defaultdict

# Group items by category:
animals = [("cat", "Felix"), ("dog", "Rex"), ("cat", "Whiskers"), ("dog", "Buddy")]

groups = defaultdict(list)  # Missing keys get an empty list
for category, name in animals:
    groups[category].append(name)

print(groups)
# defaultdict(<class 'list'>, {'cat': ['Felix', 'Whiskers'], 'dog': ['Rex', 'Buddy']})
```

Compare with a regular dict:
```python
# Without defaultdict (verbose):
groups = {}
for category, name in animals:
    if category not in groups:
        groups[category] = []
    groups[category].append(name)

# With defaultdict (clean):
groups = defaultdict(list)
for category, name in animals:
    groups[category].append(name)
```

Common default factories:
```python
defaultdict(list)   # Missing keys → empty list []
defaultdict(int)    # Missing keys → 0
defaultdict(set)    # Missing keys → empty set set()
defaultdict(str)    # Missing keys → empty string ""
defaultdict(dict)   # Missing keys → empty dict {}
```
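
Any zero-argument callable works as a factory, not only built-in types. The sketch below uses a `lambda` for a non-empty default and nests one `defaultdict` inside another; the names `scores` and `visits` and the sample data are made up for illustration.

```python
from collections import defaultdict

# A lambda gives a custom starting value for missing keys.
scores = defaultdict(lambda: 100)
scores["alice"] -= 10        # starts from 100, so this stores 90

# Nested defaultdict: outer keys map to inner per-page counters.
visits = defaultdict(lambda: defaultdict(int))
log = [("ann", "/home"), ("ann", "/docs"), ("bob", "/home"), ("ann", "/home")]
for user, page in log:
    visits[user][page] += 1

print(scores["alice"])         # 90
print(visits["ann"]["/home"])  # 2
print(dict(visits["bob"]))     # {'/home': 1}
```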

Counting with `defaultdict(int)`:
```python
word_count = defaultdict(int)
for word in "the cat sat on the mat".split():
    word_count[word] += 1
# dict(word_count) == {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}
```
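
For plain counting, `Counter` from earlier in this part produces the same tallies in a single call. A quick check, assuming the same sentence as above (the variable names are illustrative):

```python
from collections import Counter, defaultdict

sentence = "the cat sat on the mat"

# Manual tally with defaultdict(int).
manual = defaultdict(int)
for word in sentence.split():
    manual[word] += 1

# One-call tally with Counter.
tallied = Counter(sentence.split())

# Compare as plain dicts: same keys, same counts.
same = dict(manual) == dict(tallied)

print(same)                    # True
print(tallied.most_common(1))  # [('the', 2)]
```

`defaultdict(int)` is still handy when counting is only one step inside a larger loop; `Counter` adds helpers such as `most_common()`.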

## `OrderedDict`: dict that remembers insertion order

Since Python 3.7, regular dicts maintain insertion order. So when is `OrderedDict` still useful?

```python
from collections import OrderedDict

# OrderedDict considers order in equality checks:
d1 = OrderedDict([("a", 1), ("b", 2)])
d2 = OrderedDict([("b", 2), ("a", 1)])
d1 == d2  # False: different order!

# Regular dicts do not:
{"a": 1, "b": 2} == {"b": 2, "a": 1}  # True

# OrderedDict has move_to_end:
od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
od.move_to_end("a")              # a moves to the end: key order is now b, c, a
od.move_to_end("c", last=False)  # c moves to the start: key order is now c, b, a
```

Use `OrderedDict` when order matters for equality comparison or when you need `move_to_end()`. Otherwise, use a regular dict. Note that the order-sensitive comparison applies only between two `OrderedDict` objects; comparing an `OrderedDict` with a regular dict ignores order.
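
One place `move_to_end()` earns its keep is a small least-recently-used (LRU) cache. The following is a minimal sketch, not a production cache: the class name `LRUCache` and its `capacity` parameter are hypothetical, and it relies on `OrderedDict.popitem(last=False)` to drop the oldest entry.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical helper: keeps at most `capacity` most recently used items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._data)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # evicts "b"
print(cache.keys())  # ['a', 'c']
```

For memoizing function calls, the standard library's `functools.lru_cache` is usually the simpler choice.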

## Common Mistakes

**Forgetting that defaultdict creates entries on access:**
```python
d = defaultdict(list)
if d["missing_key"]:  # This CREATES the key with an empty list!
    pass

# Use "key in d" to check without creating:
if "missing_key" in d:
    pass
```
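
The side effect is easy to observe by checking the length before and after. This sketch also shows that `.get()` does not call the default factory, which makes it another safe way to probe a key.

```python
from collections import defaultdict

d = defaultdict(list)

probe = d.get("missing_key")      # .get() does not call the factory
size_after_get = len(d)

in_check = "missing_key" in d     # membership test has no side effect
size_after_in = len(d)

_ = d["missing_key"]              # indexing inserts an empty list
size_after_index = len(d)

print(probe, size_after_get)      # None 0
print(in_check, size_after_in)    # False 0
print(size_after_index, dict(d))  # 1 {'missing_key': []}
```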

**Using Counter with non-hashable items:**
```python
# Lists are not hashable:
Counter([[1, 2], [3, 4]])  # TypeError!
# Convert to tuples first:
Counter([(1, 2), (3, 4)])  # OK
```
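
When the data arrives as lists, the conversion can be done on the fly. A small sketch, with `pairs` as made-up sample data:

```python
from collections import Counter

pairs = [[1, 2], [3, 4], [1, 2]]

error_message = ""
try:
    Counter(pairs)
except TypeError as exc:
    error_message = str(exc)  # the message mentions "unhashable"

# map(tuple, ...) turns each inner list into a hashable tuple.
pair_counts = Counter(map(tuple, pairs))

print(error_message)
print(pair_counts)  # Counter({(1, 2): 2, (3, 4): 1})
```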

---

| [← Overview](./collections-deep-dive.md) | [Part 2: deque, namedtuple, ChainMap →](./collections-deep-dive-part2.md) |
|:---|---:|
Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
# Collections Deep Dive, Part 2: deque, namedtuple, ChainMap

[← Part 1: defaultdict, Counter, OrderedDict](./collections-deep-dive-part1.md) · [Back to Overview](./collections-deep-dive.md)

---

This part covers three more `collections` types: `deque` for fast double-ended operations, `namedtuple` for lightweight immutable records, and `ChainMap` for layered dict lookups.

## `deque`: double-ended queue

A `deque` (pronounced "deck") is like a list, but optimized for adding and removing items from both ends:

```python
from collections import deque

# Create a deque:
d = deque([1, 2, 3])

# Add to both ends (O(1), fast):
d.append(4)       # deque is [1, 2, 3, 4]
d.appendleft(0)   # deque is [0, 1, 2, 3, 4]

# Remove from both ends (O(1), fast):
d.pop()           # returns 4, deque is [0, 1, 2, 3]
d.popleft()       # returns 0, deque is [1, 2, 3]

# Rotate:
d = deque([1, 2, 3, 4, 5])
d.rotate(2)       # [4, 5, 1, 2, 3]: rotate right by 2
d.rotate(-2)      # [1, 2, 3, 4, 5]: rotate left by 2
```

`list.insert(0, x)` is O(n) because it shifts every element. `deque.appendleft(x)` is O(1). Use a deque when you need to frequently add or remove from the front. The trade-off is that indexing into the middle of a deque is O(n), so a list remains the better fit for random access.
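
A common use of the O(1) `popleft()` is a first-in, first-out queue, for example in breadth-first search. The sketch below assumes a small hand-written graph (`graph`) and a hypothetical helper `bfs_order`; neither comes from the original text.

```python
from collections import deque


def bfs_order(graph, start):
    """Return nodes in breadth-first order from `start` (illustrative helper)."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()  # O(1), unlike list.pop(0)
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs_order(graph, "a"))  # ['a', 'b', 'c', 'd']
```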

**Fixed-size buffer:**
```python
# Keep only the last 5 items:
recent = deque(maxlen=5)
for i in range(10):
    recent.append(i)
print(recent)  # deque([5, 6, 7, 8, 9], maxlen=5)
```
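
The same `maxlen` trick gives a sliding window. As a sketch, here is a moving average over the last three readings; the function name `moving_averages` and the sample data are assumptions for illustration.

```python
from collections import deque


def moving_averages(values, window=3):
    """Yield the mean of the last `window` values seen so far."""
    recent = deque(maxlen=window)  # old items fall off the left automatically
    for value in values:
        recent.append(value)
        yield sum(recent) / len(recent)


readings = [10, 20, 30, 40, 50]
print(list(moving_averages(readings)))  # [10.0, 15.0, 20.0, 30.0, 40.0]
```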

## `namedtuple`: lightweight immutable objects

A `namedtuple` is like a tuple, but with named fields. It works well for simple data containers:

```python
from collections import namedtuple

# Define a type:
Point = namedtuple("Point", ["x", "y"])

# Create instances:
p = Point(3, 4)
print(p.x)  # 3
print(p.y)  # 4
print(p)    # Point(x=3, y=4)

# Still works like a tuple:
x, y = p      # Unpacking
print(p[0])   # 3 (indexing)
```

Real-world example:

```python
User = namedtuple("User", ["name", "email", "role"])

alice = User("Alice", "alice@example.com", "admin")
bob = User("Bob", "bob@example.com", "user")

print(alice.name)  # Alice
print(bob.role)    # user

# Immutable: you cannot change fields.
try:
    alice.name = "Alicia"
except AttributeError:
    print("namedtuple fields are read-only")

# Create a modified copy with _replace:
alicia = alice._replace(name="Alicia")
```

For mutable named fields or more features, use `dataclasses` instead. See [Dataclasses Explained](./dataclasses-explained.md).
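
A few more `namedtuple` helpers tend to come up alongside `_replace`: `_fields`, `_asdict()`, `_make()`, and the `defaults` argument (available since Python 3.7). A short sketch with a hypothetical `Employee` type:

```python
from collections import namedtuple

# defaults apply to the rightmost fields, so `role` defaults to "user".
Employee = namedtuple("Employee", ["name", "email", "role"], defaults=["user"])

carol = Employee("Carol", "carol@example.com")

# _make() builds an instance from any iterable, such as a row read from a file.
row = ["Dave", "dave@example.com", "admin"]
dave = Employee._make(row)

print(Employee._fields)        # ('name', 'email', 'role')
print(carol.role)              # user
print(dave._asdict()["role"])  # admin
```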

## `ChainMap`: search multiple dicts as one

A `ChainMap` groups multiple dictionaries together. Lookups search each dict in order until the key is found:

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium", "font": "Arial"}
user_prefs = {"color": "red"}
cli_args = {"size": "large"}

# Search CLI args first, then user prefs, then defaults:
config = ChainMap(cli_args, user_prefs, defaults)

print(config["size"])   # large (from cli_args)
print(config["color"])  # red (from user_prefs)
print(config["font"])   # Arial (from defaults)
```

This is useful for configuration systems where you have multiple layers of settings (defaults, user config, command-line overrides).
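
Lookups walk every layer, but writes and deletions only touch the first mapping. The sketch below reuses the layer names from the example above and also shows `new_child()` for adding a temporary override layer.

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium", "font": "Arial"}
user_prefs = {"color": "red"}
cli_args = {"size": "large"}
config = ChainMap(cli_args, user_prefs, defaults)

# Writes go to the first mapping only (cli_args here).
config["font"] = "Helvetica"
print(cli_args)          # {'size': 'large', 'font': 'Helvetica'}
print(defaults["font"])  # Arial

# new_child() returns a new ChainMap with an extra layer in front.
session = config.new_child({"color": "green"})
print(session["color"])  # green
print(config["color"])   # red

# The underlying dicts are referenced, not copied.
defaults["theme"] = "dark"
print(config["theme"])   # dark
```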

## Quick reference

| Type | Use when you need... | Example |
|------|---------------------|---------|
| `Counter` | Count occurrences | Word frequency, vote tallying |
| `defaultdict` | Dict with automatic defaults | Grouping, counting, nested dicts |
| `namedtuple` | Immutable record with named fields | Coordinates, database rows, config |
| `deque` | Fast append/pop from both ends | Queues, buffers, sliding windows |
| `OrderedDict` | Order-sensitive equality, `move_to_end()` | Config where order matters |
| `ChainMap` | Layered dict lookups | Multi-level configuration |

## Common Mistakes

**Mutating a namedtuple (you cannot):**
```python
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
p.x = 3  # AttributeError!
# Use p._replace(x=3) to create a new instance
```

---

| [← Part 1: defaultdict, Counter, OrderedDict](./collections-deep-dive-part1.md) | [Overview](./collections-deep-dive.md) |
|:---|---:|
