Commit ce4ff36

travisjneuman and claude committed
feat: add step-by-step WALKTHROUGH.md for 12 key projects (levels 2-5)
Create pedagogical walkthroughs for 3 projects per level (first, mid, capstone) across levels 2-5. Each walkthrough guides learner thinking with structured steps, code snippets, "Predict" prompts, common mistakes tables, and key takeaways. Level 2 existing walkthroughs updated to match the consistent template format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 3644739 commit ce4ff36

27 files changed

+6252
-918
lines changed

Lines changed: 77 additions & 75 deletions
@@ -1,51 +1,24 @@
1-
# Walkthrough: Dictionary Lookup Service
1+
# Dictionary Lookup Service — Step-by-Step Walkthrough
22

3-
> This guide walks through the **thinking process** for building this project.
4-
> It does NOT give you the complete solution. For that, see [SOLUTION.md](./SOLUTION.md).
3+
[<- Back to Project README](./README.md) · [Solution](./SOLUTION.md)
54

6-
## Before reading this
5+
## Before You Start
76

8-
**Try the project yourself first.** Spend at least 20 minutes.
9-
If you have not tried yet, close this file and open the [project README](./README.md).
7+
Read the [project README](./README.md) first. Try to solve it on your own before following this guide. Spend at least 20 minutes attempting it independently.
108

11-
---
9+
## Thinking Process
1210

13-
## Understanding the problem
11+
When you see "dictionary lookup service," your first question should be: where does the data come from, and what format is it in? The data lives in a text file with one `key=value` pair per line. So the first job is parsing that file into a Python dict. Think about what could go wrong at this stage -- definitions that contain `=` signs, duplicate keys, and inconsistent capitalization.
1412

15-
You need to build a dictionary lookup tool that loads `key=value` pairs from a file, looks up terms (case-insensitively), and suggests close matches when a term is not found. It also provides statistics about the dictionary. Think of it like a simplified spell-check-enabled glossary.
13+
Next, think about what happens when someone searches for a term that does not exist. You could just say "not found," but a better experience is to suggest close matches. This is where the `difflib` module comes in -- it can find strings that are similar to the search term. Think of it like a spell checker: you type "pythn" and it says "did you mean python?"
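As a rough illustration of the spell-checker idea, here is a minimal sketch using `difflib.get_close_matches` from the standard library. The `known_terms` list and the search strings are made-up sample data, not taken from the project's data file.

```python
import difflib

# Hypothetical glossary keys, used only for illustration.
known_terms = ["python", "javascript", "html"]

# A near-miss typo is similar enough to one key to clear the cutoff.
typo_matches = difflib.get_close_matches("javascrpt", known_terms, n=3, cutoff=0.6)

# A string that resembles no key falls below the cutoff and yields an empty list.
no_matches = difflib.get_close_matches("qqq", known_terms, n=3, cutoff=0.6)

print(typo_matches)
print(no_matches)
```

The `cutoff` is a similarity ratio between 0 and 1, and `n` caps how many suggestions come back, so the two parameters together control how forgiving the suggestions are.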
1614

17-
The dictionary file looks like:
15+
Finally, consider normalization. If the dictionary has "Python" and the user searches for "python," should that match? Almost certainly yes. Normalizing to lowercase at both load time and search time solves this cleanly.
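One way to picture normalizing on both sides is the sketch below. The names `raw_entries`, `normalized`, and `find` are hypothetical helpers for this illustration and are not part of the project code.

```python
from typing import Optional

# Illustrative entries with inconsistent capitalization and stray whitespace.
raw_entries = {
    "Python": "A high-level programming language",
    " HTML ": "HyperText Markup Language",
}

# Load time: normalize every key once, using a dict comprehension.
normalized = {key.strip().lower(): value for key, value in raw_entries.items()}


def find(term: str) -> Optional[str]:
    """Search time: apply the same normalization to the search term."""
    return normalized.get(term.strip().lower())


print(find("PYTHON"))
print(find("html"))
print(find("ruby"))
```

Because the same transformation runs at load time and at search time, the stored key and the search term always meet in the same form.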
1816

19-
```
20-
python=A high-level programming language
21-
javascript=A language for web development
22-
html=HyperText Markup Language
23-
```
24-
25-
## Planning before code
26-
27-
```mermaid
28-
flowchart TD
29-
A[Load dictionary file] --> B[load_dictionary: parse key=value lines into dict]
30-
B --> C{User action?}
31-
C -->|--lookup term| D[lookup: exact match or fuzzy suggestions]
32-
C -->|--stats| E[dictionary_stats: count entries, analyse keys]
33-
C -->|batch lookup| F[batch_lookup: look up many terms]
34-
D --> G[Print structured result]
35-
F --> G
36-
E --> G
37-
```
38-
39-
Four functions to build:
17+
## Step 1: Load the Dictionary File
4018

41-
1. **load_dictionary()** -- parse a file of `key=value` lines into a Python dict
42-
2. **lookup()** -- search for a term with fuzzy matching fallback
43-
3. **batch_lookup()** -- look up multiple terms at once
44-
4. **dictionary_stats()** -- compute summary statistics about the dictionary
19+
**What to do:** Write a function that reads a text file and builds a Python dict from `key=value` lines.
4520

46-
## Step 1: Loading the dictionary
47-
48-
The file has one entry per line in `key=value` format. Parse it into a Python dictionary using a dict comprehension:
21+
**Why:** Everything else depends on having the data in a Python dict. Get this right first, and the rest flows naturally.
4922

5023
```python
5124
def load_dictionary(path: Path) -> dict[str, str]:
@@ -62,17 +35,17 @@ def load_dictionary(path: Path) -> dict[str, str]:
6235

6336
Three details to notice:
6437

65-
- **`line.split("=", 1)`** splits on the **first** `=` only. This is critical because definitions might contain `=` signs (like `formula=E=mc2`). Without `maxsplit=1`, that would break into three parts.
66-
- **`.lower()` on keys** makes lookups case-insensitive. "Python", "python", and "PYTHON" all map to the same entry.
67-
- **`if "=" in line`** skips lines that do not have an `=`, like blank lines or comments.
38+
- **`line.split("=", 1)`** splits on the **first** `=` only. This is critical because definitions might contain `=` signs (like `formula=E=mc2`). Without the `1`, that would break into three parts.
39+
- **`.lower()` on keys** makes lookups case-insensitive.
40+
- **`if "=" in line`** skips any line that contains no `=`, such as blank lines. A comment line is skipped only if it has no `=` in it.
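The effect of the second argument to `split` can be checked in isolation. This small sketch uses the `formula=E=mc2` line mentioned above; the variable names are chosen only for this illustration.

```python
line = "formula=E=mc2"

# Splitting on every "=" yields three parts, so unpacking into two names would fail.
all_parts = line.split("=")

# A maxsplit of 1 stops after the first "=", keeping the definition intact.
key, definition = line.split("=", 1)

print(all_parts)
print(key.strip().lower(), "->", definition.strip())
```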
6841

69-
### Predict before you scroll
42+
**Predict:** If the file has two lines with the same key (e.g., `python=...` appears twice), which definition ends up in the dict? The first one or the last one?
7043

71-
If the file has two lines with the same key (e.g., `python=...` appears twice), which definition ends up in the dictionary? The first one or the last one?
44+
## Step 2: Look Up a Single Term
7245

73-
## Step 2: Looking up a term
46+
**What to do:** Write a `lookup()` function that searches the dictionary for a term and returns a structured result dict.
7447

75-
The lookup function has two paths: exact match (the happy path) and fuzzy match (the fallback).
48+
**Why:** Returning a structured dict (not just a string) means the caller can programmatically check `result["found"]` and decide what to do. This is better than returning `None` or raising an exception for missing terms.
7649

7750
```python
7851
import difflib
@@ -100,17 +73,15 @@ def lookup(dictionary: dict[str, str], term: str) -> dict:
10073
}
10174
```
10275

103-
The function uses `try/except KeyError` instead of `if term in dictionary`. Both work, but `try/except` is considered more "Pythonic" when you expect the key to exist most of the time. This is the **EAFP** pattern (Easier to Ask Forgiveness than Permission).
104-
105-
**`difflib.get_close_matches`** uses sequence matching to find similar strings. The `cutoff=0.6` means a match must be at least 60% similar. The `n=3` limits results to the three best matches.
76+
The function uses `try/except KeyError` instead of `if term in dictionary`. Both work, but `try/except` is considered more Pythonic when you expect the key to usually exist (the "happy path" is fast). This is the **EAFP** pattern (Easier to Ask Forgiveness than Permission).
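To compare the two styles side by side, here is a minimal sketch. The `glossary` dict and both function names are hypothetical and exist only for this comparison; neither is the project's `lookup()`.

```python
from typing import Optional

glossary = {"python": "A high-level programming language"}  # illustrative data


def define_eafp(term: str) -> Optional[str]:
    """Try the access first and handle the failure (EAFP)."""
    try:
        return glossary[term]
    except KeyError:
        return None


def define_lbyl(term: str) -> Optional[str]:
    """Check membership before accessing ("look before you leap")."""
    if term in glossary:
        return glossary[term]
    return None


print(define_eafp("python"), define_lbyl("python"))
print(define_eafp("ruby"), define_lbyl("ruby"))
```

Both functions return the same results. The difference is where the cost falls: the `try` version does one dictionary access when the key exists, while the membership check does two.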
10677

107-
### Predict before you scroll
78+
**Predict:** If the dictionary contains "python" and the user searches for "pythn" (a typo), will `get_close_matches` find it? What if they search for "xyz"?
10879

109-
If the dictionary contains "python" and the user searches for "pythn" (a typo), will `get_close_matches` find it? What if they search for "xyz"?
80+
## Step 3: Batch Lookup with Enumerate
11081

111-
## Step 3: Batch lookup
82+
**What to do:** Write a `batch_lookup()` function that processes a list of terms and tracks their original position using `enumerate()`.
11283

113-
Looking up multiple terms is a thin wrapper around `lookup()`:
84+
**Why:** When looking up multiple terms, the caller needs to know which result corresponds to which input. `enumerate` gives you the index alongside each item -- this is cleaner than manually tracking a counter variable.
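The difference between a manual counter and `enumerate` is easiest to see on a plain list. This sketch uses made-up terms and does not reproduce the project's `batch_lookup()`.

```python
terms = ["python", "java", "haskell"]

# Manual counter: works, but the increment is easy to forget or misplace.
manual = []
position = 0
for term in terms:
    manual.append((position, term))
    position += 1

# enumerate() yields (index, item) pairs directly.
paired = [(index, term) for index, term in enumerate(terms)]

print(manual == paired)
```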
11485

11586
```python
11687
def batch_lookup(dictionary: dict[str, str], terms: list[str]) -> list[dict]:
@@ -122,11 +93,13 @@ def batch_lookup(dictionary: dict[str, str], terms: list[str]) -> list[dict]:
12293
return results
12394
```
12495

125-
`enumerate()` provides both the index and the value. Adding the index to each result lets the caller track which position each term was in.
96+
**Predict:** If you pass `["Python", "PYTHON", "python"]`, how many unique lookups effectively happen? Are all three results identical?
12697

127-
## Step 4: Dictionary statistics
98+
## Step 4: Compute Dictionary Statistics
12899

129-
This function demonstrates **set operations** and **sorting with a key function**:
100+
**What to do:** Write a `dictionary_stats()` function that uses sets and `sorted()` with a key function.
101+
102+
**Why:** This step practices two important patterns: set comprehensions (for unique first letters) and sorting with a custom key function (sorting terms by definition length, not alphabetically).
130103

131104
```python
132105
def dictionary_stats(dictionary: dict[str, str]) -> dict:
@@ -146,34 +119,62 @@ def dictionary_stats(dictionary: dict[str, str]) -> dict:
146119
}
147120
```
148121

149-
The set comprehension `{k[0] for k in dictionary if k}` extracts the first character of every key. Since it is a set, duplicates are automatically removed.
122+
The set comprehension `{k[0] for k in dictionary if k}` extracts the first character of every key. Since it is a set, duplicates are automatically removed. The `if k` guard prevents a crash on empty-string keys.
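Both patterns can be tried on a tiny dict. The sketch below uses illustrative entries, including a deliberately empty key to exercise the `if k` guard; the variable names are assumptions made for this example.

```python
glossary = {
    "python": "A high-level programming language",
    "javascript": "A language for web development",
    "html": "HyperText Markup Language",
    "": "entry with an empty key",  # would raise IndexError on k[0] without the guard
}

# Set comprehension: unique first characters, skipping empty keys.
first_letters = {k[0] for k in glossary if k}

# sorted() with a key function: order the keys by definition length, shortest first.
by_definition_length = sorted(glossary, key=lambda k: len(glossary[k]))

print(first_letters)
print(by_definition_length)
```

Iterating over a dict yields its keys, which is why both expressions can take `glossary` directly instead of `glossary.keys()`.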
123+
124+
**Predict:** What does `sorted(first_letters)` do that the set alone does not? (Hint: sets have no guaranteed order.)
150125

151-
`sorted(..., key=lambda k: len(dictionary[k]))` sorts keys by the length of their definitions. The `lambda` is an inline function that tells `sorted()` what value to compare.
126+
## Step 5: Wire Up the CLI
152127

153-
## Common mistakes
128+
**What to do:** Use `argparse` to create `--dict`, `--lookup`, and `--stats` command-line options, then call your functions from `main()`.
154129

155-
| Mistake | Why it happens | How to fix |
156-
|---------|---------------|------------|
130+
**Why:** A CLI makes your tool usable from the terminal. `argparse` handles parsing, validation, and help text so you do not have to write that boilerplate yourself.
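The `parse_args()` helper called from `main()` is not shown in this walkthrough. One plausible shape for it is sketched below; making `--dict` required, accepting several terms after `--lookup`, and the `argv` parameter are all assumptions of this sketch, not confirmed details of the project.

```python
import argparse
from typing import Optional


def parse_args(argv: Optional[list[str]] = None) -> argparse.Namespace:
    """Hypothetical parser for the three options named in this step."""
    parser = argparse.ArgumentParser(description="Dictionary lookup service")
    # Assumption: the dictionary path is mandatory.
    parser.add_argument("--dict", required=True, help="Path to the key=value file")
    # Assumption: one or more terms may follow --lookup.
    parser.add_argument("--lookup", nargs="+", default=None, help="Terms to look up")
    parser.add_argument("--stats", action="store_true", help="Print statistics")
    # argv=None makes argparse read sys.argv; passing a list helps in tests.
    return parser.parse_args(argv)


args = parse_args(["--dict", "data/sample_input.txt", "--lookup", "python", "java"])
print(args.dict, args.lookup, args.stats)
```

With `nargs="+"`, `args.lookup` is a list of strings, which fits a call such as `batch_lookup(dictionary, args.lookup)`.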
131+
132+
```python
133+
def main() -> None:
134+
args = parse_args()
135+
dictionary = load_dictionary(Path(args.dict))
136+
137+
if args.stats:
138+
stats = dictionary_stats(dictionary)
139+
for key, value in stats.items():
140+
print(f" {key}: {value}")
141+
return
142+
143+
if args.lookup:
144+
results = batch_lookup(dictionary, args.lookup)
145+
else:
146+
samples = list(dictionary.keys())[:3] + ["nonexistent"]
147+
results = batch_lookup(dictionary, samples)
148+
149+
for r in results:
150+
term = r["term"]
151+
if r["found"]:
152+
print(f" {term}: {r['definition']}")
153+
else:
154+
print(f" {term}: not found — suggestions: {r['suggestions']}")
155+
```
156+
157+
**Predict:** What happens if the user runs the script with no `--lookup` and no `--stats`? Trace through the code to find out.
158+
159+
## Common Mistakes
160+
161+
| Mistake | Why It Happens | Fix |
162+
|---------|---------------|-----|
157163
| `line.split("=")` breaks definitions containing `=` | Default split divides on every `=` | Use `split("=", 1)` to split on first `=` only |
158-
| Lookup is case-sensitive | Forgetting to normalise | `.lower()` both the keys (at load time) and the search term |
164+
| Lookup is case-sensitive | Forgetting to normalize | `.lower()` both the keys (at load time) and the search term |
165+
| `dictionary[term]` crashes on missing key | Using direct access without handling | Either use `try/except KeyError` or `dictionary.get(term)` |
159166
| `get_close_matches` returns nothing useful | Cutoff is too high for the input | Lower the cutoff (try 0.5) or check that the dictionary has enough entries |
160-
| `dictionary[term]` crashes on missing key | Using direct access without checking | Either use `try/except KeyError` or `dictionary.get(term)` |
161167

162-
## Testing your solution
163-
164-
Run the tests from the project directory:
168+
## Testing Your Solution
165169

166170
```bash
167171
pytest -q
168172
```
169173

170-
The nine tests check:
171-
- Dictionary loading parses entries correctly
172-
- Exact lookups return the definition
173-
- Missing keys return `found: False` with suggestions
174-
- Batch lookup processes multiple terms
175-
- Stats compute correct totals and first letters
176-
- Edge cases: empty terms, duplicate keys, `=` in definitions
174+
Expected output:
175+
```text
176+
9 passed
177+
```
177178

178179
You can also test from the command line:
179180

@@ -182,7 +183,8 @@ python project.py --dict data/sample_input.txt --lookup python java haskell
182183
python project.py --dict data/sample_input.txt --stats
183184
```
184185

185-
## What to explore next
186+
## What You Learned
186187

187-
1. Add a `--add` option that appends a new `key=value` line to the dictionary file
188-
2. Implement a reverse lookup: given a word that appears in any definition, find which terms contain it
188+
- **Dict comprehensions** build dictionaries in a single expression, which is more readable than a loop when the logic is straightforward.
189+
- **`try/except KeyError` vs `dict.get()`** are two ways to handle missing keys -- `try/except` is better when you expect the key to usually exist (the happy path is fast), while `.get()` is better when missing keys are common.
190+
- **`difflib.get_close_matches`** provides fuzzy string matching using sequence similarity -- it compares character patterns, not meanings, so "pythn" matches "python" but "snake" does not.
