feat: add step-by-step WALKTHROUGH.md for 12 key projects (levels 2-5)
Create pedagogical walkthroughs for 3 projects per level (first, mid,
capstone) across levels 2-5. Each walkthrough guides learner thinking
with structured steps, code snippets, "Predict" prompts, common
mistakes tables, and key takeaways. Level 2 existing walkthroughs
updated to match the consistent template format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Dictionary Lookup Service — Step-by-Step Walkthrough
2
2
3
-
> This guide walks through the **thinking process** for building this project.
4
-
> It does NOT give you the complete solution. For that, see [SOLUTION.md](./SOLUTION.md).
3
+
[<- Back to Project README](./README.md) · [Solution](./SOLUTION.md)
5
4
6
-
## Before reading this
5
+
## Before You Start
7
6
8
-
**Try the project yourself first.** Spend at least 20 minutes.
9
-
If you have not tried yet, close this file and open the [project README](./README.md).
7
+
Read the [project README](./README.md) first. Try to solve it on your own before following this guide. Spend at least 20 minutes attempting it independently.
10
8
11
-
---
9
+
## Thinking Process
12
10
13
-
## Understanding the problem
11
+
When you see "dictionary lookup service," your first question should be: where does the data come from, and what format is it in? The data lives in a text file with one `key=value` pair per line. So the first job is parsing that file into a Python dict. Think about what could go wrong at this stage -- definitions that contain `=` signs, duplicate keys, and inconsistent capitalization.
14
12
15
-
You need to build a dictionary lookup tool that loads `key=value` pairs from a file, looks up terms (case-insensitively), and suggests close matches when a term is not found. It also provides statistics about the dictionary. Think of it like a simplified spell-check-enabled glossary.
13
+
Next, think about what happens when someone searches for a term that does not exist. You could just say "not found," but a better experience is to suggest close matches. This is where the `difflib` module comes in -- it can find strings that are similar to the search term. Think of it like a spell checker: you type "pythn" and it says "did you mean python?"
16
14
17
-
The dictionary file looks like:
15
+
Finally, consider normalization. If the dictionary has "Python" and the user searches for "python," should that match? Almost certainly yes. Normalizing to lowercase at both load time and search time solves this cleanly.
18
16
19
-
```
20
-
python=A high-level programming language
21
-
javascript=A language for web development
22
-
html=HyperText Markup Language
23
-
```
24
-
25
-
## Planning before code
26
-
27
-
```mermaid
28
-
flowchart TD
29
-
A[Load dictionary file] --> B[load_dictionary: parse key=value lines into dict]
30
-
B --> C{User action?}
31
-
C -->|--lookup term| D[lookup: exact match or fuzzy suggestions]
32
-
C -->|--stats| E[dictionary_stats: count entries, analyse keys]
33
-
C -->|batch lookup| F[batch_lookup: look up many terms]
34
-
D --> G[Print structured result]
35
-
F --> G
36
-
E --> G
37
-
```
38
-
39
-
Four functions to build:
17
+
## Step 1: Load the Dictionary File
40
18
41
-
1. **load_dictionary()** -- parse a file of `key=value` lines into a Python dict
42
-
2. **lookup()** -- search for a term with fuzzy matching fallback
43
-
3. **batch_lookup()** -- look up multiple terms at once
44
-
4. **dictionary_stats()** -- compute summary statistics about the dictionary
19
+
**What to do:** Write a function that reads a text file and builds a Python dict from `key=value` lines.
45
20
46
-
## Step 1: Loading the dictionary
47
-
48
-
The file has one entry per line in `key=value` format. Parse it into a Python dictionary using a dict comprehension:
21
+
**Why:** Everything else depends on having the data in a Python dict. Get this right first, and the rest flows naturally.
- **`line.split("=", 1)`** splits on the **first** `=` only. This is critical because definitions might contain `=` signs (like `formula=E=mc2`). Without `maxsplit=1`, that would break into three parts.
66
-
- **`.lower()` on keys** makes lookups case-insensitive. "Python", "python", and "PYTHON" all map to the same entry.
67
-
- **`if "=" in line`** skips lines that do not have an `=`, like blank lines or comments.
38
+
- **`line.split("=", 1)`** splits on the **first** `=` only. This is critical because definitions might contain `=` signs (like `formula=E=mc2`). Without the `1`, that would break into three parts.
39
+
- **`.lower()` on keys** makes lookups case-insensitive.
40
+
- **`if "=" in line`** skips lines without an `=`, such as blank lines. A comment line that happens to contain `=` would still be parsed as an entry.
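
The three points above can be combined into a short loader. This is a minimal sketch, not the project's actual `load_dictionary()`: it assumes UTF-8 input and that whitespace around keys and values should be stripped, neither of which the walkthrough states.

```python
from pathlib import Path


def load_dictionary(path):
    """Parse key=value lines from a text file into a dict with lowercase keys."""
    # Assumption: the file is UTF-8 encoded.
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {
        # Assumption: surrounding whitespace is not meaningful, so strip it.
        key.strip().lower(): value.strip()
        # split("=", 1) splits on the first "=" only, so "formula=E=mc2"
        # yields exactly two parts: "formula" and "E=mc2".
        for key, value in (line.split("=", 1) for line in lines if "=" in line)
    }
```

Because the filter only checks for the presence of `=`, every line that passes it splits into exactly two parts, so the tuple unpacking cannot fail.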
68
41
69
-
### Predict before you scroll
42
+
**Predict:** If the file has two lines with the same key (e.g., `python=...` appears twice), which definition ends up in the dict? The first one or the last one?
70
43
71
-
If the file has two lines with the same key (e.g., `python=...` appears twice), which definition ends up in the dictionary? The first one or the last one?
44
+
## Step 2: Look Up a Single Term
72
45
73
-
## Step 2: Looking up a term
46
+
**What to do:** Write a `lookup()` function that searches the dictionary for a term and returns a structured result dict.
74
47
75
-
The lookup function has two paths: exact match (the happy path) and fuzzy match (the fallback).
48
+
**Why:** Returning a structured dict (not just a string) means the caller can programmatically check `result["found"]` and decide what to do. This is better than returning `None` or raising an exception for missing terms.
The function uses `try/except KeyError` instead of `if term in dictionary`. Both work, but `try/except` is considered more "Pythonic" when you expect the key to exist most of the time. This is the **EAFP** pattern (Easier to Ask for Forgiveness than Permission).
104
-
105
-
**`difflib.get_close_matches`** uses sequence matching to find similar strings. The `cutoff=0.6` means a match must be at least 60% similar. The `n=3` limits results to the three best matches.
76
+
The function uses `try/except KeyError` instead of `if term in dictionary`. Both work, but `try/except` is considered more Pythonic when you expect the key to usually exist (the "happy path" is fast). This is the **EAFP** pattern (Easier to Ask for Forgiveness than Permission).
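
As a rough illustration of the EAFP shape, here is one way such a `lookup()` could look. Only the `found` and `suggestions` keys appear in this walkthrough; the `term` and `definition` keys, the `strip()` call, and the `n=3, cutoff=0.6` settings are assumptions for this sketch.

```python
import difflib


def lookup(dictionary, term):
    """Return a structured result dict for one term (keys assumed lowercase)."""
    normalized = term.strip().lower()
    try:
        # Happy path: a direct index, no membership check first (EAFP).
        definition = dictionary[normalized]
    except KeyError:
        # Fallback: offer up to three similar keys as suggestions.
        suggestions = difflib.get_close_matches(
            normalized, list(dictionary), n=3, cutoff=0.6
        )
        return {"term": normalized, "found": False, "suggestions": suggestions}
    return {"term": normalized, "found": True, "definition": definition}
```

Keeping only the dictionary access inside the `try` block means a `KeyError` raised anywhere else would not be silently treated as a missing term.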
106
77
107
-
### Predict before you scroll
78
+
**Predict:** If the dictionary contains "python" and the user searches for "pythn" (a typo), will `get_close_matches` find it? What if they search for "xyz"?
108
79
109
-
If the dictionary contains "python" and the user searches for "pythn" (a typo), will `get_close_matches` find it? What if they search for "xyz"?
80
+
## Step 3: Batch Lookup with Enumerate
110
81
111
-
## Step 3: Batch lookup
82
+
**What to do:** Write a `batch_lookup()` function that processes a list of terms and tracks their original position using `enumerate()`.
112
83
113
-
Looking up multiple terms is a thin wrapper around `lookup()`:
84
+
**Why:** When looking up multiple terms, the caller needs to know which result corresponds to which input. `enumerate` gives you the index alongside each item -- this is cleaner than manually tracking a counter variable.
`enumerate()` provides both the index and the value. Adding the index to each result lets the caller track which position each term was in.
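
A minimal sketch of that wrapper follows. The `lookup()` here is a simplified stand-in so the example runs on its own, and the `index` key name is an assumption rather than the project's confirmed output format.

```python
def lookup(dictionary, term):
    """Simplified stand-in for the real lookup(); no fuzzy matching."""
    key = term.strip().lower()
    if key in dictionary:
        return {"term": key, "found": True, "definition": dictionary[key]}
    return {"term": key, "found": False, "suggestions": []}


def batch_lookup(dictionary, terms):
    """Look up each term and record its position in the input list."""
    results = []
    for index, term in enumerate(terms):
        result = lookup(dictionary, term)
        result["index"] = index  # hypothetical key name
        results.append(result)
    return results
```

Storing the index in each result keeps the pairing between input and output explicit even if the caller later filters or reorders the results.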
96
+
**Predict:** If you pass `["Python", "PYTHON", "python"]`, how many unique lookups effectively happen? Are all three results identical?
126
97
127
-
## Step 4: Dictionary statistics
98
+
## Step 4: Compute Dictionary Statistics
128
99
129
-
This function demonstrates **set operations** and **sorting with a key function**:
100
+
**What to do:** Write a `dictionary_stats()` function that uses sets and `sorted()` with a key function.
101
+
102
+
**Why:** This step practices two important patterns: set comprehensions (for unique first letters) and sorting with a custom key function (sorting terms by definition length, not alphabetically).
The set comprehension `{k[0] for k in dictionary if k}` extracts the first character of every key. Since it is a set, duplicates are automatically removed.
122
+
The set comprehension `{k[0] for k in dictionary if k}` extracts the first character of every key. Since it is a set, duplicates are automatically removed. The `if k` guard prevents a crash on empty-string keys.
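
Put together with the key-function sort, a stats function might look like the sketch below. The returned field names are hypothetical; the walkthrough only confirms that the function counts entries, collects first letters, and sorts terms by definition length.

```python
def dictionary_stats(dictionary):
    """Summarize a term -> definition dict (field names are illustrative)."""
    # Set comprehension: one entry per distinct first character.
    first_letters = {k[0] for k in dictionary if k}
    # Sort the keys by the length of their definitions, shortest first.
    by_length = sorted(dictionary, key=lambda k: len(dictionary[k]))
    return {
        "total_entries": len(dictionary),
        "first_letters": sorted(first_letters),
        "shortest_definition": by_length[0] if by_length else None,
        "longest_definition": by_length[-1] if by_length else None,
    }
```

The `if by_length else None` guards keep the function from raising `IndexError` on an empty dictionary.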
123
+
124
+
**Predict:** What does `sorted(first_letters)` do that the set alone does not? (Hint: sets have no guaranteed order.)
150
125
151
-
`sorted(..., key=lambda k: len(dictionary[k]))` sorts keys by the length of their definitions. The `lambda` is an inline function that tells `sorted()` what value to compare.
126
+
## Step 5: Wire Up the CLI
152
127
153
-
## Common mistakes
128
+
**What to do:** Use `argparse` to create `--dict`, `--lookup`, and `--stats` command-line options, then call your functions from `main()`.
154
129
155
-
| Mistake | Why it happens | How to fix |
156
-
|---------|---------------|------------|
130
+
**Why:** A CLI makes your tool usable from the terminal. `argparse` handles parsing, validation, and help text so you do not have to write that boilerplate yourself.
print(f"{term}: not found — suggestions: {r['suggestions']}")
155
+
```
156
+
157
+
**Predict:** What happens if the user runs the script with no `--lookup` and no `--stats`? Trace through the code to find out.
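
To experiment with the argument parsing separately from the rest of `main()`, a parser can be built in its own function. This is a sketch under assumptions: making `--dict` required and letting `--lookup` accept several terms are choices made for illustration, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Build the command-line parser (option behavior here is assumed)."""
    parser = argparse.ArgumentParser(description="Dictionary lookup service")
    # Assumption: the dictionary file path is mandatory.
    parser.add_argument("--dict", required=True, help="path to the key=value file")
    # Assumption: one or more terms may follow --lookup.
    parser.add_argument("--lookup", nargs="+", metavar="TERM", help="terms to look up")
    parser.add_argument("--stats", action="store_true", help="print summary statistics")
    return parser
```

Calling `build_parser().parse_args(["--dict", "words.txt"])` in a Python shell lets you inspect the resulting namespace without touching a real file, which helps when tracing the no-options case.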
158
+
159
+
## Common Mistakes
160
+
161
+
| Mistake | Why It Happens | Fix |
162
+
|---------|---------------|-----|
157
163
| `line.split("=")` breaks definitions containing `=` | Default split divides on every `=` | Use `split("=", 1)` to split on first `=` only |
158
-
| Lookup is case-sensitive | Forgetting to normalise | `.lower()` both the keys (at load time) and the search term |
164
+
| Lookup is case-sensitive | Forgetting to normalize | `.lower()` both the keys (at load time) and the search term |
165
+
| `dictionary[term]` crashes on missing key | Using direct access without handling | Either use `try/except KeyError` or `dictionary.get(term)` |
159
166
| `get_close_matches` returns nothing useful | Cutoff is too high for the input | Lower the cutoff (try 0.5) or check that the dictionary has enough entries |
160
-
| `dictionary[term]` crashes on missing key | Using direct access without checking | Either use `try/except KeyError` or `dictionary.get(term)` |
161
167
162
-
## Testing your solution
163
-
164
-
Run the tests from the project directory:
168
+
## Testing Your Solution
165
169
166
170
```bash
167
171
pytest -q
168
172
```
169
173
170
-
The nine tests check:
171
-
- Dictionary loading parses entries correctly
172
-
- Exact lookups return the definition
173
-
- Missing keys return `found: False` with suggestions
174
-
- Batch lookup processes multiple terms
175
-
- Stats compute correct totals and first letters
176
-
- Edge cases: empty terms, duplicate keys, `=` in definitions
1. Add a `--add` option that appends a new `key=value` line to the dictionary file
188
-
2. Implement a reverse lookup: given a word that appears in any definition, find which terms contain it
188
+
- **Dict comprehensions** build dictionaries in a single expression, which is more readable than a loop when the logic is straightforward.
189
+
- **`try/except KeyError` vs `dict.get()`** are two ways to handle missing keys -- `try/except` is better when you expect the key to usually exist (the happy path is fast), while `.get()` is better when missing keys are common.
190
+
- **`difflib.get_close_matches`** provides fuzzy string matching using sequence similarity -- it compares character patterns, not meanings, so "pythn" matches "python" but "snake" does not.
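
The last takeaway can be checked directly. The sketch below uses the sample terms from the dictionary file and a small `similarity` helper (a hypothetical name) built on `difflib.SequenceMatcher`, which exposes the same kind of ratio that `get_close_matches` compares against its cutoff.

```python
import difflib

terms = ["python", "javascript", "html"]


def similarity(a, b):
    """Return difflib's similarity ratio for two strings (0.0 to 1.0)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# A typo shares most of its characters, in order, with the real term.
typo_matches = difflib.get_close_matches("pythn", terms, n=3, cutoff=0.6)

# A related word shares meaning but almost no characters.
synonym_matches = difflib.get_close_matches("snake", terms, n=3, cutoff=0.6)

print(typo_matches, round(similarity("pythn", "python"), 2))
print(synonym_matches, round(similarity("snake", "python"), 2))
```

Printing the ratios next to the matches makes it easier to judge whether a cutoff such as 0.6 is too strict or too loose for your own data.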