Commit 482bc83

travisjneuman and claude committed
feat: flesh out SOLUTION.md for all 15 level-2 projects
Add complete annotated solutions with WHY comments, design decisions tables, alternative approaches, and common pitfalls for all level-2 projects covering dictionary operations, data cleaning, error handling, deduplication, benchmarking, and the mini capstone.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ce4ff36 commit 482bc83

15 files changed: +3607 -588 lines changed

Lines changed: 187 additions & 37 deletions
@@ -1,56 +1,206 @@
-# Solution: Level 2 / Project 01 - Dictionary Lookup Service
+# Dictionary Lookup Service: Annotated Solution

-> **STOP** - Have you attempted this project yourself first?
->
-> Learning happens in the struggle, not in reading answers.
-> Spend at least 20 minutes trying before reading this solution.
-> If you are stuck, try the [Walkthrough](./WALKTHROUGH.md) first - it guides
-> your thinking without giving away the answer.
+> **STOP!** Try solving this yourself first. Use the [project README](./README.md) and [walkthrough](./WALKTHROUGH.md) before reading the solution.

 ---

-
-## Complete solution
+## Complete Solution

```diff
-# WHY load_dictionary: [explain the design reason]
-# WHY lookup: [explain the design reason]
-# WHY batch_lookup: [explain the design reason]
-# WHY dictionary_stats: [explain the design reason]
-# WHY parse_args: [explain the design reason]
-# WHY main: [explain the design reason]
-
-# [paste the complete working solution here]
-# Include WHY comments on every non-obvious line.
+"""Dictionary Lookup Service - complete annotated solution."""
+
+from __future__ import annotations
+
+import argparse
+import json
+from pathlib import Path
+
+# WHY: difflib is a stdlib module that provides fuzzy string matching.
+# We use it to suggest corrections when a lookup misses, which is a
+# much better user experience than just "not found".
+import difflib
+
+
+def load_dictionary(path: Path) -> dict[str, str]:
+    """Load a dictionary from a file of 'key=value' lines."""
+    if not path.exists():
+        raise FileNotFoundError(f"Dictionary file not found: {path}")
+
+    raw = path.read_text(encoding="utf-8").splitlines()
+
+    # WHY: Dict comprehension with split("=", 1); the maxsplit=1 argument
+    # ensures definitions containing '=' characters are preserved intact.
+    # Without it, "url=https://a.com/b=c" would split into 3 parts and break.
+    entries = {
+        parts[0].strip().lower(): parts[1].strip()
+        for line in raw
+        if "=" in line
+        for parts in [line.split("=", 1)]
+    }
+    return entries
+
+
+def lookup(dictionary: dict[str, str], term: str) -> dict:
+    """Look up a term with fuzzy matching fallback."""
+    # WHY: Normalise to lowercase so "Python", "PYTHON", and "python"
+    # all find the same entry; users should not need to know the exact case.
+    normalised = term.strip().lower()
+
+    try:
+        # WHY: Using dict[key] with try/except instead of dict.get() because
+        # we want different behaviour for hit vs miss; try/except makes
+        # the two paths explicit and easy to extend.
+        definition = dictionary[normalised]
+        return {
+            "found": True,
+            "term": normalised,
+            "definition": definition,
+            "suggestions": [],
+        }
+    except KeyError:
+        # WHY: get_close_matches uses SequenceMatcher internally. The cutoff
+        # of 0.6 means a word must be at least 60% similar to be suggested.
+        # n=3 limits suggestions to the 3 best matches.
+        suggestions = difflib.get_close_matches(
+            normalised, dictionary.keys(), n=3, cutoff=0.6
+        )
+        return {
+            "found": False,
+            "term": normalised,
+            "definition": None,
+            "suggestions": suggestions,
+        }
+
+
+def batch_lookup(
+    dictionary: dict[str, str], terms: list[str]
+) -> list[dict]:
+    """Look up many terms, tracking original order with enumerate."""
+    results = []
+    for idx, term in enumerate(terms):
+        result = lookup(dictionary, term)
+        # WHY: Attaching the original index lets callers correlate results
+        # back to input order, which matters for batch processing.
+        result["index"] = idx
+        results.append(result)
+    return results
+
+
+def dictionary_stats(dictionary: dict[str, str]) -> dict:
+    """Compute statistics about the dictionary."""
+    # WHY: Set comprehension collects unique first letters in O(n) time.
+    # Sets automatically discard duplicates.
+    first_letters: set[str] = {k[0] for k in dictionary if k}
+
+    # WHY: sorted() with a key function lets us rank entries by definition
+    # length without modifying the original dict.
+    sorted_by_length = sorted(
+        dictionary.keys(),
+        key=lambda k: len(dictionary[k]),
+        reverse=True,
+    )
+
+    return {
+        "total_entries": len(dictionary),
+        "unique_first_letters": sorted(first_letters),
+        "longest_definitions": sorted_by_length[:5],
+        "shortest_definitions": sorted_by_length[-5:],
+    }
+
+
+def parse_args() -> argparse.Namespace:
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser(
+        description="Dictionary lookup with fuzzy matching"
+    )
+    parser.add_argument(
+        "--dict",
+        default="data/sample_input.txt",
+        help="Path to the dictionary file (key=value per line)",
+    )
+    parser.add_argument(
+        "--lookup",
+        nargs="*",
+        default=[],
+        help="Terms to look up",
+    )
+    parser.add_argument(
+        "--stats",
+        action="store_true",
+        help="Print dictionary statistics",
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    """Entry point: load dictionary, run lookups, print results."""
+    args = parse_args()
+    dictionary = load_dictionary(Path(args.dict))
+
+    if args.stats:
+        stats = dictionary_stats(dictionary)
+        print(f"=== Dictionary Statistics ===")
+        for key, value in stats.items():
+            print(f" {key}: {value}")
+        return
+
+    if args.lookup:
+        results = batch_lookup(dictionary, args.lookup)
+    else:
+        samples = list(dictionary.keys())[:3] + ["nonexistent"]
+        results = batch_lookup(dictionary, samples)
+
+    print(f"=== Lookup Results ===\n")
+    print(f" {'Term':<20} {'Found':>6} {'Definition / Suggestions'}")
+    print(f" {'-'*20} {'-'*6} {'-'*30}")
+    for r in results:
+        term = r["term"]
+        if r["found"]:
+            print(f" {term:<20} {'yes':>6} {r['definition']}")
+        else:
+            suggestions = ", ".join(r.get("suggestions", []))
+            hint = f"Did you mean: {suggestions}" if suggestions else "(no matches)"
+            print(f" {term:<20} {'no':>6} {hint}")
+
+
+if __name__ == "__main__":
+    main()
```
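
To try the parsing step from the solution without creating a file, the same comprehension can be run on in-memory lines. This is a minimal sketch: the helper name `parse_lines` and the sample entries are assumptions made for illustration and are not part of the commit.

```python
def parse_lines(lines: list[str]) -> dict[str, str]:
    """Hypothetical helper: the load_dictionary comprehension, minus file I/O."""
    return {
        parts[0].strip().lower(): parts[1].strip()
        for line in lines
        if "=" in line                     # skip lines without a separator
        for parts in [line.split("=", 1)]  # bind the split result once per line
    }


# Sample data invented for this example.
sample = [
    "Python = a programming language",
    "url=https://a.com/b=c",
    "this line has no separator",
]
entries = parse_lines(sample)
print(entries)
```

Keys come back lowercased and stripped, the URL keeps its second `=`, and the line without a separator is skipped.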

-## Design decisions
+## Design Decisions

-| Decision | Why | Alternative considered |
-|----------|-----|----------------------|
-| load_dictionary function | [reason] | [alternative] |
-| lookup function | [reason] | [alternative] |
-| batch_lookup function | [reason] | [alternative] |
+| Decision | Why |
+|----------|-----|
+| `split("=", 1)` for parsing | Definitions may contain `=` characters (e.g. URLs). Splitting on only the first `=` preserves the full definition. |
+| `try/except KeyError` instead of `dict.get()` | The two code paths (found vs not found) are very different. Using try/except makes each path explicit and keeps the happy path clean. |
+| Lowercase normalisation on load | Case-insensitive lookups are the expected default. Normalising keys once at load time means each lookup only has to lowercase the search term, not every key. |
+| Fuzzy matching with `difflib` | Returning "did you mean?" suggestions transforms a dead-end miss into a helpful interaction, which is critical for user-facing tools. |
+| Returning structured dicts, not strings | Dicts are machine-readable. Callers can decide how to display results (table, JSON, GUI) without parsing strings. |
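
The effect of the `cutoff` argument in the fuzzy-matching decision can be checked in isolation. The sketch below calls `difflib.get_close_matches` on a small, made-up key list; the keys and the misspelled term are assumptions chosen for illustration.

```python
import difflib

# Hypothetical dictionary keys for this example.
keys = ["python", "java", "javascript"]

# Same settings as the solution: up to 3 matches, at least 60% similar.
loose = difflib.get_close_matches("pyhton", keys, n=3, cutoff=0.6)

# A stricter cutoff rejects the same typo.
strict = difflib.get_close_matches("pyhton", keys, n=3, cutoff=0.9)

print(loose, strict)
```

Raising the cutoff trades fewer false suggestions for more dead-end misses, so the right value depends on how noisy the input is expected to be.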

-## Alternative approaches
+## Alternative Approaches

-### Approach B: [Name]
+### Using `dict.get()` instead of `try/except`

```diff
-# [Different valid approach with trade-offs explained]
+def lookup_with_get(dictionary, term):
+    normalised = term.strip().lower()
+    definition = dictionary.get(normalised)
+    if definition is not None:
+        return {"found": True, "term": normalised, "definition": definition}
+    suggestions = difflib.get_close_matches(normalised, dictionary.keys())
+    return {"found": False, "term": normalised, "suggestions": suggestions}
```

-**Trade-off:** [When you would prefer this approach vs the primary one]
+This is simpler and perfectly valid. The trade-off: `dict.get()` cannot distinguish between a key that maps to `None` and a missing key. In this project that does not matter (all values are strings), but the `try/except` pattern is more general and worth practicing.
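
If `dict.get()` is preferred but `None` values are possible, one common workaround is a sentinel default. The sketch below is an assumption-level illustration: the names `_MISSING` and `classify` and the sample entries are hypothetical, not part of the commit.

```python
# Hypothetical sentinel: a unique object that can never be a stored value.
_MISSING = object()


def classify(dictionary: dict, key: str) -> str:
    """Report whether a key is absent, present with None, or present with a value."""
    value = dictionary.get(key, _MISSING)
    if value is _MISSING:
        return "missing"
    if value is None:
        return "present but None"
    return "present"


# Sample data invented for this example.
entries = {"python": "a programming language", "todo": None}
for key in ("python", "todo", "ruby"):
    print(key, "->", classify(entries, key))
```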
+
+### Using the `csv` module instead of manual parsing
+
+For more complex dictionary files (quoted values, multi-line entries), Python's `csv` module handles edge cases automatically. The manual approach here is chosen because the file format is simple and it teaches `str.split()` mechanics directly.
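
A rough sketch of the `csv` alternative follows, assuming the file uses `=` as the delimiter and wraps any value containing `=` or a line break in double quotes. The sample text is invented for illustration; an unquoted value containing `=` would still be split into extra fields.

```python
import csv
import io

# Hypothetical file contents: one plain entry, one quoted value containing '=',
# and one quoted value spanning two lines.
raw = (
    "python=a programming language\n"
    'url="https://a.com/b=c"\n'
    'note="line one\nline two"\n'
)

reader = csv.reader(io.StringIO(raw), delimiter="=")
entries = {
    row[0].strip().lower(): row[1].strip()
    for row in reader
    if len(row) >= 2  # ignore blank or separator-less rows
}
print(entries)
```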

-## What could go wrong
+## Common Pitfalls

-| Scenario | What happens | Prevention |
-|----------|-------------|------------|
-| [bad input] | [error/behavior] | [how to handle] |
-| [edge case] | [behavior] | [how to handle] |
+1. **Splitting on every `=` sign**: Using `line.split("=")` without the `maxsplit=1` argument will break definitions containing `=`. For example, `url=https://example.com/a=b` would incorrectly split into three parts instead of two.

-## Key takeaways
+2. **Forgetting to normalise input**: If you normalise keys to lowercase at load time but forget to lowercase the search term, "Python" will not match "python". Always normalise both sides of a comparison.

-1. [Most important lesson from this project]
-2. [Second lesson]
-3. [Connection to future concepts]
+3. **Bare `except:` instead of `except KeyError`**: Catching all exceptions hides real bugs (like the `TypeError` raised by an unhashable key such as a list) and also swallows `KeyboardInterrupt` and `SystemExit`. Always catch the most specific exception type you expect.
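
The third pitfall can be seen in a few lines. In this sketch the two helper functions and the sample data are hypothetical, written only to contrast a bare `except:` with `except KeyError` when the key is an unhashable list.

```python
def lookup_specific(dictionary: dict, key):
    """Catch only the exception a missing key raises."""
    try:
        return dictionary[key]
    except KeyError:
        return None


def lookup_bare(dictionary: dict, key):
    """Deliberately too broad, for illustration only."""
    try:
        return dictionary[key]
    except:  # noqa: E722
        return None


entries = {"python": "a programming language"}
bad_key = ["python"]  # a list is unhashable, so indexing raises TypeError

# The bare except reports the bug as an ordinary miss.
hidden = lookup_bare(entries, bad_key)

# The specific except lets the TypeError surface to the caller.
try:
    lookup_specific(entries, bad_key)
    surfaced = False
except TypeError:
    surfaced = True

print(hidden, surfaced)
```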
