# Solution: Level 7 / Project 01 - API Query Adapter

> **STOP — Try it yourself first!**
>
> You learn by building, not by reading answers. Spend at least 30 minutes
> attempting this project before looking here.
>
> - Re-read the [README](./README.md) for requirements
> - Try the [WALKTHROUGH](./WALKTHROUGH.md) for guided hints without spoilers

---

## Complete solution

```python
"""Level 7 / Project 01 — API Query Adapter.

Adapts different API response formats into a unified schema.
Uses simulated API responses (no network calls) to teach
normalization patterns.
"""

from __future__ import annotations

import argparse
import json
import logging
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable


# ---------------------------------------------------------------------------
# Unified schema
# ---------------------------------------------------------------------------

# WHY a dataclass for the unified record? -- Each API returns different field
# names (item_id vs id vs sku). By funnelling everything into one shape,
# downstream code only understands ONE interface. This is the Adapter pattern:
# many inputs, one output contract.
@dataclass
class UnifiedRecord:
    id: str
    name: str
    value: float
    source: str
    timestamp: str


# ---------------------------------------------------------------------------
# Simulated API responses (mock data)
# ---------------------------------------------------------------------------

# WHY mock data instead of real HTTP? -- At this level we are learning the
# *pattern*, not the network layer. Mocks let us test adapters in isolation
# without flaky network dependencies or API keys.
MOCK_API_A = [
    {"item_id": "A-001", "item_name": "Widget", "price": 9.99, "ts": "2025-01-15T08:00:00"},
    {"item_id": "A-002", "item_name": "Gadget", "price": 24.99, "ts": "2025-01-15T09:00:00"},
]

MOCK_API_B = [
    {"id": "B-001", "label": "Bolt Pack", "cost": 3.49, "created": "2025-01-15T10:00:00"},
    {"id": "B-002", "label": "Nut Set", "cost": 2.99, "created": "2025-01-15T11:00:00"},
]

MOCK_API_C = [
    {"sku": "C-001", "title": "Spring", "amount": 1.50, "date": "2025-01-15T12:00:00"},
]


# ---------------------------------------------------------------------------
# Adapters — one per source, each maps source fields → UnifiedRecord
# ---------------------------------------------------------------------------

# WHY one function per API? -- Each source has its own quirks (field names,
# nesting, optional fields). Isolating the mapping into its own function
# means a change to API A's format only touches adapt_api_a — zero risk to
# the other adapters.
def adapt_api_a(raw: list[dict]) -> list[UnifiedRecord]:
    """Adapter for API A: uses item_id, item_name, price, ts."""
    results = []
    for r in raw:
        results.append(UnifiedRecord(
            id=r["item_id"], name=r["item_name"],
            value=r["price"], source="api_a", timestamp=r["ts"],
        ))
    return results


def adapt_api_b(raw: list[dict]) -> list[UnifiedRecord]:
    """Adapter for API B: uses id, label, cost, created."""
    results = []
    for r in raw:
        results.append(UnifiedRecord(
            id=r["id"], name=r["label"],
            value=r["cost"], source="api_b", timestamp=r["created"],
        ))
    return results


def adapt_api_c(raw: list[dict]) -> list[UnifiedRecord]:
    """Adapter for API C: uses sku, title, amount, date."""
    results = []
    for r in raw:
        results.append(UnifiedRecord(
            id=r["sku"], name=r["title"],
            value=r["amount"], source="api_c", timestamp=r["date"],
        ))
    return results


# ---------------------------------------------------------------------------
# Adapter registry
# ---------------------------------------------------------------------------

# WHY a registry dict instead of if/elif? -- Adding a new API means adding
# one dict entry, not modifying control flow. The registry is also iterable,
# so query_all_sources can loop over it without knowing adapter names in advance.
ADAPTERS: dict[str, Callable[..., Any]] = {
    "api_a": adapt_api_a,
    "api_b": adapt_api_b,
    "api_c": adapt_api_c,
}


def adapt_response(source: str, raw: list[dict]) -> list[UnifiedRecord]:
    """Route raw data to the correct adapter by source name."""
    adapter = ADAPTERS.get(source)
    if adapter is None:
        # WHY raise instead of silent skip? -- A missing adapter is a
        # configuration bug, not a data issue. Failing loudly prevents
        # silently dropping an entire source of records.
        raise ValueError(f"No adapter for source '{source}'. Available: {list(ADAPTERS.keys())}")
    return adapter(raw)


# ---------------------------------------------------------------------------
# Query engine
# ---------------------------------------------------------------------------

def query_all_sources(
    sources: dict[str, list[dict]] | None = None,
) -> list[UnifiedRecord]:
    """Query all configured sources and merge into unified records."""
    if sources is None:
        sources = {"api_a": MOCK_API_A, "api_b": MOCK_API_B, "api_c": MOCK_API_C}

    all_records: list[UnifiedRecord] = []
    for source_name, raw_data in sources.items():
        try:
            records = adapt_response(source_name, raw_data)
            all_records.extend(records)
            logging.info("adapted source=%s records=%d", source_name, len(records))
        except (KeyError, ValueError) as exc:
            # WHY catch and continue? -- One broken source should not prevent
            # the other sources from being processed. Log the error so
            # operators can investigate, but keep the pipeline running.
            logging.warning("skip source=%s error=%s", source_name, exc)

    return all_records


def filter_records(
    records: list[UnifiedRecord],
    min_value: float | None = None,
    source: str | None = None,
) -> list[UnifiedRecord]:
    """Filter unified records by optional criteria."""
    result = records
    if min_value is not None:
        result = [r for r in result if r.value >= min_value]
    if source is not None:
        result = [r for r in result if r.source == source]
    return result


# ---------------------------------------------------------------------------
# Orchestrator
# ---------------------------------------------------------------------------

def run(input_path: Path, output_path: Path) -> dict:
    """Load source config, adapt all APIs, write unified output."""
    if input_path.exists():
        config = json.loads(input_path.read_text(encoding="utf-8"))
        sources = config.get("sources", None)
    else:
        sources = None  # WHY fallback? -- Use built-in mocks when no config file

    if sources is None:
        # WHY default here rather than inside query_all_sources? -- The summary
        # below reports sources_queried, so the count must reflect whatever
        # was actually used, not a hard-coded number.
        sources = {"api_a": MOCK_API_A, "api_b": MOCK_API_B, "api_c": MOCK_API_C}

    start = time.perf_counter()
    records = query_all_sources(sources)
    elapsed_ms = round((time.perf_counter() - start) * 1000, 1)

    summary = {
        "total_records": len(records),
        "sources_queried": len(sources),
        "elapsed_ms": elapsed_ms,
        "records": [
            {"id": r.id, "name": r.name, "value": r.value,
             "source": r.source, "timestamp": r.timestamp}
            for r in records
        ],
    }

    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    logging.info("adapted %d records in %.1fms", len(records), elapsed_ms)
    return summary


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="API Query Adapter — normalize multiple API formats"
    )
    parser.add_argument("--input", default="data/sample_input.json")
    parser.add_argument("--output", default="data/output_summary.json")
    parser.add_argument("--run-id", default="manual-run")
    return parser.parse_args()


def main() -> None:
    logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")
    args = parse_args()
    summary = run(Path(args.input), Path(args.output))
    print(json.dumps(summary, indent=2))


if __name__ == "__main__":
    main()
```

## Design decisions

| Decision | Why | Alternative considered |
|----------|-----|----------------------|
| Dataclass for `UnifiedRecord` | Typed fields catch mismatches at construction time; immutable-feeling shape communicates the contract clearly | Plain dict -- flexible but no field-name typo protection |
| Registry dict for adapters | Open/closed principle -- add new sources without modifying dispatch logic | if/elif chain -- works but every new source modifies the same function |
| `try/except` around each source in `query_all_sources` | One broken source should not take down the whole pipeline | Fail-fast -- simpler but less resilient in production |
| Mock data instead of HTTP | Focuses on the pattern (normalization), not the transport layer | `responses` or `httpx` mock -- realistic but adds dependencies |

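
The open/closed claim is easiest to see in code. Below is a condensed sketch, not the full solution: the fourth source `api_d` and its field names (`ref`, `desc`, `total`, `when`) are hypothetical, invented purely to show that adding a source means one new function plus one registry entry, with zero changes to dispatch logic.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class UnifiedRecord:
    id: str
    name: str
    value: float
    source: str
    timestamp: str

# Condensed registry with a single existing adapter.
def adapt_api_a(raw: list[dict]) -> list[UnifiedRecord]:
    return [UnifiedRecord(r["item_id"], r["item_name"], r["price"], "api_a", r["ts"])
            for r in raw]

ADAPTERS: dict[str, Callable[[list[dict]], list[UnifiedRecord]]] = {"api_a": adapt_api_a}

# Onboarding the hypothetical api_d source: one new function...
def adapt_api_d(raw: list[dict]) -> list[UnifiedRecord]:
    return [UnifiedRecord(r["ref"], r["desc"], r["total"], "api_d", r["when"])
            for r in raw]

# ...and one registry entry. No existing adapter or dispatch code changes.
ADAPTERS["api_d"] = adapt_api_d

records = ADAPTERS["api_d"]([{"ref": "D-001", "desc": "Washer", "total": 0.25,
                              "when": "2025-01-15T13:00:00"}])
print(records[0].id, records[0].source)  # → D-001 api_d
```

An if/elif dispatcher would instead require editing the same shared function for every new source, which is exactly the coupling the registry avoids.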
## Alternative approaches

### Approach B: Class-based adapters with a common Protocol

```python
from typing import Protocol

class SourceAdapter(Protocol):
    def adapt(self, raw: list[dict]) -> list[UnifiedRecord]: ...

class ApiAAdapter:
    def adapt(self, raw: list[dict]) -> list[UnifiedRecord]:
        return [
            UnifiedRecord(
                id=r["item_id"], name=r["item_name"],
                value=r["price"], source="api_a", timestamp=r["ts"],
            )
            for r in raw
        ]
```

**Trade-off:** Class-based adapters are better when each source needs its own state (auth tokens, pagination cursors). The function-based approach here is simpler because we have no per-source state.
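
To make the stateful case concrete, here is a minimal self-contained sketch. It assumes a hypothetical token-authenticated, paginated variant of API B; the `auth_token` parameter and the cursor bookkeeping are invented for illustration and are not part of the project spec.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UnifiedRecord:
    id: str
    name: str
    value: float
    source: str
    timestamp: str

class ApiBAdapter:
    """Class-based adapter carrying per-source state that a plain
    function would otherwise have to smuggle in via globals or closures."""

    def __init__(self, auth_token: str) -> None:
        self.auth_token = auth_token     # hypothetical: source B requires a token
        self.cursor: Optional[str] = None  # remembers where the last page ended

    def adapt(self, raw: list[dict]) -> list[UnifiedRecord]:
        records = [
            UnifiedRecord(r["id"], r["label"], r["cost"], "api_b", r["created"])
            for r in raw
        ]
        if raw:
            self.cursor = raw[-1]["id"]  # a later query could resume after this record
        return records

adapter = ApiBAdapter(auth_token="demo-token")
page = adapter.adapt([{"id": "B-001", "label": "Bolt Pack", "cost": 3.49,
                       "created": "2025-01-15T10:00:00"}])
print(page[0].name, adapter.cursor)  # → Bolt Pack B-001
```

The instance keeps token and cursor together with the mapping logic, so each configured source can carry independent credentials and progress.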
47 | 262 |
|
48 | | -## What could go wrong |
| 263 | +## Common pitfalls |
49 | 264 |
|
50 | 265 | | Scenario | What happens | Prevention | |
51 | 266 | |----------|-------------|------------| |
52 | | -| [bad input] | [error/behavior] | [how to handle] | |
53 | | -| [edge case] | [behavior] | [how to handle] | |
54 | | - |
55 | | -## Key takeaways |
56 | | - |
57 | | -1. [Most important lesson from this project] |
58 | | -2. [Second lesson] |
59 | | -3. [Connection to future concepts] |
| 267 | +| Source returns a field with a new name after an API update | `KeyError` crashes the adapter for that source | Wrap field access in `.get()` with a default, or catch `KeyError` per record | |
| 268 | +| Two sources return records with the same `id` | Downstream consumers silently get duplicate IDs | Add a dedup step or prefix IDs with the source name (e.g. `api_a:A-001`) | |
| 269 | +| A source returns an empty list | No crash, but `total_records` may be misleadingly low | Log a warning when a source returns zero records so operators notice | |
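
The preventions above can be combined into a tolerant adapter variant. This is a minimal sketch, not part of the main solution: the skip-and-log policy, the `0.0` price default, and the plain-dict output shape are illustrative choices you would tune to your own requirements.

```python
import logging

def adapt_api_a_defensive(raw: list[dict]) -> list[dict]:
    """Tolerant variant of the API A adapter: skips malformed records
    instead of raising, and prefixes ids with the source name."""
    records = []
    for r in raw:
        item_id = r.get("item_id")
        name = r.get("item_name")
        if item_id is None or name is None:
            # A malformed record is logged and skipped, not fatal.
            logging.warning("skip malformed record: %r", r)
            continue
        records.append({
            "id": f"api_a:{item_id}",      # source prefix avoids cross-source id collisions
            "name": name,
            "value": r.get("price", 0.0),  # illustrative default when price is missing
            "source": "api_a",
            "timestamp": r.get("ts", ""),
        })
    if not records:
        # Surfaces the "empty source" pitfall so operators notice.
        logging.warning("source api_a returned zero usable records")
    return records

rows = adapt_api_a_defensive([
    {"item_id": "A-001", "item_name": "Widget", "price": 9.99, "ts": "2025-01-15T08:00:00"},
    {"item_name": "Broken"},  # missing item_id → skipped, not crashed
])
print(rows[0]["id"])  # → api_a:A-001
```

Whether to skip or fail on malformed records is a policy decision: skipping keeps the pipeline running, while failing fast surfaces upstream format changes sooner.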