Skip to content

Baseline test improvements: snapshots, local data, docstrings, infra cleanup#140

Merged
JerBouma merged 1 commit into
JerBouma:mainfrom
dokson:feature/pretty-print-json-snapshots
May 19, 2026
Merged

Baseline test improvements: snapshots, local data, docstrings, infra cleanup#140
JerBouma merged 1 commit into
JerBouma:mainfrom
dokson:feature/pretty-print-json-snapshots

Conversation

@dokson
Copy link
Copy Markdown
Contributor

@dokson dokson commented May 18, 2026

Summary

A baseline cleanup of the test suite so future data and code PRs are easier to develop, review, and triage. All changes live in tests/ (plus a one-character typo fix in database/etfs.csv that surfaced while validating the new local-data setup). All 32 tests pass; black + pyright clean.

Three structural problems were addressed, each motivating one of the commit's logical pieces:

  1. CI was testing main, not the PR. The library defaults to use_local_location=False, fetching compression/* over HTTP from the main branch. Tests on a PR therefore validated main's data instead of the PR's. Result: data-breaking PRs passed silently, data-fixing PRs (like ASE exchange fix + ISIN/FIGI backfill from public data #143) showed red CI for changes already correct on disk.
  2. Snapshot diffs were unreadable. JSON snapshots were single-line 1–2 KB strings; CSV mismatch failures truncated both sides at 500 chars and often appeared identical in the visible window (we hit this on Fix 1,084 placeholder names + 614 country + 35 exchange from yfinance #141 and ASE exchange fix + ISIN/FIGI backfill from public data #143). There was no usable signal in the failure message.
  3. The test runner had hidden lint debt. # pylint: skip-file in conftest.py masked real type issues; markers (record_stdout) were not declared; module docstrings were copy-paste wrong (test_equities.py and test_moneymarkets.py both said """Currencies Test Module"""); function docstrings were globally disabled via # pylint: disable=missing-function-docstring.

1. Tests now read the PR branch's data

Two cooperating changes:

(a) use_local_location=True in every tests/test_*.py. Module-level instantiations become etfs = fd.ETFs(use_local_location=True) etc., so the library reads compression/* from the local checkout rather than HTTP-fetching main.

(b) tests/conftest.py regenerates compression at import time. Mirrors the production Database Update workflow exactly:

df.to_csv(f"compression/{asset}.bz2", index=False, compression="bz2")
# + the per-asset categories.gzip pivot

It runs as a top-level statement (not a fixture), because the asset-class test modules instantiate fd.X(use_local_location=True) at import time — a @pytest.fixture(scope="session", autouse=True) would run too late.

(c) Clean working tree after pytest. Auto-regenerating compression artifacts on every run would otherwise leave 14 modified binary files in git status. To prevent that, the conftest snapshots the original bytes first and registers a pytest_sessionfinish hook that restores them at the end of the session. Net effect: contributors run pytest tests/ and see no spurious diff afterwards.

Why pytest_sessionfinish and not atexit: both fire on normal exit and on SIGINT (pytest forwards). pytest_sessionfinish is idiomatic pytest, doesn't fire if conftest.py is imported outside a pytest session (e.g. by an IDE static analyzer), and signals intent better.

2. Snapshot diffs are now readable

JSON: one entry per line

- string_value = json.dumps(data, **kwargs)
+ string_value = json.dumps(data, indent=2, ensure_ascii=False, **kwargs)

Diffs go from one giant string change to one-line-per-entry, and non-ASCII characters render natively (Côte d'Ivoire instead of escaped sequences).

Unified-diff on assertion failure

Recorder.assert_equal() was raising:

AssertionError: Change detected
Record    : tests/csv/test_etfs/test_select_4.csv
Expected  : symbol,name,currency,…  AAA,AAF First Priority CLO Bond ETF,USD,…
Actual    : symbol,name,currency,…  AAA,AAF First Priority CLO Bond ETF,USD,…

— both strings truncated at 500 chars, often appearing identical in the visible part. Now produces a difflib.unified_diff showing exactly which line(s) changed. Recorder.assert_in_list gained a proper assertion message too.

3. Test module quality (incidental cleanup)

While touching every test file for (1) and (2):

  • Module docstrings fixed. test_equities.py and test_moneymarkets.py both started with """Currencies Test Module""" (copy-paste leftover). Corrected to the right asset class for all 7 files.
  • Function docstrings added. Replaced # pylint: disable=missing-function-docstring with real one-line descriptions on every test function in the 7 asset-class files and in test_sec_enrichment_controller.py (which had none on its 10 functions).
  • Type hints added. Signatures are now (recorder: Recorder) -> None. Recorder import sits under TYPE_CHECKING and is guarded by from __future__ import annotations — zero runtime cost.

4. conftest.py cleanup

  • Removed # pylint: skip-file so the file is now subject to static analysis.
  • Cast request.config.getoption("--rewrite-expected") through bool(...) to satisfy the declared -> bool annotation (was Any | None).
  • The remaining Recorder class is unchanged in behaviour; only the JSON pretty-print and difflib-based diff are real semantic improvements.

5. Test infrastructure config (pyproject.toml)

  • Declared [tool.pytest.ini_options]: testpaths = ["tests"], addopts = "--strict-markers", and the record_stdout marker so any future PR introducing an undeclared marker fails loudly instead of emitting a warning.
  • poetry.lock regenerated (Poetry 2.4.1) to match the markers config addition. Both pytest-recording (provides disable_recording / VCR fixtures) and pytest-recorder (provides record_stdout / record_time) are kept — they supply complementary fixture sets both used by conftest.py.

6. Side fix surfaced by the new local-data tests

ASPY (Leverage Shares ASYMmetric 500 ETF) had family = "ASYMshsare" — a typo in database/etfs.csv that was previously hidden by the old test pattern (etfs.select(family="ASYMshsare") matched the typo'd value in the bz2 generated from the same CSV). Corrected to ASYMshares; the affected references in tests/test_etfs.py and the 3 snapshot files were updated. The maintainer's Update Compression Files workflow will regenerate the binary artifacts on main post-merge.

Diff shape

Bucket Files Notes
tests/conftest.py 1 compression regen + restore + difflib + pylint skip removed + bool cast
tests/test_*.py 8 docstrings + type hints + use_local_location=True + ASPY family fix
tests/csv/ + tests/json/ snapshots 67 regen with new JSON formatting + ASPY family fix
pyproject.toml + poetry.lock 2 markers + testpaths + strict-markers + lock refresh
database/etfs.csv 1 ASYMshsare → ASYMshares

Compression artifacts are NOT committed. They are deterministically derived from the CSVs by the conftest top-level call, regenerated on every test run, and restored to their pre-test state by pytest_sessionfinish.

What this PR explicitly does NOT do (deferred to focused follow-ups)

  • @pytest.mark.parametrize / class-based grouping — would change pytest node IDs and break Recorder.capture() positional snapshot naming (renames 90+ snapshot files in one go).
  • pytest-xdist parallel execution — adds a new dev dep; pure perf win.
  • Wiring pytest-cov (already in dev deps) into CI for coverage reporting; current overall coverage is 78%.
  • Filling the 22% coverage gap on financedatabase/helpers.py (FinanceFrame.to_toolkit, show_options URL error path, case_sensitive=True branch, __init__ RequestException path).
  • Adding behavioural asserts alongside snapshots (assert not df.empty, assert "country" in df.columns) — currently all assertions are snapshot-based; data regressions returning empty DataFrames would still pass.

Test plan

  • pytest tests/ — all 32 tests pass
  • After pytest tests/, git status shows no modified files (compression artifacts restored)
  • black --check tests/ — clean
  • CI categorization + markdown linters pass
  • Spot-check a JSON snapshot (tests/json/test_equities/test_show_options_1.json) is one-entry-per-line
  • Spot-check that PRs modifying database/*.csv would now exercise the new data through the library (the whole point of (1))

@dokson dokson marked this pull request as draft May 18, 2026 16:09
@dokson dokson force-pushed the feature/pretty-print-json-snapshots branch 2 times, most recently from d61e9c5 to 89e95f4 Compare May 18, 2026 16:15
@dokson dokson changed the title Pretty-print JSON test snapshots for readable diffs Baseline test improvements: pretty-print snapshots, unified-diff failures, docstrings May 18, 2026
@dokson dokson force-pushed the feature/pretty-print-json-snapshots branch 4 times, most recently from 5eefd27 to 7c998fe Compare May 18, 2026 22:38
@dokson dokson changed the title Baseline test improvements: pretty-print snapshots, unified-diff failures, docstrings Baseline test improvements: snapshots, local data, docstrings, infra cleanup May 18, 2026
@dokson dokson force-pushed the feature/pretty-print-json-snapshots branch 2 times, most recently from 7220239 to 246f904 Compare May 18, 2026 22:46
…ures, docstrings, local data

Sets a cleaner baseline for the test suite so future data and code PRs
are easier to develop, review, and triage. All changes confined to
`tests/` (plus a one-character typo fix in `database/etfs.csv`).
**All 32 tests pass.**

## Snapshot diffs that actually tell you what changed

The `test_show_options*.json` snapshots are JSON arrays of category labels (countries, sectors, currencies, …) — single-line strings of 1-2 KB. When a database update added/removed one entry, the GitHub diff was one giant string change with no way to see what differed.

```diff
- string_value = json.dumps(data, **kwargs)
+ string_value = json.dumps(data, indent=2, ensure_ascii=False, **kwargs)
```

`ensure_ascii=False` also means non-ASCII characters render natively (`Côte d'Ivoire` instead of escape sequences).

## Unified-diff on snapshot mismatch

`Recorder.assert_equal()` previously raised a `AssertionError: Change detected` with two truncated 500-char strings, often appearing identical in the visible part even when they actually differed later. We hit this exact issue while developing JerBouma#141 and JerBouma#143.

Now `assert_equal()` produces a proper `difflib.unified_diff` showing which line(s) changed; `assert_in_list` gained a proper assertion message.

## Tests now read the PR branch's data, not main

The library defaults to fetching `compression/*` from the GitHub `main` branch over HTTP. This meant tests on a PR validated the data on `main`, not the data in the PR — silently passing data-breaking PRs and failing on data-fixing PRs.

Two cooperating fixes:
- Every `tests/test_*.py` now instantiates with `use_local_location=True` so the library reads the local checkout.
- `tests/conftest.py` regenerates `compression/*.bz2` and `compression/categories/*.gzip` from the checked-out `database/*.csv` at import time, mirroring the production `database_update.yml` workflow. This must run *before* pytest collects the test modules (test-module imports instantiate `fd.X(use_local_location=True)`), so it is a top-level statement, not a fixture.

The compression artifacts themselves are not committed in this PR — they are deterministically derived from the CSVs and the test suite regenerates them on every run.

## Test module quality (incidental cleanup)

Since the regen touched every test file, took the opportunity to:
- Fix wrong module docstrings (`test_equities.py` and `test_moneymarkets.py` both started with `"""Currencies Test Module"""`).
- Replace `# pylint: disable=missing-function-docstring` with real docstrings on every test function in the 7 asset-class files and in `test_sec_enrichment_controller.py`.
- Add type hints `(recorder: Recorder) -> None`; `Recorder` import under `TYPE_CHECKING`, guarded by `from __future__ import annotations` — zero runtime cost.

## conftest.py cleanup

- Removed `# pylint: skip-file` so static analysis is no longer suppressed.
- Cast `request.config.getoption("--rewrite-expected")` return through `bool(...)` to satisfy the declared `-> bool` annotation (previously `Any | None`).

## Test infrastructure config (pyproject.toml)

- Dropped `pytest-recording` from dev deps — unused; the real dep used by `conftest.py` is `pytest-recorder` (which provides the `record_mode` / `disable_recording` fixtures).
- Declared `[tool.pytest.ini_options]` `testpaths = ["tests"]`, `addopts = "--strict-markers"`, and the `record_stdout` marker so future PRs can't silently introduce undeclared markers.

## Side fix: ASYMshsare → ASYMshares (etfs.csv + tests)

While running the new local-data tests we noticed `ASPY` (Leverage Shares ASYMmetric 500 ETF) had `family = "ASYMshsare"` — a typo in `database/etfs.csv`. Corrected to `ASYMshares` along with the references in `tests/test_etfs.py` and the affected fixture snapshots. The maintainer's `Update Compression Files` workflow will regenerate the binary artifacts on `main` post-merge.

## What this PR explicitly does NOT do

Deferred to focused follow-ups (out of scope here):
- `@pytest.mark.parametrize` / class-based grouping — would change pytest node IDs and break `Recorder.capture()` positional snapshot naming (renames 90+ files in one go).
- `pytest-xdist` parallel execution — adds a new dev dep; pure perf win.
- Wiring `pytest-cov` (already in dev deps) into CI for coverage reporting.
- Filling the 22% coverage gap on `financedatabase/helpers.py` (`FinanceFrame.to_toolkit`, `show_options` URL error path, `case_sensitive=True`).
@dokson dokson force-pushed the feature/pretty-print-json-snapshots branch from 246f904 to 4438a46 Compare May 18, 2026 22:47
@dokson dokson marked this pull request as ready for review May 18, 2026 22:49
@JerBouma
Copy link
Copy Markdown
Owner

Understand the changes, fine by me!

@JerBouma JerBouma merged commit 0771719 into JerBouma:main May 19, 2026
5 checks passed
@dokson dokson deleted the feature/pretty-print-json-snapshots branch May 19, 2026 06:45
JerBouma pushed a commit that referenced this pull request May 19, 2026
…age + cov in CI (#146)

Closes the test-infra follow-ups deferred in #140. All four items below
land in this single PR. Coverage 78% -> 92%, tests 32 -> 49.

## 1. `test_exchange_market_one_to_one` now uses the library

After #140, `tests/test_equities.py` instantiates
`fd.Equities(use_local_location=True)` and the conftest regenerates
compression from CSV at import time. The library output is therefore
in sync with `database/equities.csv` for the test session, so the
consistency invariant can be expressed in the same idiom as the
surrounding tests:

```diff
-    import pandas as pd
-    from pathlib import Path
-    csv_path = Path(__file__).resolve().parents[1] / "database" / "equities.csv"
-    df = pd.read_csv(csv_path, dtype=str)
+    df = equities.select()
```

## 2. helpers.py / asset-class coverage gap (was 45% on helpers.py -> 86%)

Added small targeted tests for the previously-uncovered branches found
by `pytest --cov=financedatabase --cov-report=term-missing`. All pure
asserts (no recorder), following the `test_sec_enrichment_controller.py`
precedent.

In `tests/test_equities.py`:

| New test | Branch covered |
|---|---|
| `test_search_with_list_of_index` | `search(index=[…])` list-membership |
| `test_search_case_sensitive` | `search(case_sensitive=True)` |
| `test_search_case_sensitive_with_list` | list + case-sensitive combo (helpers.py:124-141) |
| `test_search_invalid_column_is_ignored` | `key not in data_filter.columns` warning |
| `test_to_toolkit_raises_without_financetoolkit` | `to_toolkit` ImportError path (mocked) |
| `test_to_toolkit_success_path` | `to_toolkit` success path (mocked Toolkit, helpers.py:236-265) |
| `test_init_raises_on_request_failure` | `__init__` `RequestException` path (mocked) |
| `test_module_show_options_raises_on_invalid_selection` | module-level `show_options` `ValueError` paths |
| `test_module_show_options_raises_on_request_failure` | module-level `show_options` network failure (mocked) |
| `test_module_show_options_local_location` | module-level `show_options(use_local_location=True)` (helpers.py:320, 336-341) |

In every asset-class `tests/test_<asset>.py` (cryptos, currencies,
equities, etfs, funds, indices, moneymarkets):

| New test | Branch covered |
|---|---|
| `test_select_with_invalid_value_raises` | `select(<filter>=...)` ValueError for unknown values — covers the validate-and-raise blocks unique to each asset class (e.g. `ETFs.py:83, 96, 109, 115-126`) |

The asset-class test exercises every filter column declared on that
class's `select()` signature, so adding a new filter automatically
needs a value validation when the test is updated.

## 3. Behavioural smoke assert in every asset-class `test_select`

Snapshot tests pass even when the library returns an empty DataFrame
(the snapshot, regenerated under the same broken state, would just be
empty too — and the test would pass silently). One cheap assertion per
file catches that class of regression:

```python
smoke = equities.select()
assert not smoke.empty
assert "country" in smoke.columns
```

Applied to all 7 asset-class test files.

## 4. Coverage reporting in CI

`pytest-cov` was already in `[tool.poetry.group.dev.dependencies]` but
wasn't invoked by the workflow. Wired it in:

```diff
-                    poetry run pytest tests/
+                    poetry run pytest tests/ --cov=financedatabase --cov-report=term-missing
```

Coverage now appears in the CI log; gaps become visible without local
runs. No Codecov upload — terminal report is enough for review-time
signal; uploading to a third party would add a token + service
dependency and can be a focused follow-up.

## Coverage breakdown (before -> after)

| File | Before | After |
|---|---:|---:|
| `helpers.py` | 45% | **86%** |
| `ETFs.py` | 73% | 90% |
| `Equities.py` | 88% | 94% |
| `Funds.py` | 83% | 92% |
| `Indices.py` | 89% | 93% |
| `Cryptos.py` / `Currencies.py` / `Moneymarkets.py` | 90% | 97% |
| **Total** | **78%** | **92%** |

## What this PR deliberately does NOT do (still deferred)

- `pytest-xdist` parallel execution. Tried it (`-n auto`); the
  conftest's compression-regen at import time runs once per worker,
  racing on the same files. For a ~25s suite the per-worker import
  overhead actually slows the wall time. The fix is non-trivial
  (coordinate regen across workers via a lock file) and not worth doing
  for a suite this small.
- `@pytest.mark.parametrize` / class-based grouping — would change
  pytest node IDs and break the snapshot positional naming convention.

## Test plan

- [ ] `pytest tests/` — 49 tests pass (was 32 before this PR)
- [ ] CI coverage line appears in the run log
- [ ] `black --check tests/` clean
- [ ] No snapshot files modified — only test code, config, and the
      `.github/workflows/testing.yml` step
dokson added a commit to dokson/FinanceDatabase that referenced this pull request May 19, 2026
…ME stats

Per maintainer feedback on the initial version, this PR drops the
`country` column addition for ETFs/Funds — the country semantics for
those asset classes (investment scope, not listing-exchange country)
need a richer signal than `exchange -> country` can give. What
remains is still substantial:

## 1. Data-quality fixes on `etfs.csv` (surfaced by the new invariant)

- **14 non-ETF rows removed**: `^REIT` plus 13 US-equity duplicates
  that already existed correctly in `equities.csv` (`BHF`, `BHFAN`,
  `BHFAO`, `BHFAP`, `DTB`, `DTE`, `DTP`, `HTGC`, `PBC`, `PSEC`, `RGA`,
  `RZA`, `TPVG`).
- **56 cross-asset symbol collisions removed** — all 56 were corporate
  bonds / senior notes / equity share-class rows misclassified as ETFs
  (Brighthouse, Corvus Gold, Great Ajax Corp. notes, Argo notes,
  CMS Energy junior subordinated notes, Conifer Holdings senior notes,
  Qwest Corp notes, DTE Energy variants, `ASGI` = Aberdeen Global
  Infrastructure Income Fund, etc.).
- **29 ETF rows with corrupted `exchange` values fixed** — issuer name
  written into the exchange column instead of a real exchange code
  (`Xtrackers`, `Fundlogic`, `Purpose Investments`, `CI Investments`,
  `Horizons ETFs Management`, `Harvest Portfolios Group`,
  `IA Clarington Investments`, `National Bank Investments`,
  `Caldwell Investment Management`, `Developed Markets`,
  `Emerging Markets`, `High Yield Bonds`). Re-derived from ticker
  suffix.
- **`FSST` completed** (Fidelity Sustainable U.S. Equity ETF) — row
  previously had every field NaN. Filled name, currency, summary,
  category_group, category, family, exchange (`PCX`), isin.

## 2. New cross-asset invariant test

`tests/test_invariants.py::test_no_symbol_collisions_across_asset_classes`
asserts that a given `symbol` belongs to at most one of `equities.csv`,
`etfs.csv`, `funds.csv`. This is the test that surfaced the 56 + 14 = 70
cleanup rows above; it would have caught all of them before they
landed on `main`.

## 3. Automated README statistics

`.github/workflows/database_update.yml` gains a new
`Update-README-Statistics` job that runs after the existing
`Add-New-Ticker` / `Update-Compression-Files` / `Update-Categorization-Files`
jobs. It recomputes the statistics tables from the on-disk CSVs and
rewrites `README.md` in-place, committing the result. The
`Check-GICS-Categorisation` job now also depends on it.

README is now split into three tables that reflect the actual schema
of each asset class rather than the previous combined-table layout
whose `Countries` column for ETFs/Funds had no backing data:

```
Table A — Equities only (has country, sector, industry)
| Product  | Quantity | Sectors | Industries | Countries | Exchanges |
| Equities | 160.113  | 11      | 62         | 113       | 84        |

Table B — ETFs / Funds (no country: no schema for it)
| Product | Quantity | Families | Categories | Exchanges |
| ETFs    | 36.485   | 320      | 51         | 51        |
| Funds   | 57.853   | 1.540    | 74         | 33        |

Table C — Currencies / Cryptos / Indices / Money Markets
| Product           | Quantity | Category              |
| Currencies        | 2.556    | 175 Currencies        |
| Cryptocurrencies  | 3.367    | 351 Cryptocurrencies  |
| Indices           | 91.178   | 63 Exchanges          |
| Money Markets     | 1.367    | 2 Exchanges           |
```

The B-table columns are renamed from the previous "Sectors / Industries"
(which were really family/category counts) to be honest about what the
numbers represent.

## What this PR explicitly does NOT do

- **No `country` column on `etfs.csv` / `funds.csv`.** Per
  @JerBouma's feedback, "country" for ETFs/Funds should reflect the
  investment scope (e.g. iShares MSCI World ETF listed on NYSE has
  global scope, not "United States"), not the listing exchange. The
  obvious deterministic source (exchange -> country) doesn't fit that
  semantics. Deferred to a focused discussion / future PR if the
  community decides on a definition.
- **No country fill on missing equities.** 80k equity rows still have
  empty `country`; filling them from listing exchange would be wrong
  for the same reason ASML on Nasdaq stays "Netherlands". Needs an
  actual issuer-domicile source.

## Test plan

- [ ] `pytest tests/` — 50 tests pass (was 49 before; the new
      `test_no_symbol_collisions_across_asset_classes` is the +1)
- [ ] `test_no_symbol_collisions_across_asset_classes` passes
- [ ] `black --check tests/ financedatabase/` clean
- [ ] CI `Update-README-Statistics` job rewrites README correctly on
      next data update
- [ ] Spot-check: `BHF`, `DTE`, `RGA`, etc. exist in `equities.csv`
      only (not in `etfs.csv` anymore)
- [ ] Spot-check: `FSST` has all fields populated

Related: JerBouma#140 (test infra), JerBouma#143 (introduced the `exchange -> market`
invariant pattern), JerBouma#144 (proposes splitting CSVs by exchange).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants