Contributing to Elvers

Pre-1.0 development. API is stabilizing but may change between minor versions. Operator coverage and numerical conventions are under active refinement.


Getting Started

# Fork on GitHub, then:
git clone https://github.com/<your-username>/elvers.git
cd elvers
git remote add upstream https://github.com/quantbai/elvers.git
git checkout dev
pip install -e ".[dev]"
pre-commit install

Verify:

pytest tests/ -v
ruff check elvers/
pyright elvers/

Workflow

Branches

main (protected)  -- tagged releases only
  dev             -- integration, CI must pass
    feature/XXX   -- new operators or features
    fix/XXX       -- bug fixes

All changes enter through pull requests to dev.

Development Cycle

git checkout dev && git pull upstream dev       # 1. sync
git checkout -b feature/ts-entropy              # 2. branch
# ... implement ...
ruff check elvers/ && pyright elvers/ && pytest tests/ -v  # 3. verify
git add <specific files>                        # 4. stage (never git add -A)
git commit -m "feat(ops): add ts_entropy"       # 5. commit
git push -u origin feature/ts-entropy           # 6. push
# 7. open PR -> dev

Adding a New Operator

1. Choose the Module

| Prefix | File | Scope |
| --- | --- | --- |
| ts_ | ops/timeseries.py | Per-symbol rolling window |
| (none) | ops/cross_sectional.py | Across symbols per timestamp |
| group_ | ops/neutralization.py | Within groups per timestamp |
| (none) | ops/math.py | Element-wise math |
| (none) | ops/base.py | Arithmetic and structural |

2. Implement

Adhere to the numerical invariants below and follow existing patterns in the target module.

3. Export

Add the operator to elvers/ops/__init__.py, both as an import line and in the __all__ list.

4. Test

Add tests in tests/test_<module>.py:

  • Correctness: verify against expected values (pytest.approx). For complex operators (regression, covariance, decay), cross-validate against NumPy or SciPy on randomized inputs.
  • Null propagation: null inputs produce correct null outputs
  • Edge cases: constant series, all-null, window > data length, zero denominators
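
The three test categories above could look roughly like the sketch below. `ts_mean_ref` is a hypothetical plain-Python stand-in defined inline so the file runs on its own, and its null policy (null whenever the window contains a null) is an assumption, not a statement about any Elvers operator. A real test would import the operator from elvers and would normally use pytest.approx instead of math.isclose.

```python
import math
from typing import Optional, Sequence


def ts_mean_ref(values: Sequence[Optional[float]], window: int) -> list[Optional[float]]:
    """Hypothetical stand-in operator: trailing rolling mean.

    Assumed null policy: null during warmup and whenever the window contains a null.
    """
    out: list[Optional[float]] = []
    for i in range(len(values)):
        chunk = values[max(0, i + 1 - window) : i + 1]
        present = [v for v in chunk if v is not None]
        # Fewer than `window` non-null values covers both warmup and null inputs.
        out.append(sum(present) / window if len(present) == window else None)
    return out


def test_correctness() -> None:
    got = ts_mean_ref([1.0, 2.0, 3.0, 4.0], window=2)
    assert got[0] is None  # warmup: first window - 1 values are null
    for g, e in zip(got[1:], [1.5, 2.5, 3.5]):
        assert g is not None and math.isclose(g, e)


def test_null_propagation() -> None:
    assert ts_mean_ref([1.0, None, 3.0, 4.0], window=2) == [None, None, None, 3.5]


def test_edge_cases() -> None:
    assert ts_mean_ref([5.0, 5.0, 5.0], window=3) == [None, None, 5.0]  # constant series
    assert ts_mean_ref([None, None], window=2) == [None, None]  # all-null
    assert ts_mean_ref([1.0, 2.0], window=5) == [None, None]  # window > data length
```

Keeping one test function per category makes it easy for a reviewer to see that all three are covered.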

5. Document

Add an entry to OPERATORS.md.

6. Pre-PR Checklist

  • ruff check elvers/ passes
  • pyright elvers/ passes
  • pytest tests/ -v passes (full suite)
  • Operator in __init__.py exports and __all__
  • Docstring: description, parameters, return type, null behavior, warmup
  • Divisions handle zero denominators (exact zero check or Inf → null via Factor)
  • No Python loops in computation
  • Tests cover: correctness, null, edge cases
  • OPERATORS.md updated
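
As an illustration of the docstring and zero-denominator items in the checklist, here is a minimal sketch of a cross-sectional z-score. `zscore_ref` is a hypothetical name, and the code is plain Python for readability only; an actual operator would follow the vectorized patterns of its target module rather than use comprehensions. The 0.0 default for a constant series follows the convention stated in this guide.

```python
from statistics import fmean, pstdev
from typing import Optional, Sequence


def zscore_ref(values: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Cross-sectional z-score (hypothetical plain-Python reference).

    Parameters:
        values: one value per symbol; None marks a missing value.
    Returns:
        A list of the same length as `values`.
    Null behavior:
        Null inputs yield null outputs and are excluded from mean and std.
    Warmup:
        None (not a rolling operator).
    """
    present = [v for v in values if v is not None]
    if not present:
        return [None] * len(values)  # all-null input stays all-null
    mu = fmean(present)
    sigma = pstdev(present)  # population std, ddof=0
    if sigma == 0.0:
        # Exact zero check: a constant series maps to 0.0 instead of dividing by zero.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sigma for v in values]
```

The guard compares against exact zero on purpose: near-zero denominators are left to produce large finite values, which is the trade-off this guide describes.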

Numerical Invariants

Elvers maintains the following invariants.

  • NaN and Inf are unified to null. Single missing-value type throughout.
  • Pure division (divide, inverse): no guard. Inf → null via Panel._add_col.
  • Population statistics: ddof=0 for std, variance, covariance.
  • Rank range: (0, 1]. Ties: average. Zero excluded.
  • Rolling warmup: first window - 1 values per symbol are null.
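
The invariants above can be sketched in plain Python. All three function names are hypothetical, and this is not the Elvers implementation. In particular, dividing the average-tie rank by the non-null count is only one normalization that lands in (0, 1]; the authoritative formula is whatever OPERATORS.md specifies.

```python
import math
from statistics import pstdev
from typing import Optional, Sequence


def unify_null(x: Optional[float]) -> Optional[float]:
    """Map NaN and +/-Inf to None so there is a single missing-value type."""
    if x is None or math.isnan(x) or math.isinf(x):
        return None
    return x


def rank_pct(values: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Average-tie rank divided by the non-null count (assumed normalization)."""
    present = [v for v in values if v is not None]
    n = len(present)
    out: list[Optional[float]] = []
    for v in values:
        if v is None:
            out.append(None)
            continue
        below = sum(1 for p in present if p < v)
        equal = sum(1 for p in present if p == v)
        # Mean of the 1-based ranks occupied by the tied block.
        out.append((below + (equal + 1) / 2) / n)
    return out


def rolling_pstd(values: Sequence[float], window: int) -> list[Optional[float]]:
    """Population std (ddof=0) over a trailing window; first window - 1 outputs are null."""
    return [
        None if i + 1 < window else pstdev(values[i + 1 - window : i + 1])
        for i in range(len(values))
    ]
```

With this normalization, `rank_pct([10.0, 20.0, 20.0, None, 30.0])` gives `[0.25, 0.625, 0.625, None, 1.0]`: the smallest value is strictly positive, the tied values share their average rank, and the null stays null.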

Full conventions: OPERATORS.md.

Design Rationale

| Decision | Rationale | Trade-off |
| --- | --- | --- |
| NaN/Inf → null | Eliminates NaN infection (NaN + 1 = NaN). One missing-value type simplifies null-propagation logic. | Loses the distinction between missing data and computation overflow. Acceptable for daily/hourly factor research where Inf indicates a defect, not a signal. May need revisiting for tick-level microstructure data. |
| ddof=0 | Rolling windows observe the full population within the window, not a sample. Avoids n=1 division-by-zero. | ddof=1 would be appropriate for random samples. Deterministic lookbacks are not random samples. |
| Rank (0, 1] | Zero is ambiguous (missing or lowest?). Strictly positive values are distinguishable from null. | Downstream code cannot assume [0, 1] range. |
| Division by zero | Pure divisions (divide, inverse) have no guard; Inf → null via Factor constructor. Statistical operators (zscore, scale, etc.) check for exact zero denominators and return semantic defaults (e.g., 0.0 for constant series). | Near-zero denominators produce large but finite values. Use winsorize/truncate to handle outliers. |
| ts_corr uses ddof=1 internally | Polars rolling_corr(ddof=0) applies ddof only to the covariance numerator, not the variance denominator, producing values outside [-1, 1] (polars#16161). Elvers isolates this by using ddof=1, which is mathematically equivalent for correlation (ddof cancels in the ratio). | Will align to ddof=0 when the upstream issue is resolved. |
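
The claim that ddof cancels in the correlation ratio can be checked with a small plain-Python sketch. The helpers `cov` and `corr` are hypothetical and written only for this illustration. They do not reproduce Polars internals; they only show that a consistent ddof leaves the correlation unchanged, while applying different ddof values to the numerator and the denominator does not.

```python
import math
from typing import Sequence


def cov(x: Sequence[float], y: Sequence[float], ddof: int) -> float:
    """Covariance with an explicit ddof; cov(x, x, ddof) is the variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - ddof)


def corr(x: Sequence[float], y: Sequence[float], cov_ddof: int, var_ddof: int) -> float:
    """Correlation where numerator and denominator may use different ddof values."""
    return cov(x, y, cov_ddof) / math.sqrt(cov(x, x, var_ddof) * cov(y, y, var_ddof))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

consistent_0 = corr(x, y, cov_ddof=0, var_ddof=0)  # 1/n everywhere
consistent_1 = corr(x, y, cov_ddof=1, var_ddof=1)  # 1/(n-1) everywhere
mismatched = corr(x, y, cov_ddof=0, var_ddof=1)    # factors no longer cancel
```

Both consistent variants evaluate to 0.8 for this data, because the 1/(n - ddof) factor appears once in the numerator and once (under the square root of a product) in the denominator. The mismatched variant is scaled by (n - 1)/n and is no longer a correlation.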

Style

  • Ruff for formatting and linting (line-length = 120)
  • Pyright for static type checking (zero errors required in CI)
  • English for all code, comments, commits, and documentation
  • No emoji

Commits

Format: <type>(<scope>): <description>

| Type | Use |
| --- | --- |
| feat | New operator or feature |
| fix | Bug fix |
| refactor | No behavior change |
| test | Tests only |
| perf | Performance |
| docs | Documentation |
| ci | CI/CD |

Numerical changes require a [NUMERICAL] tag in the commit body, followed by a description of the impact.
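
For contributors who want to check a message locally before committing, the format can be expressed as a small validator. This is a hypothetical helper, not part of the repository tooling, and it assumes the scope in parentheses is mandatory, which the format line suggests but does not state outright.

```python
import re

# Types taken from the table in this guide.
COMMIT_TYPES = ("feat", "fix", "refactor", "test", "perf", "docs", "ci")

# <type>(<scope>): <description>  (assumption: scope is required and has no spaces)
SUBJECT_RE = re.compile(r"^(?:" + "|".join(COMMIT_TYPES) + r")\([^()\s]+\): \S.*$")


def check_commit(message: str, numerical_change: bool = False) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    body = "\n".join(lines[1:])
    problems: list[str] = []
    if not SUBJECT_RE.match(subject):
        problems.append("subject must look like <type>(<scope>): <description>")
    if numerical_change and "[NUMERICAL]" not in body:
        problems.append("numerical changes need a [NUMERICAL] tag in the body")
    return problems
```

For example, `check_commit("feat(ops): add ts_entropy")` returns an empty list, while a fix that changes numerical output is flagged unless its body carries the [NUMERICAL] tag.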


Pull Requests

  • One PR = one logical change
  • Target: dev branch
  • CI must pass on Python 3.10, 3.11, 3.12, 3.13

Numerical output changes require a before/after comparison in the PR description.
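
One possible shape for such a comparison is sketched below. `compare_outputs` is a hypothetical helper and the tolerance is an arbitrary choice; it assumes both runs were exported as equal-length sequences with None for nulls.

```python
from typing import Optional, Sequence


def compare_outputs(
    before: Sequence[Optional[float]],
    after: Sequence[Optional[float]],
    tol: float = 1e-12,
) -> dict[str, float]:
    """Summarize how an operator's output changed between two runs."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    # Positions where exactly one side is null.
    null_mismatch = sum((b is None) != (a is None) for b, a in zip(before, after))
    # Absolute differences where both sides have a value.
    diffs = [abs(a - b) for b, a in zip(before, after) if a is not None and b is not None]
    return {
        "n": len(before),
        "null_mismatch": null_mismatch,
        "changed": sum(d > tol for d in diffs),
        "max_abs_diff": max(diffs, default=0.0),
    }
```

Pasting the resulting summary into the PR description gives reviewers the number of changed values, the largest change, and any shift in null positions at a glance.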

Review Criteria

  1. Tests pass and cover the change
  2. Numerical outputs cross-validated (simple operators: known values; complex operators: NumPy/SciPy oracle)
  3. Division-by-zero handled (exact zero check, or Inf → null via Panel for pure divisions)
  4. No implicit Inf-to-null reliance outside pure divisions (divide, inverse)
  5. Operator exported and documented

Versioning

MAJOR.MINOR.PATCH (SemVer).

| Change | Bump |
| --- | --- |
| Bug fix, no numerical change | PATCH |
| Numerical output change | MINOR |
| New operator | MINOR |
| Breaking API change | MAJOR |
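
The table can be read as a lookup from change kind to bump level. The sketch below encodes it that way. The keys and the `next_version` helper are hypothetical names, and resetting the lower components to zero follows the usual SemVer convention rather than anything stated in this guide.

```python
# Change kinds mirror the rows of the table above (key names are hypothetical).
BUMP_FOR_CHANGE = {
    "bug_fix": "patch",           # bug fix, no numerical change
    "numerical_change": "minor",  # numerical output change
    "new_operator": "minor",
    "breaking_api": "major",
}


def next_version(current: str, change: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a given change kind."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = BUMP_FOR_CHANGE[change]
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

Note that a bug fix which changes numerical output counts as a numerical change, so it takes the MINOR row rather than the PATCH row.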