Refactor to improve performance/torch compile compatibility #2

Merged: LemonPi merged 3 commits into master from refactor/torch-compile on Mar 10, 2026

@LemonPi commented Mar 10, 2026

No description provided.

LemonPi and others added 3 commits March 10, 2026 13:30
- Add 15 new tests across math, linalg, tensor_utils, preprocessor, softknn
- Add torch.compile verification tests (test_compile.py) tracking compilability
- Add benchmark script (benchmarks/bench_compile.py) with eager + compile timing
- Add pytest-benchmark to test dependencies
- Replace commented-out softknn tests with proper assertion-based tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- replace_nan_and_inf: use torch.nan_to_num (~3.9x faster)
- angular_diff_batch: use modulo wrapping (~1.4x faster, fixes correctness for large diffs)
- angle_between_stable: use broadcasting instead of .repeat() (~1.15x faster)
- sqrtm: replace removed np.float_ alias with np.float64
- bench_compile: remove lambda wrappers that broke torch.compile tracing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
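For readers unfamiliar with the modulo wrapping mentioned for angular_diff_batch, here is a minimal scalar sketch in plain Python. It is not the library's code: the name `wrap_angle_diff` and the [-pi, pi) output convention are assumptions made for illustration, and the real function operates on batched tensors.

```python
import math


def wrap_angle_diff(a: float, b: float) -> float:
    """Signed difference a - b wrapped into [-pi, pi) (assumed convention)."""
    two_pi = 2.0 * math.pi
    # Shift by pi, reduce modulo one full turn, then shift back. For a positive
    # modulus Python's % returns a value in [0, 2*pi), so any number of extra
    # turns in (a - b) is removed in a single step.
    return (a - b + math.pi) % two_pi - math.pi


# Hypothetical inputs: one pair straddling the 0 / 2*pi seam, one several turns apart.
near_seam = wrap_angle_diff(0.1, 2.0 * math.pi - 0.1)    # about 0.2
many_turns = wrap_angle_diff(10.0 * math.pi + 0.3, 0.0)  # about 0.3
```

Because the reduction is a single modulo rather than a fixed number of corrections, differences spanning many full turns are handled the same way as small ones. A tensor version could presumably use `torch.remainder`, which follows the same sign convention as Python's `%`.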
- kronecker_product: replace manual implementation with torch.kron
- GELS: replace deprecated torch.cholesky with torch.linalg.cholesky
- Lookahead optimizer: fix deprecated add_(scalar, tensor) signature
- ls_cov: use torch.linalg.lstsq for params, torch.linalg.solve
  instead of explicit .inverse() for better numerical stability
- StandardScaler: precompute reciprocal to multiply instead of divide
- Bump version 0.4.3 → 0.5.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
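Three of the replacements in this commit can be illustrated on small made-up tensors. The sketch below is not taken from the PR diff: the matrices, variable names, and the use of float64 are assumptions. It only checks that `torch.linalg.solve` agrees with an explicit inverse on a well-conditioned system, that `torch.kron` produces the expected shape, and that multiplying by a precomputed reciprocal matches division.

```python
import torch

torch.manual_seed(0)

# Hypothetical well-conditioned system A x = b (A is symmetric positive definite).
M = torch.randn(4, 4, dtype=torch.float64)
A = M @ M.T + 4.0 * torch.eye(4, dtype=torch.float64)
b = torch.randn(4, 2, dtype=torch.float64)

x_inverse = A.inverse() @ b         # forms the explicit inverse, then multiplies
x_solve = torch.linalg.solve(A, b)  # factorizes A and solves directly

# torch.kron computes the Kronecker product; here a block-diagonal 8x8 matrix.
K = torch.kron(torch.eye(2, dtype=torch.float64), A)

# Scaling with a reciprocal computed once instead of dividing on every call.
data = torch.randn(100, 3, dtype=torch.float64)
mean, std = data.mean(dim=0), data.std(dim=0)
inv_std = 1.0 / std
scaled = (data - mean) * inv_std
```

On a well-conditioned matrix like this one the two solutions agree closely; the stability benefit of solving directly shows up mainly when the matrix is ill-conditioned.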
@LemonPi merged commit de16d62 into master Mar 10, 2026
3 checks passed
@LemonPi deleted the refactor/torch-compile branch March 10, 2026 22:03