Early Return Pipeline #545
Looking at benchmarks/test_galley.py, this PR solves half of the problem. It allows early return, but it is still inherently complicated to run the first few pipeline stages, intercept in the middle, and then benchmark only the remainder. This is because LogicExecutor does work both before and after it calls its child ctx.
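To make the concern concrete, here is a minimal sketch of the nesting problem. Every name in it (`WrappingStage`, `EarlyReturn`, `intercept`, `backend`, `build`) is a hypothetical stand-in and not the API in this PR or in the library; the only assumption taken from the comment above is that an outer stage does work both before and after calling its child.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-ins: none of these names come from the PR or the library.
Stage = Callable[[Any], Any]


@dataclass
class WrappingStage:
    """A stage that does work both before and after delegating to its child."""

    name: str
    child: Stage
    log: list

    def __call__(self, prgm):
        self.log.append(f"{self.name}:pre")
        result = self.child(f"{self.name}({prgm})")  # hand the rewritten program down
        self.log.append(f"{self.name}:post")
        return result


class EarlyReturn(Exception):
    """Carries the intermediate program out of the nested calls."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def intercept(prgm):
    # Terminal stage used for interception: stop here instead of executing.
    raise EarlyReturn(prgm)


def backend(prgm):
    # Terminal stage used for a normal run.
    return f"run[{prgm}]"


def build(terminal: Stage, log: list) -> Stage:
    # executor wraps optimizer, which wraps the terminal stage.
    return WrappingStage("executor", WrappingStage("optimizer", terminal, log), log)


def capture_intermediate(prgm):
    """Run the leading stages and return what the terminal stage would receive."""
    log = []
    try:
        build(intercept, log)(prgm)
    except EarlyReturn as stop:
        return stop.value, log
    raise RuntimeError("pipeline finished without an early return")
```

In this sketch, capturing the intermediate is easy, but the outer stages' "post" steps never run. Benchmarking only the remainder would mean calling `backend` on the captured value directly, which skips whatever the wrappers do after their child returns; that is the half of the problem the comment describes as still open.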
Merging this PR will degrade performance by 64.71%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Memory | test_galley_matmul_chain[empty_last-optimize] | 54.8 KB | 2,447 KB | -97.76% |
| ❌ | WallTime | test_galley_matmul_chain[empty_last-optimize] | 249.5 ms | 5,048.6 ms | -95.06% |
| ❌ | WallTime | test_ops_binary[interpret_galley-multiply] | 786.2 ms | 928.5 ms | -15.33% |
| ❌ | WallTime | test_ops_binary[interpret_galley-add] | 787 ms | 926 ms | -15.01% |
| ❌ | WallTime | test_ops_reduction[interpret_galley-sum] | 419.5 ms | 485.6 ms | -13.61% |
| ❌ | WallTime | test_ops_binary[interpret_galley-matmul] | 58.9 s | 65.7 s | -10.48% |
| ⚡ | Memory | test_galley_matmul_chain[dense_last-optimize] | 2.6 MB | 2.4 MB | +10.74% |
Comparing kbd-easier-pipeline-return (5216789) with main (64775fa)
This PR adds:
An early return pipeline can then look like, e.g.: