fix(complexity): count what the visitor claims to count #6

Open
GrigoryEvko wants to merge 2 commits into FusionBrainLab:main from GrigoryEvko:fix/complexity-visitor-coverage

@GrigoryEvko
NumericalComplexityVisitor had three correctness gaps: (a) total_nodes was the sum of 7 specific visitor counts and under-reported by ~6×; the fix takes the number of nodes yielded by ast.walk(tree), i.e. len(list(ast.walk(tree))), since ast.walk returns a generator; (b) visit_Match and visit_AsyncFor were missing, so programs using match or async for reported condition_count=0 / loop_count=0; (c) comprehensions, IfExp, Lambda, try/except, and except* were not counted at all.

Reproducer against current main:

from gigaevo.programs.stages.complexity import compute_numerical_complexity

print(compute_numerical_complexity("def f(): x=1+2; return x")["total_nodes"])
# before: 2;  after: 13

r = compute_numerical_complexity("[v for v in xs if v > 0]")
print(r["loop_count"], r["condition_count"])
# before: 0 0;  after: 1 1
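The "after: 13" figure can be checked without the project installed. The following is a minimal sketch using only the standard library; it reproduces the node count for the first snippet and does not call `compute_numerical_complexity` itself.

```python
import ast

source = "def f(): x=1+2; return x"
tree = ast.parse(source)

# ast.walk returns a generator, so materialise it before taking len().
nodes = list(ast.walk(tree))
total_nodes = len(nodes)

print(total_nodes)  # 13
# The node types involved, including expression contexts such as Load/Store.
print(sorted({type(node).__name__ for node in nodes}))
```

The count includes every AST node that `ast.walk` yields, so context nodes (`Load`, `Store`) and operator nodes (`Add`) are part of the 13.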

Fix: take total_nodes from the ast.walk(tree) node count (len(list(ast.walk(tree))), since ast.walk returns a generator); add visit_Match (each case is a condition), visit_AsyncFor (loop), comprehension visitors (each generator = loop, each if clause = condition), and visit_IfExp / visit_Lambda / visit_Try / visit_TryStar. Verified against an independent ast.walk-based reference on 9 small samples and 7 large production files from torch + transformers (4k–10k LOC); all five counts match across both implementations. One pre-existing test asserted the buggy sum-of-counts as expected output and is updated.
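The added visitor methods could look roughly like the sketch below. This is not the code from this PR: `CoverageSketchVisitor`, `function_count`, and the `count` helper are hypothetical names, and routing `except` handlers into `condition_count` and `Lambda` into a function counter is an assumption based on the description. It also uses a single `visit_comprehension` method rather than one method per comprehension type, which is one possible way to count each generator once.

```python
import ast


class CoverageSketchVisitor(ast.NodeVisitor):
    """Hypothetical visitor; not the project's NumericalComplexityVisitor."""

    def __init__(self):
        self.loop_count = 0
        self.condition_count = 0
        self.function_count = 0  # assumed counter name

    def visit_AsyncFor(self, node):
        # `async for` is counted as a loop, like for/while.
        self.loop_count += 1
        self.generic_visit(node)

    def visit_Match(self, node):
        # Each `case` branch is one condition (needs Python 3.10+ to parse).
        self.condition_count += len(node.cases)
        self.generic_visit(node)

    def visit_comprehension(self, node):
        # One `for` generator is one loop; each `if` clause is one condition.
        self.loop_count += 1
        self.condition_count += len(node.ifs)
        self.generic_visit(node)

    def visit_IfExp(self, node):
        # Ternary `a if b else c` is a branch decision.
        self.condition_count += 1
        self.generic_visit(node)

    def visit_Lambda(self, node):
        # A lambda is an anonymous function definition.
        self.function_count += 1
        self.generic_visit(node)

    def visit_Try(self, node):
        # Each `except` handler is a branch path (assumed to be a condition).
        self.condition_count += len(node.handlers)
        self.generic_visit(node)

    # `try` / `except*` (Python 3.11+) has the same `handlers` field.
    visit_TryStar = visit_Try


def count(source):
    """Return (loop_count, condition_count, function_count) for `source`."""
    visitor = CoverageSketchVisitor()
    visitor.visit(ast.parse(source))
    return visitor.loop_count, visitor.condition_count, visitor.function_count


print(count("[v for v in xs if v > 0]"))  # (1, 1, 0)
```

Every method calls `generic_visit` so nested constructs, such as a ternary inside a lambda, are still reached.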

…cFor

total_nodes was computed as the sum of 7 specific visitor counts, severely under-reporting (`def f(): x=1+2; return x` returned 2 vs the real 13). Use `len(list(ast.walk(tree)))` instead (ast.walk returns a generator, so it has no len() of its own); the value was already computed above for entropy.

Add visit_Match (each case branch is a discrete condition, mirroring if/elif) and visit_AsyncFor (counted as a loop, mirroring for/while). Without these, modern Python code reported condition_count=0 / loop_count=0.

One pre-existing test asserted the buggy sum-of-counts as expected output and is updated.

Comprehensions (list/set/dict/generator) are loops in disguise: each `for` generator contributes a loop and each `if` clause contributes a condition. `IfExp` (ternary `a if b else c`) is a branch decision. `Lambda` is an anonymous function definition. Each `except` handler in `try` / `except*` is a branch path.

Verified against an independent ast.walk-based reference on 9 small samples and 7 production files from torch + transformers (4k-10k LOC each); all five counts match across both implementations.
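
An ast.walk-based cross-check of the kind mentioned above might be sketched as follows. The actual reference used for verification is not shown in this PR, so `reference_counts` and its choice of which node types map to which counter are assumptions; it covers only loops and conditions, not all five counts.

```python
import ast

# ast.match_case exists only on Python 3.10+; an empty tuple makes the
# isinstance check below always False on older interpreters.
MATCH_CASE = getattr(ast, "match_case", ())


def reference_counts(source):
    """Hypothetical flat counter: one pass over ast.walk, no visitor methods."""
    loops = 0
    conditions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            loops += 1
        elif isinstance(node, ast.comprehension):
            loops += 1
            conditions += len(node.ifs)
        elif isinstance(node, (ast.If, ast.IfExp, ast.ExceptHandler)):
            # An `elif` is a nested ast.If, so it is counted here as well.
            conditions += 1
        elif isinstance(node, MATCH_CASE):
            conditions += 1
    return {"loop_count": loops, "condition_count": conditions}


print(reference_counts("[v for v in xs if v > 0]"))
# {'loop_count': 1, 'condition_count': 1}
```

Because this version shares no traversal logic with a `NodeVisitor` subclass, a mismatch between the two on the same file points at a missing or double-counting visitor method.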