Cache expensive introspection calls in pyi_generator for faster stub generation #6187

Merged
adhami3310 merged 5 commits into reflex-dev:main from FarhanAliRaza:optimize-pyi-gen on Mar 18, 2026

Conversation

@FarhanAliRaza
Collaborator

@FarhanAliRaza FarhanAliRaza commented Mar 17, 2026

Extract repeated inspect.getsource, getfullargspec, inspect.signature, and module import calls into @cache-decorated helper functions. Replace inline exec() calls with explicit dict updates for resolving type hints. Convert all_props from list to set for O(1) membership checks.
/usr/bin/time -p uv run python3 -m reflex.utils.pyi_generator
Before:
8 files reformatted, 112 files left unchanged
real 4.12
user 5.37
sys 0.33
After:
8 files reformatted, 112 files left unchanged
real 1.93
user 3.00
sys 0.32

After multiprocess forking:
117 files reformatted, 3 files left unchanged
real 1.57
user 3.86
sys 0.67
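
The caching idea described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `get_source` and `get_full_argspec` are hypothetical, and the standard-library `textwrap` module stands in for the component modules that pyi_generator inspects.

```python
import inspect
import textwrap
from functools import cache


@cache
def get_source(obj) -> str:
    """Return the source of a module, class, or function, reading it only once."""
    return inspect.getsource(obj)


@cache
def get_full_argspec(func) -> inspect.FullArgSpec:
    """Return the argument spec of a callable, computing it only once."""
    return inspect.getfullargspec(func)


# Repeated lookups for the same object are answered from the cache,
# so the file read and parsing work happen a single time per object.
for _ in range(3):
    get_source(textwrap)
    get_full_argspec(textwrap.dedent)

print(get_source.cache_info())
```

`functools.cache` keys on the arguments, so this only works for hashable inputs such as modules, classes, and functions, and the cached results live for the lifetime of the process.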

All Submissions:

  • Have you followed the guidelines stated in CONTRIBUTING.md file?
  • Have you checked to ensure there aren't any other open Pull Requests for the desired change?

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

New Feature Submission:

  • Does your submission pass the tests?
  • Have you linted your code locally prior to submission?

Changes To Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you successfully run tests with your changes locally?

closes #6183

@FarhanAliRaza FarhanAliRaza marked this pull request as draft March 17, 2026 18:26
@codspeed-hq

codspeed-hq bot commented Mar 17, 2026

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing FarhanAliRaza:optimize-pyi-gen (040a0d7) with main (fe34ae8)


@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR optimizes pyi_generator.py by caching expensive introspection calls and introducing optional fork-based parallelism, cutting stub generation time roughly in half (4.1s → 1.9s).

Key changes:

  • Introduces eight @cache-decorated helper functions (_get_source, _get_class_prop_comments, _get_full_argspec, _get_signature_return_annotation, _get_module_star_imports, _get_module_selected_imports, _get_class_annotation_globals, _get_class_event_triggers) to memoize repeated inspect.* / importlib.import_module calls.
  • Replaces exec(f"from X import *", type_hint_globals) with explicit type_hint_globals.update(_get_module_star_imports(...)) calls, eliminating dynamic code execution.
  • Returns MappingProxyType / frozenset from cached helpers so the cache objects cannot be accidentally mutated by callers.
  • Converts all_props from list to set for O(1) membership checks.
  • Introduces ProcessPoolExecutor with a fork context on platforms that support it, pre-importing all target modules in the main process so workers inherit a warm sys.modules cache.
  • Promotes _write_pyi_file, _get_init_lazy_imports, and _scan_file from PyiGenerator methods to module-level functions (required so they can be pickled for use in worker processes).
  • Removes the dead if True: ... else: _scan_files_multiprocess(...) branch and the redundant first ruff format invocation.
  • _get_class_event_triggers is newly cached and used in _extract_class_props_as_ast_nodes, but _generate_component_create_functiondef still calls clz.get_event_triggers() directly — a minor inconsistency.
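
The exec() replacement listed above can be sketched roughly like this. It is an illustration under assumptions rather than the PR's implementation: `star_imports` is a hypothetical stand-in for `_get_module_star_imports`, its handling of `__all__` is a simplified approximation of `from X import *`, and `collections` is used only as a convenient module to import from.

```python
import importlib
from functools import cache
from types import MappingProxyType


@cache
def star_imports(module_name: str) -> MappingProxyType:
    """Approximate the names that `from module_name import *` would bind."""
    module = importlib.import_module(module_name)
    names = getattr(module, "__all__", None)
    if names is None:
        # Without __all__, a star import binds every public (non-underscore) name.
        names = [name for name in vars(module) if not name.startswith("_")]
    # A read-only view keeps callers from mutating the cached mapping.
    return MappingProxyType({name: getattr(module, name) for name in names})


type_hint_globals: dict[str, object] = {}

# Before (dynamic code execution on every call):
#     exec("from collections import *", type_hint_globals)
# After (explicit update from a cached, read-only mapping):
type_hint_globals.update(star_imports("collections"))
```

The explicit update avoids compiling and running a code string each time, and the result for a given module is computed once and reused.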

Confidence Score: 4/5

  • Safe to merge — changes are confined to a developer-tooling script with no runtime impact on the library itself.
  • All logic changes are equivalent rewrites of the original code (exec → dict.update, list → set, method → module-level function). The caching is safe because immutable types are returned. The parallel path is guarded by a fork availability check. The only substantive issues are a missed cache usage in _generate_component_create_functiondef and a missing docstring, both of which are style/optimization concerns rather than correctness bugs.
  • No files require special attention beyond the minor inconsistency noted in _generate_component_create_functiondef.
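
The fork availability guard mentioned above could look something like the sketch below. The function name `scan_files` and its parameters are assumptions for illustration, not the PR's actual signature; only the standard-library calls are real.

```python
import importlib
import multiprocessing
from collections.abc import Callable, Sequence
from concurrent.futures import ProcessPoolExecutor


def scan_files(
    paths: Sequence[str],
    worker: Callable[[str], object],
    module_names: Sequence[str] = (),
    max_workers: int = 1,
) -> list[object]:
    """Run `worker` over `paths`, in forked processes when that is possible."""
    if max_workers > 1 and "fork" in multiprocessing.get_all_start_methods():
        # Import in the parent first so forked workers inherit a warm sys.modules.
        for name in module_names:
            importlib.import_module(name)
        fork_context = multiprocessing.get_context("fork")
        with ProcessPoolExecutor(
            max_workers=max_workers, mp_context=fork_context
        ) as pool:
            return list(pool.map(worker, paths))
    # Serial fallback, e.g. on Windows where the fork start method is unavailable.
    return [worker(path) for path in paths]
```

The worker has to be a module-level callable because `ProcessPoolExecutor` pickles it when dispatching work, which is the reason given above for promoting the methods to module-level functions.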

Important Files Changed

| Filename | Overview |
| --- | --- |
| reflex/utils/pyi_generator.py | Major refactor: introduces @cache-decorated helper functions to memoize expensive introspection calls, replaces exec()-based import injection with explicit dict.update() calls, converts all_props to a set, adds fork-based ProcessPoolExecutor parallelism, and promotes several PyiGenerator methods to module-level functions. Logic is generally correct but has a minor inconsistency with an uncached get_event_triggers() call and missing docstring on _write_pyi_file. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[PyiGenerator.scan_all] --> B[_scan_files]
    B --> C{fork available\nand max_workers > 1?}

    C -- No --> D[Serial loop\nover files]
    D --> E[_scan_file per file]

    C -- Yes --> F[Pre-import all modules\nin main process]
    F --> G[ProcessPoolExecutor\nfork context]
    G --> H[_scan_file per file\nin worker processes]

    E --> I{is_init_file?}
    H --> I

    I -- Yes --> J[InitStubGenerator.visit\nast.parse _get_source module]
    I -- No --> K[StubGenerator.visit\nast.parse _get_source module]

    J --> L[_get_init_lazy_imports]
    K --> M[_write_pyi_file]
    L --> M

    subgraph Cached Helpers
        N[_get_source]
        O[_get_full_argspec]
        P[_get_signature_return_annotation]
        Q[_get_module_star_imports]
        R[_get_module_selected_imports]
        S[_get_class_annotation_globals]
        T[_get_class_event_triggers]
        U[_get_class_prop_comments]
        V[_get_parent_imports]
    end

    K -->|uses| N
    K -->|uses| O
    K -->|uses| P
    K -->|uses| Q
    K -->|uses| R
    K -->|uses| S
    K -->|uses| T
    K -->|uses| U
    K -->|uses| V

Last reviewed commit: 040a0d7

Comment on lines +609 to +625
 @cache
 def _get_parent_imports(func: Callable) -> dict[str, tuple[str, ...]]:
     imports_ = {"reflex.vars": {"Var"}}
     module_dir = set(importlib.import_module(func.__module__).__dir__())
     for type_hint in inspect.get_annotations(func).values():
         try:
             match = re.match(r"\w+\[([\w\d]+)\]", type_hint)
         except TypeError:
             continue
         if match:
             type_hint = match.group(1)
-            if type_hint in importlib.import_module(func.__module__).__dir__():
-                imports_.setdefault(func.__module__, []).append(type_hint)
-    return imports_
+            if type_hint in module_dir:
+                imports_.setdefault(func.__module__, set()).add(type_hint)
+    return {
+        module_name: tuple(sorted(imported_names))
+        for module_name, imported_names in imports_.items()
+    }

P2 Return type annotation conflicts with internal implementation

The declared return type is dict[str, tuple[str, ...]] and the final return statement does produce that type. However, the function first builds imports_ with set values:

imports_ = {"reflex.vars": {"Var"}}                    # value is a set
imports_.setdefault(func.__module__, set()).add(...)   # value is a set

A static type checker will infer imports_ as dict[str, set[str]] throughout the function body, so the internal variable type differs from the declared return type until the final dict comprehension converts the sets to tuples. Consider adding an explicit intermediate type annotation to make this clear:

imports_: dict[str, set[str]] = {"reflex.vars": {"Var"}}
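
A self-contained sketch of the suggested pattern follows: accumulate into an explicitly annotated dict of sets, then freeze to sorted tuples on return. The function `collect_imports` and its parameters are hypothetical simplifications; unlike `_get_parent_imports`, it takes the type hints and the module's names as arguments so that it can run on its own.

```python
import re


def collect_imports(
    type_hints: list[object],
    module_name: str,
    module_names: set[str],
) -> dict[str, tuple[str, ...]]:
    """Collect names referenced inside subscripted hints such as 'Var[Color]'."""
    # The explicit annotation documents that values are sets while accumulating.
    imports_: dict[str, set[str]] = {"reflex.vars": {"Var"}}
    for type_hint in type_hints:
        if not isinstance(type_hint, str):
            continue  # only string annotations can be matched by the regex
        match = re.match(r"\w+\[([\w\d]+)\]", type_hint)
        if match and match.group(1) in module_names:
            imports_.setdefault(module_name, set()).add(match.group(1))
    # Convert to immutable, deterministically ordered tuples for the caller.
    return {name: tuple(sorted(names)) for name, names in imports_.items()}


result = collect_imports(
    ["Var[Size]", "int", 3, "Var[Color]"], "my.module", {"Color", "Size"}
)
print(result)
```

Sets deduplicate names cheaply during collection, and the sorted tuples give callers a stable, hashable result.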

Pre-import modules sequentially to populate sys.modules, then fork
worker processes for AST parsing, transformation, and file writing.
Extract _write_pyi_file, _get_init_lazy_imports, and _scan_file to
module-level functions. Remove disabled _scan_files_multiprocess.

Reduces generation time from ~1.9s to ~1.45s (~24% faster).

Cached dicts returned by _get_module_star_imports, _get_module_selected_imports,
_get_class_annotation_globals, and _get_parent_imports are shared across callers.
Wrap them in MappingProxyType to prevent accidental mutation of cached values.
Also remove redundant first ruff format pass.

117 files reformatted, 3 files left unchanged
real 1.57
user 3.93
sys 0.67
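
The mutation hazard that this commit message describes can be shown with a small sketch. The helper names `unsafe_globals` and `safe_globals` are hypothetical and exist only for this illustration.

```python
from functools import cache
from types import MappingProxyType


@cache
def unsafe_globals(name: str) -> dict[str, object]:
    """Cached helper that hands out the same mutable dict to every caller."""
    return {"origin": name}


@cache
def safe_globals(name: str) -> MappingProxyType:
    """Cached helper that hands out a read-only view instead."""
    return MappingProxyType({"origin": name})


# A caller that mutates the plain dict changes what later callers receive.
unsafe_globals("pkg")["leak"] = True
leaked = "leak" in unsafe_globals("pkg")

# The read-only view rejects item assignment with a TypeError.
try:
    safe_globals("pkg")["leak"] = True
except TypeError:
    blocked = True
else:
    blocked = False

# Callers that need to modify the mapping take their own copy first.
local_copy = dict(safe_globals("pkg"))
local_copy["extra"] = 1
```

Copying into a fresh dict (or calling `update` on an existing one) keeps the cached mapping intact while still letting each caller build its own namespace.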
@FarhanAliRaza
Collaborator Author

With parallelization, generation takes 1.4 seconds on my machine versus 1.9 seconds without it. I don't know if the complexity is worth it, and Windows is not going to support it because fork is not available there and spawn is too slow.

@masenf
Collaborator

masenf commented Mar 17, 2026

i think the half second is worth it, and windows is already slow so...

@FarhanAliRaza
Collaborator Author

[image]

Now everything is somewhat balanced.

@FarhanAliRaza FarhanAliRaza marked this pull request as ready for review March 17, 2026 20:45
Collaborator

@masenf masenf left a comment

Looks good to me!

@adhami3310 adhami3310 merged commit d45a1bb into reflex-dev:main Mar 18, 2026
47 checks passed
Development

Successfully merging this pull request may close these issues.

Investigate and improve pyi_generator run time

3 participants