⚡ Bolt: Use pre-computed tuples for static iterations#346
Conversation
…n in hot loop Constructing a collection of bound methods inside `_classify_known_exceptions` incurred overhead from both list allocation and repeated bound method creation. This moves the bound methods out into the `__init__` constructor as a pre-computed tuple to enhance iteration performance. Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Reviewer's guide (collapsed on small PRs)Reviewer's GuidePrecomputes and stores the classifier’s bound classification methods as a tuple during initialization and then reuses it in the hot classification loop, plus documents this performance pattern in the Bolt guidelines. Class diagram for precomputed classification_methods tuple in classifierclassDiagram
class SemanticClassifier {
- _classification_map
- _classification_methods
+ __init__()
+ classify_thing(thing, language)
- _classify_known_exceptions(thing, language)
- _classify_by_can_be_anywhere(thing, language)
- _classify_from_token_purpose(thing, language)
- _classify_multi_tier_things(thing, language)
- _classify_from_category(thing, language)
- _classify_from_direct_connections(thing, language)
- _classify_by_cross_language_lookup(thing, language)
}
class GrammarClassificationResult {
}
SemanticClassifier ..> GrammarClassificationResult : uses
note for SemanticClassifier "_classification_methods = (_classify_known_exceptions, _classify_by_can_be_anywhere, _classify_from_token_purpose, _classify_multi_tier_things, _classify_from_category, _classify_from_direct_connections, _classify_by_cross_language_lookup)"
Flow diagram for classify_thing using precomputed method tupleflowchart TD
A[classify_thing called] --> B[Initialize results list]
B --> C[Iterate over self._classification_methods]
C --> D[Call current method with thing and language]
D --> E{classification value returned?}
E -->|No| C
E -->|Yes| F{Is instance of GrammarClassificationResult?}
F -->|Yes| G[Append classification to results]
F -->|No| H[Unpack classification tuple into results]
G --> C
H --> C
C --> I[Loop ends]
I --> J[Return results]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
🤖 Hi @bashandbone, I've received your request, and I'm working on it now! You can track my progress in the logs for more details. |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Storing bound methods in
__init__means subclasses that override any of the_classify_*methods will keep using the base implementation unless they rebuild_classification_methods; consider either documenting this explicitly or building the tuple dynamically in a way that respects overrides (e.g., via a helper that subclasses can extend). - Since the order of
_classification_methodscontrols classification behavior, it may be worth centralizing that ordering in a single helper or clearly documenting it to reduce the risk of future changes accidentally altering classification precedence.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Storing bound methods in `__init__` means subclasses that override any of the `_classify_*` methods will keep using the base implementation unless they rebuild `_classification_methods`; consider either documenting this explicitly or building the tuple dynamically in a way that respects overrides (e.g., via a helper that subclasses can extend).
- Since the order of `_classification_methods` controls classification behavior, it may be worth centralizing that ordering in a single helper or clearly documenting it to reduce the risk of future changes accidentally altering classification precedence.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
🤖 I'm sorry @bashandbone, but I was unable to process your request. Please see the logs for more details. |
There was a problem hiding this comment.
Pull request overview
Optimizes the grammar-based semantic node classification hot loop by hoisting a static set of classifier steps out of classify_thing() and into GrammarBasedClassifier.__init__, reducing repeated allocations during frequent classifications.
Changes:
- Precomputes
self._classification_methodsonce inGrammarBasedClassifier.__init__and iterates it inclassify_thing(). - Updates the Jules “bolt” log with the micro-optimization learning/action note.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/codeweaver/semantic/classifier.py |
Hoists the classification method iteration list into a precomputed tuple on the classifier instance. |
.jules/bolt.md |
Documents the optimization pattern (precomputing bound method collections) for future reference. |
Comments suppressed due to low confidence (1)
src/codeweaver/semantic/classifier.py:862
- The inline comment claims
GrammarClassificationResultis aNamedTuple/tuple, but it’s defined in this module as aBasedModel(Pydantic model). This is misleading given the branching here is really distinguishing a single result vs a tuple of results from some classifiers; consider updating the comment to reflect the actual types returned by classification methods.
if classification := method(thing, language):
# Check if it's a tuple but NOT a GrammarClassificationResult (which is a NamedTuple, hence a tuple)
if not isinstance(classification, GrammarClassificationResult):
results.extend(classification)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
💡 What: Pre-computed the
self._classification_methodstuple of classification methods within the__init__constructor instead of creating an inline list structure every time theclassifyloop was executed.🎯 Why: When a static collection of instance methods needs to be repeatedly iterated over, constructing the collection over and over incurs double overhead: the allocation of the list memory and the creation of the bound instance method objects themselves.
📊 Impact: Reduces object creation time and decreases list allocation processing within the AST-based Node classification hotloop, improving raw iteration evaluation speeds.
🔬 Measurement: Verified using targeted tests (
mise //:test tests/unit/engine/chunker/) to ensure no regressions during classifications.PR created automatically by Jules for task 2745804533802310033 started by @bashandbone
Summary by Sourcery
Pre-compute and reuse the classifier’s bound method collection to reduce allocation overhead in the classification hot loop and document this optimization pattern in the Bolt guidelines.
Enhancements:
Documentation: