Skip to content

[vm][gas] Optimize type creation gas charging#19306

Closed
georgemitenkov wants to merge 3 commits into
mainfrom
george/ty-costs
Closed

[vm][gas] Optimize type creation gas charging#19306
georgemitenkov wants to merge 3 commits into
mainfrom
george/ty-costs

Conversation

@georgemitenkov
Copy link
Copy Markdown
Contributor

@georgemitenkov georgemitenkov commented Apr 2, 2026

Description

charge_create_ty charges gas for type instantiation every time a generic instruction executes. But actual type substitution only happens on the first execution per instruction index within a frame — subsequent executions serve cached results from FrameTypeCache. In loops, this overcharges by a factor of N (loop iterations). Similarly, non-generic function frame creation re-charged local type counts on every call despite types never changing.

This PR has three commits:

1. Charge type creation on cache miss only — Adds a charge_create_ty_on_cache_hit flag to VMConfig, gated on gas feature version (< RELEASE_V1_45). For older versions, behavior is identical: gas is charged on every execution, per-field. For newer versions, gas is only charged on cache miss and uses aggregated sums instead of per-field loops. Refactors FrameTypeCache with struct-of-arrays storage and purpose-specific getters (types-only for runtime checks, counts for gas).

2. Make function types use arcs — Wraps Type::Function args and results in TriompheArc<Vec<Type>> so that Type::clone() is O(1) for all variants that can appear as struct fields. Without this, runtime type checks that clone cached types every loop iteration could be exploited for uncharged work with function-typed fields.

3. Charge non-generic frame local types once — Non-generic functions were charging charge_create_ty for every local on every call, even when the frame cache was reused. Now both generic and non-generic paths cache local_ty_counts and skip charging on cache hit (or charge for compatibility on older gas versions).

Behavior change

  • Old gas version (< RELEASE_V1_45): no change, identical gas semantics
  • New gas version (>= RELEASE_V1_45): generic instructions in loops and repeated function calls pay type creation gas once instead of every iteration/call

How Has This Been Tested?

  • cargo check -p move-vm-runtime
  • cargo check -p move-vm-types
  • Replay verification pending

Key Areas to Review

  • frame_type_cache.rs: new cache API — Option-based returns encode miss/hit for gas, flag controls behavior
  • interpreter.rs: ~20 call sites updated to use new getters
  • frame.rs: unified local type charging for generic and non-generic functions
  • prod_configs.rs: feature gating on RELEASE_V1_45

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Skip type creation gas charges when the type is already cached in the
frame type cache, gated by gas feature version (< V1_45 retains old
behavior). This significantly reduces gas costs for repeated generic
function calls, e.g., in loops. Also refactors FrameTypeCache to use
dedicated structs and separate type-only vs count-only accessors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Wrap Type::Function args and results in TriompheArc<Vec<Type>> so that
cloning function types is O(1) (atomic ref-count increment) instead of
O(n) (deep copy of both Vec<Type>). This is a prerequisite for safely
charging type creation gas only on cache miss: without O(1) clones,
runtime type checks that clone struct field types every loop iteration
could be exploited for uncharged work with function-typed fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@georgemitenkov georgemitenkov changed the title [vm][gas] Charge type creation on cache hit [vm][gas] Optimize type creation gas charging Apr 2, 2026
Non-generic functions were charging charge_create_ty for every local on
every call, even when the frame cache was reused across calls. Generic
functions already cached local_ty_counts and skipped on cache hit. Apply
the same pattern to non-generic functions so local type node counts are
computed and charged once, then skipped on subsequent calls (or charged
for compatibility on older gas versions).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@georgemitenkov
Copy link
Copy Markdown
Contributor Author

@security-bug-bot

Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Differential Security Review — PR #19306: [vm][gas] Optimize type creation gas charging

Date: 2026-04-07 | Reviewer: Automated security review

Executive Summary

Severity Count
HIGH 2
MEDIUM 2
LOW 1

Overall risk: The optimization logic is structurally sound, but two independent semantics gaps introduce potential for gas undercharging and defensive-check degradation across generic instructions on any gas version. The absence of deterministic unit tests for the new cache-hit gas path is itself an elevated risk.

Recommendation: REVIEW BEFORE MERGE


[HIGH] Finding 1: VecPack bypasses the new gas-charging API — reads private field directly

File: third_party/move/move-vm/runtime/src/interpreter.rs (VecPack handler)

Every other vector instruction (VecLen, VecImmBorrow, VecMutBorrow, VecPushBack, VecPopBack, VecUnpack, VecSwap) delegates the charge-on-cache-hit decision to get_signature_index_type_count, which encapsulates charge_create_ty_on_cache_hit || is_miss. VecPack alone reads frame_cache.charge_create_ty_on_cache_hit directly and reimplements the same logic inline.

Today the values match. The risk is structural: if get_signature_index_type_count's policy ever changes (e.g., per-instruction overrides), VecPack will silently diverge from all other vector instructions. The private-field read also bypasses the single authoritative API designed to govern this decision.

Alignment factors:

  1. SPLIT-STAGE INCONSISTENCY: VecPack uses a different code path from every other vector instruction for the identical charge decision.
  2. IMPLICIT UPSTREAM ASSUMPTION: The API contract of get_signature_index_type_count is not enforced for VecPack.

Impact: Defense-in-depth degradation; latent divergence vector if gas policy evolves.


[HIGH] Finding 2: UnpackGeneric field-count source changed — independent cross-check weakened

File: third_party/move/move-vm/runtime/src/interpreter.rs (UnpackGeneric handler)

In the old code, num_expected_fields for the post-unpack stack-count integrity check came from ty_and_field_counts.len() — the length of the dynamically instantiated field-type array. This guaranteed that the expected-count check used the same data source as the instantiation itself.

In the new code, num_expected_fields comes from self.field_instantiation_count(*si_idx), which reads struct_instantiations[idx].field_count — a value stored at module load time. The field-type data still comes from instantiate_generic_struct_fields (via the cache). These two derivations are logically consistent but are now independent paths.

If they ever diverge (e.g., loader bug, instantiation edge case), the num_expected_fields != num_actual_fields integrity check would fail to catch the discrepancy it was designed to detect. The old code had zero distance between the count source and the data being counted.

Impact: Removes a defense-in-depth layer from the UnpackGeneric integrity check.


[MEDIUM] Finding 3: counts_sum aggregation changes gas granularity — linearity assumption undocumented

File: third_party/move/move-vm/runtime/src/frame_type_cache.rs (CachedFieldAndStructType::new)

Under the old gas version, ImmBorrowFieldGeneric, MutBorrowFieldGeneric, and their variant equivalents make two separate gas_meter.charge_create_ty(...) calls (one for struct type, one for field type). Under the new gas version (cache miss), a single call gas_meter.charge_create_ty(field_count + struct_count) is made.

These are equivalent only if charge_create_ty is a linear function with no per-call overhead. The optimization implicitly assumes charge_create_ty(a) + charge_create_ty(b) == charge_create_ty(a + b). This assumption is not documented anywhere in the code.

If a future gas schedule introduces a per-call overhead in charge_create_ty, generic field borrow instructions will be systematically undercharged under the new gas version. The GasQuantity::add uses saturating_add, so the sum itself is safe from overflow — the concern is the semantic change from N calls with individual values to 1 call with their aggregate.

Impact: Potential undercharge if gas meter linearity assumption is violated by future schedule changes.


[MEDIUM] Finding 4: Non-generic function local_ty_counts cache persists across invocations when enable_function_caches = true

File: third_party/move/move-vm/runtime/src/frame.rs (Frame::make_new_frame, non-generic path)

When enable_function_caches = true, InterpreterFunctionCaches retains FrameTypeCache instances keyed by function pointer. The new code for non-generic functions now stores local_ty_counts in this shared cache. Under gas version >= RELEASE_V1_45 (charge_create_ty_on_cache_hit = false), every invocation of a non-generic function after the first pays zero local-type-creation gas, across any call boundary that shares the same InterpreterFunctionCaches session.

The PR description targets "repeated calls to generic functions in a loop" but the non-generic optimization operates at the cache level, not the frame level. It is not clearly documented whether this zero-charge-on-reuse behavior is intentional for non-generic functions called across separate transactions within the same session.

Impact: Potential gas undercharge for repeated non-generic function calls within a shared interpreter session.


[LOW] Finding 5: Doc comment typo in VMConfig

File: third_party/move/move-vm/runtime/src/config.rs

The doc comment for charge_create_ty_on_cache_hit reads ///making it much cheaper — missing a space after ///.


Test Coverage Gap

The PR description notes "replay verification pending." There are no deterministic unit tests covering:

  • Cache-hit path for any new get_*_counts_and_sum / get_*_type_and_count getter under the new gas version
  • VecPack in a loop to verify charge is paid exactly once
  • Non-generic function called twice within the same session to verify local type charge behavior

The absence of these tests means any regression in the gas-charging optimizations will go undetected until replay verification.


Recommendations

Before merge:

  1. Refactor VecPack to call get_signature_index_type_count instead of reading charge_create_ty_on_cache_hit directly — enforce the API contract.
  2. Restore num_expected_fields = <field_types>.len() for UnpackGeneric, or add a debug assertion that field_instantiation_count equals instantiate_generic_struct_fields(...).len().
  3. Document the linearity assumption for counts_sum charging (Finding 3), or replace with two separate calls as in the legacy path.
  4. Clarify and test cross-invocation behavior for non-generic function local_ty_counts caching (Finding 4).
  5. Add unit tests for cache-miss/hit paths under the new gas version before completing replay verification.
Open in Web View Automation 

Sent by Cursor Automation: Security Review Bot

)?;
}
gas_meter.charge_create_ty(ty_count)?;
if charge_create_ty_on_cache_hit || is_miss {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH - Finding 1] VecPack reads frame_cache.charge_create_ty_on_cache_hit directly instead of calling get_signature_index_type_count. Every other vector instruction (VecLen, VecImmBorrow, VecMutBorrow, VecPushBack, VecPopBack, VecUnpack, VecSwap) delegates to get_signature_index_type_count, which is the single authoritative API for the charge_create_ty_on_cache_hit || is_miss decision. VecPack bypasses it.

Today the result is identical, but if get_signature_index_type_count's policy ever changes, VecPack will silently diverge. Consider refactoring to:

if let Some(count) = frame_cache.get_signature_index_type_count(*si, self)? {
    gas_meter.charge_create_ty(count)?;
}

(retaining the separate get_signature_index_type_and_count_and_depth call only for the depth check and the ty reference needed for Vector::pack).

interpreter.operand_stack.push(value)?;
}

let num_expected_fields = self.field_instantiation_count(*si_idx) as usize;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH - Finding 2] num_expected_fields now comes from field_instantiation_count(*si_idx) (static module load-time value) rather than from the length of the dynamically instantiated field-type array (ty_and_field_counts.len() in the old code).

The old source ensured the post-unpack stack-count check used the same data that drove the unpack itself. These two derivations are logically equivalent, but they are now independent code paths. If they ever diverge (e.g., a loader edge case), the integrity check will silently fail to catch the discrepancy it was designed to detect. Consider restoring the original source, or adding a debug assertion:

debug_assert_eq!(
    self.field_instantiation_count(*si_idx) as usize,
    /* cached field types len */
);

fn new(field_ty: Type, struct_ty: Type) -> Self {
let field_count = NumTypeNodes::new(field_ty.num_nodes() as u64);
let struct_count = NumTypeNodes::new(struct_ty.num_nodes() as u64);
let counts_sum = field_count + struct_count;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM - Finding 3] counts_sum = field_count + struct_count is used as a single aggregated charge_create_ty call on cache miss under the new gas version. This is numerically equivalent to two separate calls only if charge_create_ty is a linear function with no per-call overhead (i.e., charge(a) + charge(b) == charge(a + b)).

This linearity assumption is not documented. If a future gas schedule adds a per-call component to charge_create_ty, generic field borrow instructions will be systematically undercharged. Please add a comment here documenting the assumption, e.g.:

// Aggregating field_count + struct_count is valid only if charge_create_ty is purely
// linear (no per-call overhead). If the gas schedule changes, revisit this.
let counts_sum = field_count + struct_count;

for cnt in local_ty_counts.iter() {
gas_meter.charge_create_ty(*cnt)?;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM - Finding 4] When enable_function_caches = true, this FrameTypeCache (and its local_ty_counts) is shared across multiple invocations of the same non-generic function — potentially across transaction boundaries within the same session.

Under the new gas version (charge_create_ty_on_cache_hit = false), every invocation after the first will pay zero local-type-creation gas. The PR description targets "repeated calls in a loop" as the intended case, but does not explicitly address cross-transaction reuse. Please clarify: is zero-charge-on-reuse for non-generic functions intentional across transaction boundaries, or should it be limited to within-loop / within-frame reuse?

@@ -76,6 +76,11 @@ pub struct VMConfig {
/// When enabled, non-private structs and enums with the copy ability can be used as
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW - Finding 5] Missing space in doc comment: ///making it much cheaper should be /// making it much cheaper.

@georgemitenkov georgemitenkov deleted the george/ty-costs branch April 23, 2026 09:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant