perf: Implement Fabric-aware get/set_local_poses on FabricFrameView #5516
pv-nvidia wants to merge 12 commits into
Conversation
Greptile Summary
This PR implements a fully Fabric-native
Confidence Score: 3/5
Safe to merge for stages with translation-only parent prims; stages where any parent or child prim carries a non-unit scale will produce incorrect world/local matrices after the first set_local_poses → get_world_poses round-trip.
The initial Fabric seed in _sync_fabric_from_usd_initial calls Orthonormalize() on every parent transform and then writes the parent world matrix with a hard-coded (1,1,1) scale, silently discarding any real parent scale. The child local matrix is seeded without scale for the same reason. Both sync kernels read the parent world matrix on every world↔local recomputation, so a wrong parent scale corrupts every child pose update after the first dirty-flag sync.
_sync_fabric_from_usd_initial in fabric_frame_view.py: the parent-world-matrix and child-local-matrix seeding blocks.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant C as Caller
participant FFV as FabricFrameView
participant WK as Warp Kernels
participant FA as Fabric (GPU)
participant USD as USD/UsdGeom
Note over FFV: _initialize_fabric()
FFV->>USD: get_world_poses / get_local_poses / get_scales
FFV->>USD: XformCache (parents)
FFV->>FA: compose_indexed_fabric_transforms (world+local, children)
FFV->>FA: compose_indexed_fabric_transforms (world, parents)
Note over FFV: set_world_poses()
C->>FFV: set_world_poses(positions, orientations)
FFV->>WK: compose_indexed_fabric_transforms worldMatrix[rw]
WK->>FA: write worldMatrix
FFV->>WK: update_indexed_local_matrix_from_world
WK->>FA: read worldMatrix[ro], parentWorld[ro]
WK->>FA: write localMatrix[rw]
Note over FFV: set_local_poses()
C->>FFV: set_local_poses(translations, orientations)
FFV->>WK: compose_indexed_fabric_transforms localMatrix[rw]
WK->>FA: write localMatrix
FFV->>FFV: "_world_dirty=True"
Note over FFV: get_world_poses() dirty path
C->>FFV: get_world_poses()
FFV->>WK: update_indexed_world_matrix_from_local
WK->>FA: read localMatrix[ro], parentWorld[ro]
WK->>FA: write worldMatrix[rw]
FFV->>FFV: "_world_dirty=False"
FFV->>WK: decompose_indexed_fabric_transforms
WK-->>C: (positions, orientations)
Reviews (7): Last reviewed commit: "Document isaaclab.utils.warp.fabric kern..."
Review: Fabric-aware get/set_local_poses + IK/OSC Integration
Updated review after new commits (5b4f91e → bb778db)
New Changes in This Push
1. Task-space dynamics accessors (Major Feature)
- Added body_link_jacobian_w, body_com_jacobian_w, mass_matrix, gravity_compensation_forces to ArticulationData
- New shift_jacobian_com_to_origin Warp kernel converts PhysX COM-referenced Jacobians to link-origin
- Proper cache invalidation in write paths (write_root_*_to_sim_index, write_joint_*_to_sim_index)
- Full DoF axis preserved (6 floating-base columns for floating-base assets)
2. FabricFrameView Major Rework
- Three persistent PrimSelection instances (trans_ro, world_rw, local_rw)
- Path-based view→fabric index mapping via wp.indexedfabricarray
- Proper parent-indexed arrays for world↔local transforms
- Per-view dirty tracking (not per-stage) to avoid cross-view corruption
- Class-level _static_hierarchy_cache for IFabricHierarchy handles
- clear_static_caches() method for test cleanup
3. Extensive Test Coverage
- Shape contract tests for Jacobian/mass matrix on fixed & floating base
- test_get_jacobians_link_origin_contract: J·q̇ = body velocities ✅
- test_get_mass_matrix_symmetry_pd: square, symmetric, positive-definite ✅
- test_get_gravity_compensation_forces_static_equilibrium: τ_gc holds arm static ✅
- IK tracking accuracy test with DifferentialIKController ✅
- OSC tracking accuracy tests with gravity compensation ✅
- Multi-view per-view dirty isolation test ✅
4. Newton Dependency Update
- newton @ git+...@v1.2.0rc2 → newton==1.2.0rc3 (packaged release)
5. Dexsuite Config Updates
- Switched from primitive shapes (CuboidCfg, SphereCfg) to mesh-based (MeshCuboidCfg, MeshSphereCfg)
- Consolidated physics presets into base PhysicsCfg class
- Removed PhysX hardcoding from IK/OSC env configs (now inherit parent preset)
Code Quality Notes
✅ COM→origin Jacobian shift - The kernel correctly applies v_origin = v_com - ω × (R · com_offset_b) per DoF column
✅ Cache invalidation - All write paths properly invalidate _body_com_jacobian_w, _mass_matrix, _gravity_compensation_forces
✅ ProxyArray pattern - Lazy init pattern matches existing body_link_pose_w, joint_pos etc.
✅ Floating-base DoF layout - Matches industry convention (Pinocchio, Drake, MuJoCo): 6 base-DoF columns prepended
✅ FabricFrameView - Clean separation of indexed arrays per selection, proper sync kernels
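The COM→origin shift quoted in the notes above can be checked on a single Jacobian column. The following is a minimal pure-Python sketch, not the Warp kernel itself: it assumes the column is split into a linear part and an angular part, that `rot` is the body's world rotation as a 3×3 row-major matrix, and all helper names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def rotate(rot, v):
    """Apply a 3x3 rotation matrix (row-major) to a 3-vector."""
    return tuple(sum(rot[i][j] * v[j] for j in range(3)) for i in range(3))


def shift_column_com_to_origin(lin_com, ang, rot, com_offset_b):
    """v_origin = v_com - w x (R * com_offset_b) for one DoF column."""
    r_world = rotate(rot, com_offset_b)
    w_x_r = cross(ang, r_world)
    return tuple(lin_com[i] - w_x_r[i] for i in range(3))


# Body spinning about world Z at 1 rad/s around the world origin.
# Link origin sits at (1, 0, 0); the COM is 0.5 further along body X.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ang = (0.0, 0.0, 1.0)
lin_com = cross(ang, (1.5, 0.0, 0.0))  # velocity of the COM point
lin_origin = shift_column_com_to_origin(lin_com, ang, identity, (0.5, 0.0, 0.0))
print(lin_origin)
```

The shifted linear part matches the velocity of the link origin itself, `cross(ang, (1, 0, 0))`, which is the property the link-origin contract test relies on.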
Previous Fixes (still valid)
- Parent world scale via Gf.Transform.GetScale() ✅
- wp.select() for branchless GPU execution ✅
- test_initial_seed_with_scaled_parent regression test ✅
Summary
This is a substantial feature addition that:
- Enables proper IK/OSC control on PhysX backend with correct Jacobian reference frames
- Fixes latent correctness bug where COM-referenced Jacobians were used with link-frame poses
- Adds comprehensive test coverage for cross-backend parity
- Cleans up physics preset inheritance
Approval recommended.
Update (e104e0e): Refactored if/else branching to wp.where() in Warp kernels (compose_fabric_transformation_matrix_from_warp_arrays, compose_indexed_fabric_transforms). This is a clean branchless pattern for GPU execution — good style improvement. No new concerns.
bab8ca0 to
5a04061
Compare
parent_unit_scale = wp.array(
    [[1.0, 1.0, 1.0]] * len(unique_parent_paths),
    dtype=wp.float32,
    device=self._device,
)
Parent world scale silently forced to (1,1,1) during initial seed
tf.Orthonormalize() strips scale from the USD local-to-world transform, and the subsequent compose call uses a hard-coded parent_unit_scale = [[1.0, 1.0, 1.0]]. Any parent prim with a non-unit world scale will have the wrong matrix in Fabric from the start. Both sync kernels (update_indexed_local_matrix_from_world and update_indexed_world_matrix_from_local) read this parent matrix for every world↔local recomputation, so the error propagates into every child world and local pose after that.
Suggested change:
- parent_unit_scale = wp.array(
-     [[1.0, 1.0, 1.0]] * len(unique_parent_paths),
-     dtype=wp.float32,
-     device=self._device,
- )
+ parent_scale_rows: list[list[float]] = []
+ for path in unique_parent_paths:
+     prim = usd_stage.GetPrimAtPath(path)
+     full_tf = UsdGeom.XformCache(Usd.TimeCode.Default()).GetLocalToWorldTransform(prim)
+     s = full_tf.ExtractScaleFactors()
+     parent_scale_rows.append([float(s[0]), float(s[1]), float(s[2])])
+ parent_unit_scale = wp.array(
+     parent_scale_rows,
+     dtype=wp.float32,
+     device=self._device,
+ )
# Compose into child localMatrix.
wp.launch(
    kernel=fabric_utils.compose_indexed_fabric_transforms,
    dim=self.count,
    inputs=[
        self._local_ifa_rw,
        _to_float32_2d(local_pos_ta.warp),
        _to_float32_2d(local_ori_ta.warp),
        self._fabric_empty_2d_array_sentinel,
        False,
        False,
        False,
        self._view_indices,
    ],
    device=self._device,
)
Child local-matrix scale not seeded from USD on freshly authored stages
When SetLocalXformFromUsd() is a no-op (freshly authored stage, not yet rendered), the child's localMatrix in Fabric starts at identity with scale (1,1,1). The compose call passes _fabric_empty_2d_array_sentinel for the scale slot, leaving the existing (identity) scale in the matrix. If a child prim has non-unit local scale, the first set_local_poses → get_world_poses round-trip calls update_indexed_world_matrix_from_local with the wrong local scale, producing an incorrect world-space scale. The fix is to pass scales_wp (already fetched above from _usd_view.get_scales()).
Suggested change:
- # Compose into child localMatrix.
- wp.launch(
-     kernel=fabric_utils.compose_indexed_fabric_transforms,
-     dim=self.count,
-     inputs=[
-         self._local_ifa_rw,
-         _to_float32_2d(local_pos_ta.warp),
-         _to_float32_2d(local_ori_ta.warp),
-         self._fabric_empty_2d_array_sentinel,
-         False,
-         False,
-         False,
-         self._view_indices,
-     ],
-     device=self._device,
- )
+ # Compose into child localMatrix (include scale so that
+ # _sync_world_from_local_if_dirty produces the right world-space scale).
+ wp.launch(
+     kernel=fabric_utils.compose_indexed_fabric_transforms,
+     dim=self.count,
+     inputs=[
+         self._local_ifa_rw,
+         _to_float32_2d(local_pos_ta.warp),
+         _to_float32_2d(local_ori_ta.warp),
+         _to_float32_2d(scales_wp),
+         False,
+         False,
+         False,
+         self._view_indices,
+     ],
+     device=self._device,
+ )
Replaces the earlier Python-based parent-loop implementation (which was
correct but ~3× slower than USD for local poses) with a fully GPU-side
Fabric path that follows the bareya/pbarejko/camera-update prototype:
* Three persistent ``PrimSelection`` instances differing only in
per-attribute access mode — one each for {trans_ro, world_rw, local_rw}.
* Path-based view → fabric index mapping computed once from
``selection.GetPaths()`` and stored as a Warp ``int32`` array. No
custom prim attributes are written to the stage.
* All transform reads and writes go through ``wp.indexedfabricarray``,
so the kernels just dereference ``ifa[view_index]`` instead of taking
a separate mapping argument.
* Stage-level ``IFabricHierarchy`` cache and dirty-stages set so multiple
``FabricFrameView`` instances on the same stage share state.
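The path-based mapping in the list above can be sketched in plain Python. This is an illustration under assumptions, not the PR's code: the function names are hypothetical, plain lists stand in for the Warp ``int32`` array, and the example paths are invented. The parent lookup uses ``rsplit`` on the prim path and rejects root-level prims with a dedicated error.

```python
def build_view_to_fabric_indices(view_paths, selection_paths):
    """Map each view element to its row in the selection's path ordering."""
    fabric_index = {path: i for i, path in enumerate(selection_paths)}
    missing = [p for p in view_paths if p not in fabric_index]
    if missing:
        raise KeyError(f"paths not found in selection: {missing}")
    return [fabric_index[p] for p in view_paths]


def parent_path(prim_path):
    """Return the parent prim path; root-level prims have none."""
    parent = prim_path.rsplit("/", 1)[0]
    if parent == "":
        raise ValueError(f"{prim_path!r} is a root-level prim with no parent to look up")
    return parent


# The view order and the selection order are independent of each other.
view_paths = ["/World/Env_0/Cam", "/World/Env_1/Cam"]
selection_paths = ["/World/Env_1", "/World/Env_1/Cam", "/World/Env_0", "/World/Env_0/Cam"]

child_indices = build_view_to_fabric_indices(view_paths, selection_paths)
parent_indices = build_view_to_fabric_indices(
    [parent_path(p) for p in view_paths], selection_paths
)
print(child_indices, parent_indices)
```

Because the mapping lives in a side array rather than in a prim attribute, nothing is authored on the stage, which is the property the commit message calls out.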
World ↔ local consistency is preserved through Warp kernels that run on
the affected write paths:
* ``set_local_poses`` writes ``omni:fabric:localMatrix`` directly via the
compose kernel, then a second kernel recomputes child worldMatrix from
``parent_world * child_local`` so the next ``get_world_poses`` read is
consistent. ``IFabricHierarchy.update_world_xforms()`` is *not* used
for this — in practice it re-reads USD's authored xformOps and would
overwrite the matrices we just authored.
* ``set_world_poses`` mirrors the above, recomputing
``child_local = inv(parent_world) * child_world`` after the write.
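The two write paths above are inverses of each other, which a small round-trip makes concrete. The sketch below is a simplification under stated assumptions: it uses 2D homogeneous 3×3 matrices in math (column-vector) convention instead of Fabric's 4×4 storage layout, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def rigid_inverse_2d(m):
    """Invert [[R, t], [0, 1]] as [[R^T, -R^T t], [0, 1]]."""
    r00, r01, tx = m[0]
    r10, r11, ty = m[1]
    return [
        [r00, r10, -(r00 * tx + r10 * ty)],
        [r01, r11, -(r01 * tx + r11 * ty)],
        [0, 0, 1],
    ]


# Parent rotated +90 degrees and placed at (2, 0); child offset by (1, 0) locally.
parent_world = [[0, -1, 2], [1, 0, 0], [0, 0, 1]]
child_local = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]

# set_local_poses direction: child_world = parent_world * child_local
child_world = matmul(parent_world, child_local)

# set_world_poses direction: child_local = inv(parent_world) * child_world
recovered_local = matmul(rigid_inverse_2d(parent_world), child_world)
print(child_world, recovered_local)
```

The rotated parent matters here: the child's local X offset shows up as a world Y offset, so a kernel that skipped the parent rotation would not survive the round-trip.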
Four new public Warp kernels in ``isaaclab.utils.warp.fabric``, in two pairs:
* ``decompose_indexed_fabric_transforms`` /
``compose_indexed_fabric_transforms`` — indexed-array variants of the
existing decompose/compose kernels.
* ``update_indexed_local_matrix_from_world`` /
``update_indexed_world_matrix_from_local`` — propagate one direction
using a parent indexed fabric array. Implemented directly in storage
convention (``local = world * inv(parent)``,
``world = local * parent``) — the four transposes the math-convention
form would imply cancel out under ``(A·B)^T = B^T·A^T``.
Benchmark (1024 prims, 50 iterations, RTX A6000):
Operation              USD (ms)   Fabric (ms)   Speedup
Get World Poses           12.33         0.044      282×
Set World Poses           27.98         0.117      240×
Interleaved Set→Get       41.34         0.160      258×
Get Local Poses            6.04         0.037      162×
Set Local Poses            8.54         0.053      162×
Local-pose ops went from ~3× slower than USD (in the earlier
torch-based parent-loop implementation) to ~160× faster, with the
new Fabric-side localMatrix authoring keeping ``test_set_world_updates_local``
passing without an xfail override.
Tests: 41 passed in the Fabric backend's contract + new coverage for
the Fabric-native ``set_local_poses``, ``get_scales``, and topology
rebuild paths.
Drops the no-longer-needed ``cd571d482`` Python-loop attempt.
Three review fixes on the indexedfabricarray refactor:
* set_scales wrote the new scale into worldMatrix but never refreshed localMatrix, so a subsequent get_local_poses returned the stale scale. Call _sync_local_from_world after the world write, matching set_world_poses.
* The view->fabric mapping was stored in a single shared _fabric_indices field that each accessor overwrote from its own selection's GetPaths() ordering. Selections do not guarantee a shared path ordering, so this was brittle and hard to reason about. Cache the mapping per selection (_trans_ro_fabric_indices, _world_rw_fabric_indices, _local_rw_fabric_indices) and pass it explicitly to _build_indexed_array. The three trans_ro accessors now share a _rebuild_trans_ro_arrays helper.
* Update two test comments that referenced the removed _fabric_usd_sync_done attribute to point at the lazy _initialize_fabric() call instead.
The chained set_world_poses -> get_world_poses round-trip in test_fabric_rebuild_after_topology_change goes through Warp's float32 SRT compose/decompose, which accumulates a few ULP of drift. At the test's position magnitudes (~4-6), one float32 ULP is ~4.77e-7, so the prior atol of 1e-7 demanded sub-ULP agreement and was sensitive to GPU/codegen variation -- it passed locally on the A6000 but flaked in CI. 1e-5 corresponds to roughly 20 ULP at those magnitudes: tight enough to catch any real bug (a wrong index or stale read would be at least ~1e-3 off given the test setup) and consistent with the shared contract harness in frame_view_contract_utils.py, which already documents and uses ATOL = 1e-5 for compose/decompose-through-float32 checks.
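The ULP figures in the tolerance argument above can be reproduced with the standard library. A minimal sketch, assuming positive finite inputs; `float32_ulp` is a hypothetical helper that measures the gap to the next representable float32 by incrementing the bit pattern.

```python
import struct


def float32_ulp(x):
    """Spacing between x (rounded to float32) and the next float32 above it.

    Valid for positive finite x only.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounded = struct.unpack("<f", struct.pack("<I", bits))[0]
    following = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return following - rounded


ulp_at_5 = float32_ulp(5.0)        # magnitudes in [4, 8) share one spacing
atol_in_ulps = 1e-5 / ulp_at_5     # how many ULP the new tolerance allows
print(ulp_at_5, atol_in_ulps)
```

The old tolerance of 1e-7 is smaller than one spacing at these magnitudes, which is why it could only pass when every intermediate rounding happened to agree.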
The previous _static_dirty_stages set was keyed by stage_id, but _sync_world_from_local_if_dirty only recomputes worldMatrix for the *calling view's* children before clearing the flag. With two views on the same stage, view B's world-read would clear the flag set by view A's set_local_poses, leaving A's worlds silently stale.

Replace the class-level set with a per-instance bool. Each view now tracks its own dirty state, which matches the actual scope of the recompute kernel and removes a mutable ClassVar.

Also:
- Raise a clearer error when _compute_parent_fabric_indices is asked to look up the parent of a root-level prim (rsplit produces ""), instead of bubbling up the generic "not found in selection" message.
- Document on the remaining _static_hierarchy_cache that it is not thread-safe by design (Isaac Lab's loop is single-threaded; adding a lock would negate the per-stage caching benefit).
- Update the module docstring to reflect the per-view dirty model and drop the stale reference to IFabricHierarchy.update_world_xforms.
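The cross-view stomp described in this commit can be reproduced with two toy classes. This is a sketch under assumptions: both classes are hypothetical stand-ins with no Fabric or Warp involved, and the "recompute" is reduced to flipping a staleness flag on the calling view.

```python
class SharedFlagView:
    """Dirty state keyed by stage: every view on the stage shares it."""

    dirty_stages = set()

    def __init__(self, stage_id):
        self.stage_id = stage_id
        self.world_stale = False

    def set_local_poses(self):
        self.world_stale = True
        SharedFlagView.dirty_stages.add(self.stage_id)

    def get_world_poses(self):
        if self.stage_id in SharedFlagView.dirty_stages:
            self.world_stale = False  # recompute covers only this view's children
            SharedFlagView.dirty_stages.discard(self.stage_id)
        return not self.world_stale  # True means the read is fresh


class PerViewFlagView:
    """Dirty state owned by the view that wrote."""

    def __init__(self):
        self.world_stale = False
        self.world_dirty = False

    def set_local_poses(self):
        self.world_stale = True
        self.world_dirty = True

    def get_world_poses(self):
        if self.world_dirty:
            self.world_stale = False
            self.world_dirty = False
        return not self.world_stale


# View A writes locals, view B reads worlds first, then A reads worlds.
a, b = SharedFlagView(stage_id=1), SharedFlagView(stage_id=1)
a.set_local_poses()
b.get_world_poses()
shared_a_fresh = a.get_world_poses()

pa, pb = PerViewFlagView(), PerViewFlagView()
pa.set_local_poses()
pb.get_world_poses()
per_view_a_fresh = pa.get_world_poses()
print(shared_a_fresh, per_view_a_fresh)
```

With the shared flag, B's read consumes the signal that A needed; with the per-instance flag, A still recomputes on its own next read.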
Three review fixes plus the missing coverage for the transpose
storage convention.
* Hierarchy cache eviction: cache key is now (stage_id, fabric_id) so a
recycled stage_id paired with a new Fabric attachment never returns a
stale handle. Added FabricFrameView.clear_static_caches() classmethod
for explicit teardown and wired it into the test fixture so cached
handles do not accumulate across the suite.
* PrepareForReuse no longer fires twice per sync. Both _sync_local_from_world
and _sync_world_from_local_if_dirty refresh trans_sel_ro exactly once,
then read _world_ifa_ro / _local_ifa_ro / _parent_world_ifa_ro directly
from the fields instead of going through the accessors.
* Class docstring rewritten to describe the actual PrepareForReuse policy
(every accessor calls it; idempotent and cheap in the steady state).
The prior wording claimed reads avoided PrepareForReuse, which has not
been true since the indexedfabricarray rewrite.
* New regression tests for the transpose storage convention. The standard
fixture parents are translation-only, so the rotation block is identity
and equal to its transpose - which means a wrong transpose convention
would still pass every existing test. Two new tests place a parent
rotated 90 degrees around Z and verify the world<->local round-trip:
- test_set_local_then_get_world_with_rotated_parent exercises
update_indexed_world_matrix_from_local
- test_set_world_then_get_local_with_rotated_parent exercises
update_indexed_local_matrix_from_world
Confirmed locally that flipping the multiply order in either kernel
makes the matching test fail.
45 tests pass on cpu and cuda:0.
FabricFrameView is referenced by fully-qualified name in the migration guide and in this PR's changelog fragment, but no RST file documented the module - so the Sphinx :class: and :meth: cross-refs were not resolvable. Add a thin automodule page mirroring the sibling pages under docs/source/api/lab_physx/ and register it in the API index toctree. This also picks up the new clear_static_caches() classmethod automatically via :members:.
The submodule was not surfaced anywhere in the Sphinx tree, so :func: cross-references to its kernels (added in the changelog fragment for this branch and used by FabricFrameView throughout) did not resolve. Add a Warp Fabric kernels subsection to isaaclab.utils.rst that automodule's the submodule, and add __all__ to fabric.py so the generated page lists only the eight public kernels - the type aliases (FabricArrayMat44d, ArrayUInt32, ...) and the re-imported `wp` / `TYPE_CHECKING` / `Any` symbols stay out of the rendered docs. The page covers both the pre-existing kernels (compose/decompose_fabric_transformation_matrix_*_warp_arrays, set_view_to_fabric_array, arange_k) and the four kernels added on this branch.
_sync_fabric_from_usd_initial had two scale-related bugs in the USD->Fabric seed path that produced silently wrong matrices whenever a parent or child had a non-unit scale. Both kernels that recompute world<->local consistency read those seeded matrices, so the error propagated.
* Parent worldMatrix was composed with a hardcoded (1, 1, 1) scale. Orthonormalize() strips scale from the local-to-world transform, so we now extract the scale via Gf.Transform.GetScale() *before* orthonormalizing and pass it through to the compose kernel.
* Child localMatrix was composed with the empty-array sentinel for the scale slot, leaving the kernel-side scale at the identity default. We now pass the locally-authored scale (already fetched via _usd_view.get_scales()) so the matrix carries the right scale.
* Child worldMatrix is still composed from get_world_poses() position and orientation plus the child's local scale, which is wrong when a parent has non-unit world scale. Instead of fixing the seed by hand (would require per-child world-scale lookups), mark the view dirty at the end of the seed. The very next world read fires _sync_world_from_local_if_dirty, which computes child_world = parent_world * child_local on the GPU - and with both matrices now correctly scaled, the multiply produces the right world-space scale automatically.

Added test_initial_seed_with_scaled_parent regression test: parent world scale (2, 1, 1), child local scale (3, 1, 1). Locally verified the test fails when either fix is reverted in isolation.
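The regression case above (parent world scale (2, 1, 1), child local scale (3, 1, 1)) can be worked through with pure scale matrices. A minimal sketch that deliberately ignores rotation and translation; the helper names are hypothetical, and per-axis scale is read back as the norm of each basis column.

```python
import math


def scale_matrix(sx, sy, sz):
    """3x3 matrix carrying only a per-axis scale."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, sz]]


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def axis_scales(m):
    """Per-axis scale as the Euclidean norm of each basis column."""
    return tuple(math.sqrt(sum(m[r][c] ** 2 for r in range(3))) for c in range(3))


child_local = scale_matrix(3.0, 1.0, 1.0)

# Correct seed: the parent world matrix keeps its real scale.
seeded_ok = matmul3(scale_matrix(2.0, 1.0, 1.0), child_local)

# Buggy seed: the parent was composed with a hard-coded (1, 1, 1) scale.
seeded_bad = matmul3(scale_matrix(1.0, 1.0, 1.0), child_local)

print(axis_scales(seeded_ok), axis_scales(seeded_bad))
```

The child's world X scale is the product of both scales only when the parent matrix was seeded with its real scale, which is what the deferred `parent_world * child_local` recompute depends on.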
While adding a regression test for the per-view world-dirty flag, I discovered the IFabricHierarchy cache silently misses on every view because ``Stage.GetFabricId()`` returns a fresh ``FabricId`` wrapper on every call, with no value equality between wrappers for the same underlying Fabric. The cache stored (stage_id, wrapper) tuples, so two views on the same stage produced two distinct cache keys and re-fetched the hierarchy on each init. The bug was harmless in practice -- USDRT's ``get_fabric_hierarchy`` itself returns a process-wide singleton per Fabric stage, so both views happened to end up with the same handle anyway -- but the cache wasn't doing the work it was advertised to do.

Fix: key the cache on ``(stage_id, fabric_id.id)`` where ``.id`` is the stable ``int`` underneath the wrapper.

The new test exercises the multi-view-per-stage scenario:
* two FabricFrameView instances on disjoint child prims under different parent sub-trees of one stage
* writes on view A must not dirty view B
* world reads on view B must not clear view A's dirty flag
* both views share one cached IFabricHierarchy and the cache has exactly one entry after both inits (this assertion is what surfaced the FabricId-wrapper bug)
* symmetric pass: writes on B must not affect A's post-read state

Verified locally that the test fails with both the wrapper-keyed cache (cache size > 1) and with a synthetic stage-shared dirty flag (cross-view stomp).

Module-level coverage of fabric_frame_view.py with this test added: 85% line / 78% branch. Remaining uncovered code is the USD-fallback delegations (Fabric disabled), defensive RuntimeError raises, and topology-rebuild branches inside the accessors.

49 tests pass on cpu and cuda:0.
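The cache miss described in this commit is ordinary Python dictionary behaviour and can be shown without USDRT. In the sketch below, `FabricIdWrapper` and `get_fabric_id` are hypothetical stand-ins for the real wrapper and accessor; the only assumption carried over from the commit message is that each call returns a new object with no value equality and a stable integer `.id`.

```python
class FabricIdWrapper:
    """Stand-in handle: identity-based equality, stable integer underneath."""

    def __init__(self, raw_id):
        self.id = raw_id


def get_fabric_id():
    """Return a fresh wrapper on every call, as the commit describes."""
    return FabricIdWrapper(42)


stage_id = 7
wrapper_keyed_cache = {}
int_keyed_cache = {}

# Two views initialise against the same stage and the same underlying Fabric.
for _ in range(2):
    fabric_id = get_fabric_id()
    wrapper_keyed_cache.setdefault((stage_id, fabric_id), "hierarchy handle")
    int_keyed_cache.setdefault((stage_id, fabric_id.id), "hierarchy handle")

print(len(wrapper_keyed_cache), len(int_keyed_cache))
```

Asserting on the number of cache entries, as the new test does, is what distinguishes a working cache from one that merely happens to return an equivalent handle.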
5b4f91e to
bb778db
Compare
… kernels Replace if/else broadcast branching with wp.where() for branchless predicated selection. wp.select was deprecated in Warp 1.7 and removed in 1.10; wp.where has the more intuitive (cond, true, false) order.
Description
FabricFrameView.set_world_poses updated Fabric's omni:fabric:worldMatrix, but get_local_poses still delegated to USD, so after a world write the next local read returned stale values until USD was re-synced. Symmetrically, set_local_poses went through USD and Fabric never saw the write until the next sync. The shared contract test test_set_world_updates_local was marked xfail(strict=True) to acknowledge this.

This PR replaces the USD-fallback local-pose path with a fully Fabric-native implementation. World and local matrices are read and written directly on omni:fabric:worldMatrix / omni:fabric:localMatrix via wp.indexedfabricarray; consistency in both directions is maintained by Warp kernels.

The design rests on three pieces: three persistent PrimSelection instances differing only in per-attribute access mode (trans_ro, world_rw, local_rw); path-based view→fabric index mapping (no custom prim attributes written); and wp.indexedfabricarray for every (selection × attribute) pair, which bakes the view→fabric mapping into the array itself so kernels read element i of the view directly instead of taking a separate index-mapping argument. World↔local consistency is maintained entirely on the GPU through two Warp kernels: no Python-side parent walks, no per-call USD reads. The result is faster than the USD baseline by two orders of magnitude on every measured operation (see Benchmark results below).

Implementation summary
- Three persistent PrimSelection instances: trans_ro (read both matrices), world_rw (write world, read local), local_rw (read world, write local). Each selection caches its own view→fabric index mapping built from selection.GetPaths(); selections do not guarantee a shared path ordering, so the mappings are kept independent.
- No isaaclab:view_index:HASH attributes written anywhere on the prims. Fabric groups prims into buckets by their attribute schema; writing a per-view index attribute would split each touched bucket into a "with-attr" and "without-attr" half and proliferate further with each additional view on the stage. Keeping the mapping out of Fabric leaves the bucket layout untouched, which keeps SelectPrims enumeration and wp.fabricarray reads more contiguous.
- wp.indexedfabricarray everywhere: bakes the mapping into the array itself, removing a per-launch indirection.
- After set_local_poses the view is marked dirty; the next world read fires a Warp kernel that recomputes child_world = parent_world * child_local for this view's children. Tracking is per-view (not per-stage) so multiple FabricFrameView instances on the same stage cannot clear each other's dirty flag.
- After set_world_poses / set_scales, a mirror Warp kernel recomputes child_local = inv(parent_world) * child_world so subsequent get_local_poses returns consistent values.
- IFabricHierarchy cache keyed by (stage_id, fabric_id) so a recycled stage_id paired with a new Fabric attachment never returns a stale handle. FabricFrameView.clear_static_caches() is the explicit teardown hook.
- Four new kernels in isaaclab.utils.warp.fabric: decompose_indexed_fabric_transforms, compose_indexed_fabric_transforms, update_indexed_local_matrix_from_world, update_indexed_world_matrix_from_local.

Transpose-storage convention
Fabric stores 4×4 transforms in column-transposed form. The two sync kernels apply the identity (A·B)ᵀ = Bᵀ·Aᵀ (and inv(A)ᵀ = inv(Aᵀ)) to compute the equivalent multiply on the stored matrices without explicit transposes:
- local = inv(parent) · world → stored local = world · inv(parent)
- world = parent · local → stored world = local · parent
Two new regression tests pin this convention; see Tests below.
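The storage-convention identity can be verified numerically with a rotated parent. A minimal pure-Python sketch, assuming "stored" simply means the transpose of the math-convention matrix; the matrices and helper names are illustrative and not taken from the kernels.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


# Math convention (column vectors): parent rotated +90 degrees about Z at (2, 0, 0),
# child translated by (1, 0, 0) in the parent frame.
parent = [[0, -1, 0, 2], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
local = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
world = matmul(parent, local)

# Storage convention: every matrix is kept transposed.
stored_parent, stored_local = transpose(parent), transpose(local)

# (parent . local)^T = local^T . parent^T, so no explicit transposes are needed.
stored_world = matmul(stored_local, stored_parent)

# Flipping the multiply order gives (local . parent)^T instead.
flipped = matmul(stored_parent, stored_local)
print(stored_world[3][:3], flipped[3][:3])
```

In the stored form the translation sits in the last row; the correct order places the child at (2, 1, 0) in world space, while the flipped order gives a different translation, which is the kind of discrepancy the rotated-parent tests are designed to expose.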
Benchmark results
1024 prims, 50 iterations, NVIDIA RTX A6000.
Reproduce with:
Tests
Local run of source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py on cpu and cuda:0: 45 passed, 0 failed.
- Removed the xfail(strict=True) on the shared test_set_world_updates_local; it now passes.
- test_set_local_via_fabric_path: set_local_poses writes omni:fabric:localMatrix; the deferred world = parent * local recompute lands the right world pose.
- test_get_scales_fabric_path: exercises the Fabric-native get_scales.
- test_prepare_for_reuse_detects_topology_change: confirms PrimSelection.PrepareForReuse is wired up and returns the expected topology-change signal.
- test_fabric_rebuild_after_topology_change: simulates a topology change and confirms the indexed arrays rebuild and continue producing correct results.
- Two rotated-parent regression tests (test_set_local_then_get_world_with_rotated_parent, test_set_world_then_get_local_with_rotated_parent). The standard fixture's parents are translation-only, so the rotation block is identity and equal to its own transpose; a wrong transpose convention in either sync kernel would silently pass every other test. These two tests place the parent at a non-trivial rotation (+90° around Z) and verify the round-trip in both directions. Locally confirmed that flipping the multiply order in either of the two sync kernels makes the matching test fail.

Iterations during review
- Stale localMatrix after set_scales: added the missing _sync_local_from_world call so a write through scale also propagates the new diagonal back into the local matrix.
- Shared _fabric_indices across selections: each selection now caches its own indices array; _build_indexed_array takes them as an explicit parameter, removing the silent-corruption hazard the shared field invited.
- Per-view dirty tracking: the dirty flag is now per view, so view B's world read can no longer clear the flag that view A's set_local_poses had set (which would have left A's worlds silently stale).
- _static_hierarchy_cache lifecycle: cache key now (stage_id, fabric_id); added FabricFrameView.clear_static_caches() and call it from the test teardown fixture so handles do not accumulate across the suite.
- Duplicated PrepareForReuse in syncs: both _sync_local_from_world and _sync_world_from_local_if_dirty now refresh trans_sel_ro exactly once per call, then read _world_ifa_ro, _local_ifa_ro and _parent_world_ifa_ro directly from the fields, avoiding the duplicated USDRT round-trip.
- Docstring on PrepareForReuse: the class docstring now describes the actual behaviour: every accessor calls it; it is cheap and idempotent in the steady state.
- _compute_parent_fabric_indices now raises a dedicated error when a prim has no parent path, rather than bubbling up the generic "not found in selection" message.

Type of change
Checklist
- ./isaaclab.sh --format
- config/extension.toml file
- CONTRIBUTORS.md or my name already exists there