perf: Implement Fabric-aware get/set_local_poses on FabricFrameView by pv-nvidia · Pull Request #5516 · isaac-sim/IsaacLab

pv-nvidia · 2026-05-06T12:41:09Z

Description

FabricFrameView.set_world_poses updated Fabric's omni:fabric:worldMatrix, but get_local_poses still delegated to USD — so after a world write the next local read returned stale values until USD was re-synced. Symmetrically, set_local_poses went through USD and Fabric never saw the write until the next sync. The shared contract test test_set_world_updates_local was marked xfail(strict=True) to acknowledge this.

This PR replaces the USD-fallback local-pose path with a fully Fabric-native implementation. World and local matrices are read and written directly on omni:fabric:worldMatrix / omni:fabric:localMatrix via wp.indexedfabricarray; consistency in both directions is maintained by Warp kernels.

The design rests on three pieces: three persistent PrimSelection instances differing only in per-attribute access mode (trans_ro, world_rw, local_rw); path-based view→fabric index mapping (no custom prim attributes written); and wp.indexedfabricarray for every (selection × attribute) pair, which bakes the view→fabric mapping into the array itself so kernels read element i of the view directly instead of taking a separate index-mapping argument. World↔local consistency is maintained entirely on the GPU through two Warp kernels — no Python-side parent walks, no per-call USD reads. The result is faster than the USD baseline by two orders of magnitude on every measured operation (see Benchmark results below).

Implementation summary

Three persistent PrimSelection instances — trans_ro (read both matrices), world_rw (write world, read local), local_rw (read world, write local). Each selection caches its own view→fabric index mapping built from selection.GetPaths(); selections do not guarantee a shared path ordering, so the mappings are kept independent.
Path-based view→fabric index mapping — no isaaclab:view_index:HASH attributes written anywhere on the prims. Fabric groups prims into buckets by their attribute schema; writing a per-view index attribute would split each touched bucket into a "with-attr" and "without-attr" half and proliferate further with each additional view on the stage. Keeping the mapping out of Fabric leaves the bucket layout untouched, which keeps SelectPrims enumeration and wp.fabricarray reads more contiguous.
wp.indexedfabricarray everywhere — bakes the mapping into the array itself, removing a per-launch indirection.
Per-view dirty flag for world↔local consistency:
- After set_local_poses the view is marked dirty; the next world read fires a Warp kernel that recomputes child_world = parent_world * child_local for this view's children. Tracking is per-view (not per-stage) so multiple FabricFrameView instances on the same stage cannot clear each other's dirty flag.
- After set_world_poses / set_scales, a mirror Warp kernel recomputes child_local = inv(parent_world) * child_world so subsequent get_local_poses returns consistent values.
Stage-level IFabricHierarchy cache keyed by (stage_id, fabric_id) so a recycled stage_id paired with a new Fabric attachment never returns a stale handle. FabricFrameView.clear_static_caches() is the explicit teardown hook.
Four new public Warp kernels in isaaclab.utils.warp.fabric: decompose_indexed_fabric_transforms, compose_indexed_fabric_transforms, update_indexed_local_matrix_from_world, update_indexed_world_matrix_from_local.

Transpose-storage convention

Fabric stores 4×4 transforms in column-transposed form. The two sync kernels apply the identity (A·B)ᵀ = Bᵀ·Aᵀ (and inv(A)ᵀ = inv(Aᵀ)) to compute the equivalent multiply on the stored matrices without explicit transposes:

math local = inv(parent) · world → stored local = world · inv(parent)
math world = parent · local → stored world = local · parent

Two new regression tests pin this convention; see Tests below.

Benchmark results

1024 prims, 50 iterations, NVIDIA RTX A6000.

Operation	USD (ms)	Fabric (ms)	Speedup
Get World Poses	13.4	0.057	238×
Set World Poses	31.3	0.139	226×
Interleaved Set→Get	44.6	0.172	259×
Get Local Poses	6.1	0.039	158×
Set Local Poses	8.5	0.054	159×

Reproduce with:

./isaaclab.sh -p scripts/benchmarks/benchmark_view_comparison.py \
  --num_envs 1024 --num_iterations 50 --backends usd fabric \
  --device cuda:0 --headless

Tests

Local run of source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py on cpu and cuda:0: 45 passed, 0 failed.

Removed the xfail(strict=True) on the shared test_set_world_updates_local — it now passes.
test_set_local_via_fabric_path — set_local_poses writes omni:fabric:localMatrix; the deferred world = parent * local recompute lands the right world pose.
test_get_scales_fabric_path — exercises the Fabric-native get_scales.
test_prepare_for_reuse_detects_topology_change — confirms PrimSelection.PrepareForReuse is wired up and returns the expected topology-change signal.
test_fabric_rebuild_after_topology_change — simulates a topology change and confirms the indexed arrays rebuild and continue producing correct results.
Transpose-convention verification (test_set_local_then_get_world_with_rotated_parent, test_set_world_then_get_local_with_rotated_parent). The standard fixture's parents are translation-only, so the rotation block is identity and equal to its own transpose — a wrong transpose convention in either sync kernel would silently pass every other test. These two tests place the parent at a non-trivial rotation (+90° around Z) and verify the round-trip in both directions. Locally confirmed that flipping the multiply order in either of the two sync kernels makes the matching test fail.

Iterations during review

Stale localMatrix after set_scales — added the missing _sync_local_from_world call so a write through scale also propagates the new diagonal back into the local matrix.
Shared _fabric_indices across selections — each selection now caches its own indices array; _build_indexed_array takes them as an explicit parameter, removing the silent-corruption hazard the shared field invited.
Stage-keyed dirty flag — replaced with a per-view bool. With multiple views on a stage, view B reading worlds no longer clears the flag that view A's set_local_poses had set (which would have left A's worlds silently stale).
_static_hierarchy_cache lifecycle — cache key now (stage_id, fabric_id); added FabricFrameView.clear_static_caches() and call it from the test teardown fixture so handles do not accumulate across the suite.
Redundant PrepareForReuse in syncs — both _sync_local_from_world and _sync_world_from_local_if_dirty now refresh trans_sel_ro exactly once per call, then read _world_ifa_ro, _local_ifa_ro and _parent_world_ifa_ro directly from the fields, avoiding the duplicated USDRT round-trip.
Docstring vs reality on PrepareForReuse — the class docstring now describes the actual behaviour: every accessor calls it; it is cheap and idempotent in the steady state.
Clearer error for root-level prims — _compute_parent_fabric_indices now raises a dedicated error when a prim has no parent path, rather than bubbling up the generic "not found in selection" message.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog and the corresponding version in the extension's config/extension.toml file
I have added my name to the CONTRIBUTORS.md or my name already exists there

Per the fragment-based changelog system (#5434), this PR ships fragments at source/isaaclab/changelog.d/fix-fabric-local-matrix.rst and source/isaaclab_physx/changelog.d/fix-fabric-local-matrix.rst instead of editing CHANGELOG.rst and config/extension.toml directly — the nightly CI workflow compiles those from the fragments.

greptile-apps · 2026-05-06T12:46:32Z

Greptile Summary

This PR implements a fully Fabric-native get_local_poses / set_local_poses path on FabricFrameView, replacing the previous USD-fallback that caused stale reads after Fabric world-pose writes. It introduces four new Warp kernels for indexed Fabric matrix I/O and world↔local consistency, three persistent PrimSelection instances with per-selection index caches, a per-view dirty flag, and a (stage_id, fabric_id)-keyed IFabricHierarchy cache with an explicit clear_static_caches() teardown hook.

All previously flagged issues from review threads are addressed: per-selection fabric-index caches eliminate the shared-index corruption hazard; per-view _world_dirty flag prevents cross-view dirty-flag leakage; _static_hierarchy_cache key is (stage_id, fabric_id); clear_static_caches() is added and wired into test teardown.
New Fabric-native local-pose path: set_local_poses writes omni:fabric:localMatrix directly and marks the view dirty; the next get_world_poses fires update_indexed_world_matrix_from_local; symmetrically set_world_poses fires update_indexed_local_matrix_from_world so subsequent get_local_poses is consistent.
_sync_fabric_from_usd_initial seeds parent world matrices with hard-coded unit scale and child local matrices without scale, so prims with non-unit scale will produce incorrect world/local matrices after the first sync-kernel invocation.

Confidence Score: 3/5

Safe to merge for stages with translation-only parent prims; stages where any parent or child prim carries a non-unit scale will produce incorrect world/local matrices after the first set_local_poses → get_world_poses round-trip.

The initial Fabric seed in _sync_fabric_from_usd_initial calls Orthonormalize() on every parent transform and then writes the parent world matrix with a hard-coded (1,1,1) scale, silently discarding any real parent scale. The child local matrix is seeded without scale for the same reason. Both sync kernels read the parent world matrix on every world↔local recomputation, so a wrong parent scale corrupts every child pose update after the first dirty-flag sync.

_sync_fabric_from_usd_initial in fabric_frame_view.py — the parent-world-matrix and child-local-matrix seeding blocks.

Important Files Changed

Filename	Overview
source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py	Core implementation of Fabric-native local-pose path; all previously flagged issues addressed, but _sync_fabric_from_usd_initial seeds parent world matrices with unit scale and child local matrices without scale, causing incorrect sync for prims with non-unit scale.
source/isaaclab/isaaclab/utils/warp/fabric.py	Four new Warp kernels for indexed Fabric matrix decompose/compose and world↔local sync; transpose-convention is correct and well-documented.
source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py	xfail removed, six new tests added including transpose-convention regression tests with a rotated parent; clear_static_caches() wired into autouse teardown fixture.
scripts/benchmarks/benchmark_view_comparison.py	Added interleaved set→get and local-pose benchmarks; ProxyArray compatibility handled defensively; straightforward extension.

Sequence Diagram

sequenceDiagram
    participant C as Caller
    participant FFV as FabricFrameView
    participant WK as Warp Kernels
    participant FA as Fabric (GPU)
    participant USD as USD/UsdGeom

    Note over FFV: _initialize_fabric()
    FFV->>USD: get_world_poses / get_local_poses / get_scales
    FFV->>USD: XformCache (parents)
    FFV->>FA: compose_indexed_fabric_transforms (world+local, children)
    FFV->>FA: compose_indexed_fabric_transforms (world, parents)

    Note over FFV: set_world_poses()
    C->>FFV: set_world_poses(positions, orientations)
    FFV->>WK: compose_indexed_fabric_transforms worldMatrix[rw]
    WK->>FA: write worldMatrix
    FFV->>WK: update_indexed_local_matrix_from_world
    WK->>FA: read worldMatrix[ro], parentWorld[ro]
    WK->>FA: write localMatrix[rw]

    Note over FFV: set_local_poses()
    C->>FFV: set_local_poses(translations, orientations)
    FFV->>WK: compose_indexed_fabric_transforms localMatrix[rw]
    WK->>FA: write localMatrix
    FFV->>FFV: "_world_dirty=True"

    Note over FFV: get_world_poses() dirty path
    C->>FFV: get_world_poses()
    FFV->>WK: update_indexed_world_matrix_from_local
    WK->>FA: read localMatrix[ro], parentWorld[ro]
    WK->>FA: write worldMatrix[rw]
    FFV->>FFV: "_world_dirty=False"
    FFV->>WK: decompose_indexed_fabric_transforms
    WK-->>C: (positions, orientations)

_{Reviews (7): Last reviewed commit: "Document isaaclab.utils.warp.fabric kern..." | Re-trigger Greptile}

isaaclab-review-bot

Review: Fabric-aware get/set_local_poses + IK/OSC Integration

Updated review after new commits (5b4f91e → bb778db)

New Changes in This Push

1. Task-space dynamics accessors (Major Feature)

Added body_link_jacobian_w, body_com_jacobian_w, mass_matrix, gravity_compensation_forces to ArticulationData
New shift_jacobian_com_to_origin Warp kernel converts PhysX COM-referenced Jacobians to link-origin
Proper cache invalidation in write paths (write_root_*_to_sim_index, write_joint_*_to_sim_index)
Full DoF axis preserved (6 floating-base columns for floating-base assets)

2. FabricFrameView Major Rework

Three persistent PrimSelection instances (trans_ro, world_rw, local_rw)
Path-based view→fabric index mapping via wp.indexedfabricarray
Proper parent-indexed arrays for world↔local transforms
Per-view dirty tracking (not per-stage) to avoid cross-view corruption
Class-level _static_hierarchy_cache for IFabricHierarchy handles
clear_static_caches() method for test cleanup

3. Extensive Test Coverage

Shape contract tests for Jacobian/mass matrix on fixed & floating base
test_get_jacobians_link_origin_contract: J·q̇ = body velocities ✅
test_get_mass_matrix_symmetry_pd: square, symmetric, positive-definite ✅
test_get_gravity_compensation_forces_static_equilibrium: τ_gc holds arm static ✅
IK tracking accuracy test with DifferentialIKController ✅
OSC tracking accuracy tests with gravity compensation ✅
Multi-view per-view dirty isolation test ✅

4. Newton Dependency Update

newton @ git+...@v1.2.0rc2 → newton==1.2.0rc3 (packaged release)

5. Dexsuite Config Updates

Switched from primitive shapes (CuboidCfg, SphereCfg) to mesh-based (MeshCuboidCfg, MeshSphereCfg)
Consolidated physics presets into base PhysicsCfg class
Removed PhysX hardcoding from IK/OSC env configs (now inherit parent preset)

Code Quality Notes

✅ COM→origin Jacobian shift - The kernel correctly applies v_origin = v_com - ω × (R · com_offset_b) per DoF column

✅ Cache invalidation - All write paths properly invalidate _body_com_jacobian_w, _mass_matrix, _gravity_compensation_forces

✅ ProxyArray pattern - Lazy init pattern matches existing body_link_pose_w, joint_pos etc.

✅ Floating-base DoF layout - Matches industry convention (Pinocchio, Drake, MuJoCo): 6 base-DoF columns prepended

✅ FabricFrameView - Clean separation of indexed arrays per selection, proper sync kernels

Previous Fixes (still valid)

Parent world scale via Gf.Transform.GetScale() ✅
wp.select() for branchless GPU execution ✅
test_initial_seed_with_scaled_parent regression test ✅

Summary

This is a substantial feature addition that:

Enables proper IK/OSC control on PhysX backend with correct Jacobian reference frames
Fixes latent correctness bug where COM-referenced Jacobians were used with link-frame poses
Adds comprehensive test coverage for cross-backend parity
Cleans up physics preset inheritance

Approval recommended.

Update (e104e0e): Refactored if/else branching to wp.where() in Warp kernels (compose_fabric_transformation_matrix_from_warp_arrays, compose_indexed_fabric_transforms). This is a clean branchless pattern for GPU execution — good style improvement. No new concerns.

greptile-apps · 2026-05-11T14:50:53Z

Want your agent to iterate on Greptile's feedback? Try greploops.

greptile-apps · 2026-05-12T16:14:28Z

+            parent_unit_scale = wp.array(
+                [[1.0, 1.0, 1.0]] * len(unique_parent_paths),
+                dtype=wp.float32,
+                device=self._device,
+            )


Parent world scale silently forced to (1,1,1) during initial seed

tf.Orthonormalize() strips scale from the USD local-to-world transform, and the subsequent compose call uses a hard-coded parent_unit_scale = [[1.0, 1.0, 1.0]]. Any parent prim with a non-unit world scale will have the wrong matrix in Fabric from the start. Both sync kernels (update_indexed_local_matrix_from_world and update_indexed_world_matrix_from_local) read this parent matrix for every world↔local recomputation, so the error propagates into every child world and local pose after that.

Suggested change

parent_unit_scale = wp.array(

[[1.0, 1.0, 1.0]] * len(unique_parent_paths),

dtype=wp.float32,

device=self._device,

)

parent_scale_rows: list[list[float]] = []

for path in unique_parent_paths:

prim = usd_stage.GetPrimAtPath(path)

full_tf = UsdGeom.XformCache(Usd.TimeCode.Default()).GetLocalToWorldTransform(prim)

s = full_tf.ExtractScaleFactors()

parent_scale_rows.append([float(s[0]), float(s[1]), float(s[2])])

parent_unit_scale = wp.array(

parent_scale_rows,

dtype=wp.float32,

device=self._device,

)

greptile-apps · 2026-05-12T16:14:29Z

+        # Compose into child localMatrix.
+        wp.launch(
+            kernel=fabric_utils.compose_indexed_fabric_transforms,
+            dim=self.count,
+            inputs=[
+                self._local_ifa_rw,
+                _to_float32_2d(local_pos_ta.warp),
+                _to_float32_2d(local_ori_ta.warp),
+                self._fabric_empty_2d_array_sentinel,
+                False,
+                False,
+                False,
+                self._view_indices,
+            ],
+            device=self._device,
+        )


Child local-matrix scale not seeded from USD on freshly authored stages

When SetLocalXformFromUsd() is a no-op (freshly authored stage, not yet rendered), the child's localMatrix in Fabric starts at identity with scale (1,1,1). The compose call passes _fabric_empty_2d_array_sentinel for the scale slot, leaving the existing (identity) scale in the matrix. If a child prim has non-unit local scale, the first set_local_poses → get_world_poses round-trip calls update_indexed_world_matrix_from_local with the wrong local scale, producing an incorrect world-space scale. The fix is to pass scales_wp (already fetched above from _usd_view.get_scales()).

Suggested change

# Compose into child localMatrix.

wp.launch(

kernel=fabric_utils.compose_indexed_fabric_transforms,

dim=self.count,

inputs=[

self._local_ifa_rw,

_to_float32_2d(local_pos_ta.warp),

_to_float32_2d(local_ori_ta.warp),

self._fabric_empty_2d_array_sentinel,

False,

False,

False,

self._view_indices,

],

device=self._device,

)

# Compose into child localMatrix (include scale so that

# _sync_world_from_local_if_dirty produces the right world-space scale).

wp.launch(

kernel=fabric_utils.compose_indexed_fabric_transforms,

dim=self.count,

inputs=[

self._local_ifa_rw,

_to_float32_2d(local_pos_ta.warp),

_to_float32_2d(local_ori_ta.warp),

_to_float32_2d(scales_wp),

False,

False,

False,

self._view_indices,

],

device=self._device,

)

Replaces the earlier Python-based parent-loop implementation (which was correct but ~3× slower than USD for local poses) with a fully GPU-side Fabric path that follows the bareya/pbarejko/camera-update prototype: * Three persistent ``PrimSelection`` instances differing only in per-attribute access mode — one each for {trans_ro, world_rw, local_rw}. * Path-based view → fabric index mapping computed once from ``selection.GetPaths()`` and stored as a Warp ``int32`` array. No custom prim attributes are written to the stage. * All transform reads and writes go through ``wp.indexedfabricarray``, so the kernels just dereference ``ifa[view_index]`` instead of taking a separate mapping argument. * Stage-level ``IFabricHierarchy`` cache and dirty-stages set so multiple ``FabricFrameView`` instances on the same stage share state. World ↔ local consistency is preserved through Warp kernels that run on the affected write paths: * ``set_local_poses`` writes ``omni:fabric:localMatrix`` directly via the compose kernel, then a second kernel recomputes child worldMatrix from ``parent_world * child_local`` so the next ``get_world_poses`` read is consistent. ``IFabricHierarchy.update_world_xforms()`` is *not* used for this — in practice it re-reads USD's authored xformOps and would overwrite the matrices we just authored. * ``set_world_poses`` mirrors the above, recomputing ``child_local = inv(parent_world) * child_world`` after the write. Two new public Warp kernels in ``isaaclab.utils.warp.fabric``: * ``decompose_indexed_fabric_transforms`` / ``compose_indexed_fabric_transforms`` — indexed-array variants of the existing decompose/compose kernels. * ``update_indexed_local_matrix_from_world`` / ``update_indexed_world_matrix_from_local`` — propagate one direction using a parent indexed fabric array. Implemented directly in storage convention (``local = world * inv(parent)``, ``world = local * parent``) — the four transposes the math-convention form would imply cancel out under ``(A·B)^T = B^T·A^T``. Benchmark (1024 prims, 50 iterations, RTX A6000): Operation USD (ms) Fabric (ms) Speedup Get World Poses 12.33 0.044 282× Set World Poses 27.98 0.117 240× Interleaved Set→Get 41.34 0.160 258× Get Local Poses 6.04 0.037 162× Set Local Poses 8.54 0.053 162× Local-pose ops went from ~3× slower than USD (in the earlier torch-based parent-loop implementation) to ~160× faster, with the new Fabric-side localMatrix authoring keeping ``test_set_world_updates_local`` passing without an xfail override. Tests: 41 passed in the Fabric backend's contract + new coverage for the Fabric-native ``set_local_poses``, ``get_scales``, and topology rebuild paths. Drops the no-longer-needed ``cd571d482`` Python-loop attempt.

Three review fixes on the indexedfabricarray refactor: * set_scales wrote the new scale into worldMatrix but never refreshed localMatrix, so a subsequent get_local_poses returned the stale scale. Call _sync_local_from_world after the world write, matching set_world_poses. * The view->fabric mapping was stored in a single shared _fabric_indices field that each accessor overwrote from its own selection's GetPaths() ordering. Selections do not guarantee a shared path ordering, so this was brittle and hard to reason about. Cache the mapping per selection (_trans_ro_fabric_indices, _world_rw_fabric_indices, _local_rw_fabric_indices) and pass it explicitly to _build_indexed_array. The three trans_ro accessors now share a _rebuild_trans_ro_arrays helper. * Update two test comments that referenced the removed _fabric_usd_sync_done attribute to point at the lazy _initialize_fabric() call instead.

The chained set_world_poses -> get_world_poses round-trip in test_fabric_rebuild_after_topology_change goes through Warp's float32 SRT compose/decompose, which accumulates a few ULP of drift. At the test's position magnitudes (~4-6), one float32 ULP is ~4.77e-7, so the prior atol of 1e-7 demanded sub-ULP agreement and was sensitive to GPU/codegen variation -- it passed locally on the A6000 but flaked in CI. 1e-5 corresponds to roughly 20 ULP at those magnitudes: tight enough to catch any real bug (a wrong index or stale read would be at least ~1e-3 off given the test setup) and consistent with the shared contract harness in frame_view_contract_utils.py, which already documents and uses ATOL = 1e-5 for compose/decompose-through-float32 checks.

The previous _static_dirty_stages set was keyed by stage_id, but _sync_world_from_local_if_dirty only recomputes worldMatrix for the *calling view's* children before clearing the flag. With two views on the same stage, view B's world-read would clear the flag set by view A's set_local_poses, leaving A's worlds silently stale. Replace the class-level set with a per-instance bool. Each view now tracks its own dirty state, which matches the actual scope of the recompute kernel and removes a mutable ClassVar. Also: - Raise a clearer error when _compute_parent_fabric_indices is asked to look up the parent of a root-level prim (rsplit produces ""), instead of bubbling up the generic "not found in selection" message. - Document on the remaining _static_hierarchy_cache that it is not thread-safe by design (Isaac Lab's loop is single-threaded; adding a lock would negate the per-stage caching benefit). - Update the module docstring to reflect the per-view dirty model and drop the stale reference to IFabricHierarchy.update_world_xforms.

Three review fixes plus the missing coverage for the transpose storage convention. * Hierarchy cache eviction: cache key is now (stage_id, fabric_id) so a recycled stage_id paired with a new Fabric attachment never returns a stale handle. Added FabricFrameView.clear_static_caches() classmethod for explicit teardown and wired it into the test fixture so cached handles do not accumulate across the suite. * PrepareForReuse no longer fires twice per sync. Both _sync_local_from_world and _sync_world_from_local_if_dirty refresh trans_sel_ro exactly once, then read _world_ifa_ro / _local_ifa_ro / _parent_world_ifa_ro directly from the fields instead of going through the accessors. * Class docstring rewritten to describe the actual PrepareForReuse policy (every accessor calls it; idempotent and cheap in the steady state). The prior wording claimed reads avoided PrepareForReuse, which has not been true since the indexedfabricarray rewrite. * New regression tests for the transpose storage convention. The standard fixture parents are translation-only, so the rotation block is identity and equal to its transpose - which means a wrong transpose convention would still pass every existing test. Two new tests place a parent rotated 90 degrees around Z and verify the world<->local round-trip: - test_set_local_then_get_world_with_rotated_parent exercises update_indexed_world_matrix_from_local - test_set_world_then_get_local_with_rotated_parent exercises update_indexed_local_matrix_from_world Confirmed locally that flipping the multiply order in either kernel makes the matching test fail. 45 tests pass on cpu and cuda:0.

FabricFrameView is referenced by fully-qualified name in the migration guide and in this PR's changelog fragment, but no RST file documented the module - so the Sphinx :class: and :meth: cross-refs were not resolvable. Add a thin automodule page mirroring the sibling pages under docs/source/api/lab_physx/ and register it in the API index toctree. This also picks up the new clear_static_caches() classmethod automatically via :members:.

The submodule was not surfaced anywhere in the Sphinx tree, so :func: cross-references to its kernels (added in the changelog fragment for this branch and used by FabricFrameView throughout) did not resolve. Add a Warp Fabric kernels subsection to isaaclab.utils.rst that automodule's the submodule, and add __all__ to fabric.py so the generated page lists only the eight public kernels - the type aliases (FabricArrayMat44d, ArrayUInt32, ...) and the re-imported `wp` / `TYPE_CHECKING` / `Any` symbols stay out of the rendered docs. The page covers both the pre-existing kernels (compose/decompose_fabric_transformation_matrix_*_warp_arrays, set_view_to_fabric_array, arange_k) and the four kernels added on this branch.

_sync_fabric_from_usd_initial had two scale-related bugs in the USD->Fabric seed path that produced silently wrong matrices whenever a parent or child had a non-unit scale. Both kernels that recompute world<->local consistency read those seeded matrices, so the error propagated. * Parent worldMatrix was composed with a hardcoded (1, 1, 1) scale. Orthonormalize() strips scale from the local-to-world transform, so we now extract the scale via Gf.Transform.GetScale() *before* orthonormalizing and pass it through to the compose kernel. * Child localMatrix was composed with the empty-array sentinel for the scale slot, leaving the kernel-side scale at the identity default. We now pass the locally-authored scale (already fetched via _usd_view.get_scales()) so the matrix carries the right scale. * Child worldMatrix is still composed from get_world_poses() position and orientation plus the child's local scale, which is wrong when a parent has non-unit world scale. Instead of fixing the seed by hand (would require per-child world-scale lookups), mark the view dirty at the end of the seed. The very next world read fires _sync_world_from_local_if_dirty, which computes child_world = parent_world * child_local on the GPU - and with both matrices now correctly scaled, the multiply produces the right world-space scale automatically. Added test_initial_seed_with_scaled_parent regression test: parent world scale (2, 1, 1), child local scale (3, 1, 1). Locally verified the test fails when either fix is reverted in isolation.

While adding a regression test for the per-view world-dirty flag, I discovered the IFabricHierarchy cache silently misses on every view because ``Stage.GetFabricId()`` returns a fresh ``FabricId`` wrapper on every call, with no value equality between wrappers for the same underlying Fabric. The cache stored (stage_id, wrapper) tuples, so two views on the same stage produced two distinct cache keys and re-fetched the hierarchy on each init. The bug was harmless in practice -- USDRT's ``get_fabric_hierarchy`` itself returns a process-wide singleton per Fabric stage, so both views happened to end up with the same handle anyway -- but the cache wasn't doing the work it was advertised to do. Fix: key the cache on ``(stage_id, fabric_id.id)`` where ``.id`` is the stable ``int`` underneath the wrapper. The new test exercises the multi-view-per-stage scenario: * two FabricFrameView instances on disjoint child prims under different parent sub-trees of one stage * writes on view A must not dirty view B * world reads on view B must not clear view A's dirty flag * both views share one cached IFabricHierarchy and the cache has exactly one entry after both inits (this assertion is what surfaced the FabricId-wrapper bug) * symmetric pass: writes on B must not affect A's post-read state Verified locally that the test fails with both the wrapper-keyed cache (cache size > 1) and with a synthetic stage-shared dirty flag (cross-view stomp). Module-level coverage of fabric_frame_view.py with this test added: 85% line / 78% branch. Remaining uncovered code is the USD-fallback delegations (Fabric disabled), defensive RuntimeError raises, and topology-rebuild branches inside the accessors. 49 tests pass on cpu and cuda:0.

… kernels Replace if/else broadcast branching with wp.where() for branchless predicated selection. wp.select was deprecated in Warp 1.7 and removed in 1.10; wp.where has the more intuitive (cond, true, false) order.

pv-nvidia requested review from kellyguo11, ooctipus and pascal-roth as code owners May 6, 2026 12:41

github-actions Bot added the isaac-lab Related to Isaac Lab team label May 6, 2026

pv-nvidia self-assigned this May 6, 2026

pv-nvidia marked this pull request as draft May 6, 2026 12:41

greptile-apps Bot reviewed May 6, 2026

View reviewed changes

Comment thread source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py

Comment thread source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py

Comment thread source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py

pv-nvidia force-pushed the pv/5381-rebased branch from 7d135d0 to 71a38d2 Compare May 7, 2026 11:37

greptile-apps Bot reviewed May 7, 2026

View reviewed changes

Comment thread source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py Outdated

isaaclab-review-bot Bot reviewed May 11, 2026

View reviewed changes

pv-nvidia marked this pull request as ready for review May 11, 2026 14:27

greptile-apps Bot reviewed May 11, 2026

View reviewed changes

Comment thread source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py

pv-nvidia added the enhancement New feature or request label May 12, 2026

greptile-apps Bot reviewed May 12, 2026

View reviewed changes

Comment thread source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py Outdated

pv-nvidia changed the title ~~perf: Implement Fabric-aware get/set_local_poses via indexedfabricarray~~ perf: Implement Fabric-aware get/set_local_poses on FabricFrameView May 12, 2026

pv-nvidia added the bug Something isn't working label May 12, 2026

pv-nvidia force-pushed the pv/5381-rebased branch from bab8ca0 to 5a04061 Compare May 12, 2026 15:10

greptile-apps Bot reviewed May 12, 2026

View reviewed changes

Comment thread source/isaaclab_physx/isaaclab_physx/sim/views/fabric_frame_view.py

pv-nvidia requested review from Mayankm96 and jtigue-bdai as code owners May 12, 2026 15:36

github-actions Bot added the documentation Improvements or additions to documentation label May 12, 2026

greptile-apps Bot reviewed May 12, 2026

View reviewed changes

pv-nvidia commented May 13, 2026

View reviewed changes

Comment thread scripts/benchmarks/benchmark_view_comparison.py

pv-nvidia added 5 commits May 13, 2026 21:57

Some cleanup

92d3445

Fixed tests

b7f5158

pv-nvidia added 6 commits May 13, 2026 21:57

pv-nvidia force-pushed the pv/5381-rebased branch from 5b4f91e to bb778db Compare May 13, 2026 19:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: Implement Fabric-aware get/set_local_poses on FabricFrameView#5516

perf: Implement Fabric-aware get/set_local_poses on FabricFrameView#5516
pv-nvidia wants to merge 12 commits into
isaac-sim:developfrom
pv-nvidia:pv/5381-rebased

pv-nvidia commented May 6, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 6, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

isaaclab-review-bot Bot left a comment •

edited

Loading

Uh oh!

greptile-apps Bot commented May 11, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot May 12, 2026

Uh oh!

greptile-apps Bot May 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pv-nvidia commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Implementation summary

Transpose-storage convention

Benchmark results

Tests

Iterations during review

Type of change

Checklist

Uh oh!

greptile-apps Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Review: Fabric-aware get/set_local_poses + IK/OSC Integration

New Changes in This Push

1. Task-space dynamics accessors (Major Feature)

2. FabricFrameView Major Rework

3. Extensive Test Coverage

4. Newton Dependency Update

5. Dexsuite Config Updates

Code Quality Notes

Previous Fixes (still valid)

Summary

Uh oh!

greptile-apps Bot commented May 11, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

pv-nvidia commented May 6, 2026 •

edited

Loading

greptile-apps Bot commented May 6, 2026 •

edited

Loading

isaaclab-review-bot Bot left a comment •

edited

Loading