Commit 0bd7aec

Updated Camera to use Warp Arrays instead of Torch
1 parent f267f96

49 files changed

Lines changed: 846 additions & 556 deletions

Large commits have some content hidden by default; only a subset of the 49 changed files is shown below.

docs/source/how-to/save_camera_output.rst

Lines changed: 10 additions & 5 deletions
@@ -58,14 +58,19 @@ PyTorch operations which allows faster computation.

 .. code-block:: python

+    import warp as wp
     from isaaclab.utils.math import transform_points, unproject_depth

-    # Pointcloud in world frame
-    points_3d_cam = unproject_depth(
-        camera.data.output["distance_to_image_plane"], camera.data.intrinsic_matrices
-    )
+    # Camera ``data.output`` and pose fields are ``wp.array`` values; lift them to torch
+    # tensors before invoking the torch-based math utilities below.
+    depth = wp.to_torch(camera.data.output["distance_to_image_plane"])
+    intrinsics = wp.to_torch(camera.data.intrinsic_matrices)
+    pos_w = wp.to_torch(camera.data.pos_w)
+    quat_w_ros = wp.to_torch(camera.data.quat_w_ros)

-    points_3d_world = transform_points(points_3d_cam, camera.data.pos_w, camera.data.quat_w_ros)
+    # Pointcloud in world frame
+    points_3d_cam = unproject_depth(depth, intrinsics)
+    points_3d_world = transform_points(points_3d_cam, pos_w, quat_w_ros)

 Alternately, we can use the :meth:`isaaclab.sensors.camera.utils.create_pointcloud_from_depth` function
 to create a point cloud from the depth image and transform it to the world frame.
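For reference, a minimal sketch of that alternate path, assuming a single updated `camera` sensor (index 0). The keyword names mirror the run_usd_camera.py hunk later in this commit; since the commit does not show the helper accepting ``wp.array`` directly, inputs are lifted to torch first:

    import warp as wp

    from isaaclab.sensors.camera.utils import create_pointcloud_from_depth

    # lift wp.array camera fields to torch, then build the world-frame cloud
    points_3d_world = create_pointcloud_from_depth(
        intrinsic_matrix=wp.to_torch(camera.data.intrinsic_matrices)[0],
        depth=wp.to_torch(camera.data.output["distance_to_image_plane"])[0],
        position=wp.to_torch(camera.data.pos_w)[0],
        orientation=wp.to_torch(camera.data.quat_w_ros)[0],
    )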

docs/source/overview/core-concepts/sensors/camera.rst

Lines changed: 20 additions & 14 deletions
@@ -162,8 +162,13 @@ Accessing camera data

 .. code-block:: python

+    import warp as wp
+
     tiled_camera = Camera(cfg.tiled_camera)
-    data = tiled_camera.data.output["rgb"]  # shape: (num_cameras, H, W, 3), torch.uint8
+    # ``data.output`` entries are ``wp.array`` values (e.g. ``wp.uint8``); use
+    # :func:`warp.to_torch` when Torch tensor operations are required.
+    data_wp = tiled_camera.data.output["rgb"]  # shape: (num_cameras, H, W, 3), wp.uint8
+    data = wp.to_torch(data_wp)  # zero-copy torch.uint8 view

 The returned data has shape ``(num_cameras, height, width, num_channels)``, ready to use directly
 as an observation in RL training.
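For RL use, a minimal sketch of turning that buffer into a normalized observation, assuming ``tiled_camera`` from the snippet above:

    import torch
    import warp as wp

    # zero-copy torch.uint8 view, shape (num_cameras, H, W, 3)
    rgb = wp.to_torch(tiled_camera.data.output["rgb"])
    obs = rgb.to(torch.float32) / 255.0  # float32 observation in [0, 1]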
@@ -207,11 +212,12 @@ RGB and RGBA
    :figwidth: 100%
    :alt: A scene captured in RGB

-``rgb`` returns a 3-channel RGB image of type ``torch.uint8``, shape ``(B, H, W, 3)``.
+``rgb`` returns a 3-channel RGB image of type ``wp.uint8``, shape ``(B, H, W, 3)``.

-``rgba`` returns a 4-channel RGBA image of type ``torch.uint8``, shape ``(B, H, W, 4)``.
+``rgba`` returns a 4-channel RGBA image of type ``wp.uint8``, shape ``(B, H, W, 4)``.

-To convert to ``torch.float32``, divide by 255.0.
+Use :func:`warp.to_torch` to obtain a zero-copy ``torch.uint8`` view; divide by ``255.0``
+to convert to ``torch.float32``.

 Depth and Distances
 ~~~~~~~~~~~~~~~~~~~
@@ -222,10 +228,10 @@ Depth and Distances
    :alt: A scene captured as depth

 ``distance_to_camera`` returns a single-channel depth image with distance to the camera optical
-center, shape ``(B, H, W, 1)``, type ``torch.float32``.
+center, shape ``(B, H, W, 1)``, type ``wp.float32``.

 ``distance_to_image_plane`` returns distances of 3D points from the camera plane along the Z-axis,
-shape ``(B, H, W, 1)``, type ``torch.float32``.
+shape ``(B, H, W, 1)``, type ``wp.float32``.

 ``depth`` is an alias for ``distance_to_image_plane``.

@@ -238,14 +244,14 @@ Normals
    :alt: A scene captured with surface normals

 ``normals`` returns local surface normal vectors at each pixel, shape ``(B, H, W, 3)`` containing
-``(x, y, z)``, type ``torch.float32``.
+``(x, y, z)``, type ``wp.float32``.

 Motion Vectors
 ~~~~~~~~~~~~~~

 ``motion_vectors`` returns per-pixel motion vectors in image space between frames.
 Shape ``(B, H, W, 2)``: ``x`` is horizontal motion (positive = left), ``y`` is vertical motion
-(positive = up). Type ``torch.float32``.
+(positive = up). Type ``wp.float32``.

 Semantic Segmentation
 ~~~~~~~~~~~~~~~~~~~~~
@@ -259,8 +265,8 @@ Semantic Segmentation
 An ``info`` dictionary is available via ``tiled_camera.data.info['semantic_segmentation']``.

 - If ``colorize_semantic_segmentation=True``: 4-channel RGBA image, shape ``(B, H, W, 4)``,
-  type ``torch.uint8``. The ``idToLabels`` dict maps color to semantic label.
-- If ``colorize_semantic_segmentation=False``: shape ``(B, H, W, 1)``, type ``torch.int32``,
+  type ``wp.uint8``. The ``idToLabels`` dict maps color to semantic label.
+- If ``colorize_semantic_segmentation=False``: shape ``(B, H, W, 1)``, type ``wp.int32``,
   containing semantic IDs. The ``idToLabels`` dict maps ID to label.

 Instance ID Segmentation
@@ -274,9 +280,9 @@ Instance ID Segmentation
 ``instance_id_segmentation_fast`` outputs per-pixel instance IDs, unique per USD prim path.
 An ``info`` dictionary is available via ``tiled_camera.data.info['instance_id_segmentation_fast']``.

-- If ``colorize_instance_id_segmentation=True``: shape ``(B, H, W, 4)``, type ``torch.uint8``.
+- If ``colorize_instance_id_segmentation=True``: shape ``(B, H, W, 4)``, type ``wp.uint8``.
   The ``idToLabels`` dict maps color to USD prim path.
-- If ``colorize_instance_id_segmentation=False``: shape ``(B, H, W, 1)``, type ``torch.int32``.
+- If ``colorize_instance_id_segmentation=False``: shape ``(B, H, W, 1)``, type ``wp.int32``.
   The ``idToLabels`` dict maps instance ID to USD prim path.

 Instance Segmentation
@@ -292,8 +298,8 @@ to the lowest level with semantic labels (unlike ``instance_id_segmentation_fast``, which
 goes to the leaf prim).
 An ``info`` dictionary is available via ``tiled_camera.data.info['instance_segmentation_fast']``.

-- If ``colorize_instance_segmentation=True``: shape ``(B, H, W, 4)``, type ``torch.uint8``.
-- If ``colorize_instance_segmentation=False``: shape ``(B, H, W, 1)``, type ``torch.int32``.
+- If ``colorize_instance_segmentation=True``: shape ``(B, H, W, 4)``, type ``wp.uint8``.
+- If ``colorize_instance_segmentation=False``: shape ``(B, H, W, 1)``, type ``wp.int32``.

 The ``idToLabels`` dict maps color to USD prim path. The ``idToSemantics`` dict maps color to
 semantic label.
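For reference, a minimal sketch of decoding the non-colorized output with the ``idToLabels`` dict; the exact nesting of ``idToLabels`` inside the ``info`` entry is an assumption based on the docs above:

    import warp as wp

    # with colorize_semantic_segmentation=False: (B, H, W, 1), int32 semantic IDs
    seg = wp.to_torch(tiled_camera.data.output["semantic_segmentation"])
    id_to_labels = tiled_camera.data.info["semantic_segmentation"]["idToLabels"]
    for semantic_id, label in id_to_labels.items():
        mask = seg[..., 0] == int(semantic_id)  # boolean mask of pixels carrying this label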

scripts/benchmarks/benchmark_cameras.py

Lines changed: 10 additions & 8 deletions
@@ -256,6 +256,7 @@
 import numpy as np
 import psutil
 import torch
+import warp as wp

 import isaaclab.sim as sim_utils
 from isaaclab.assets import RigidObject, RigidObjectCfg
@@ -635,7 +636,7 @@ def run_simulator(
     # Set camera world poses
     for camera_list in camera_lists:
         for camera in camera_list:
-            num_cameras = camera.data.intrinsic_matrices.size(0)
+            num_cameras = camera.data.intrinsic_matrices.shape[0]
            positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
            targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
            camera.set_world_poses_from_view(positions, targets)
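The ``.size(0)`` to ``.shape[0]`` change above is needed because ``wp.array`` exposes ``shape`` as a plain tuple and has no torch-style ``size(dim)`` method; a minimal sketch:

    import warp as wp

    a = wp.zeros((4, 3, 3), dtype=wp.float32, device="cpu")
    num_cameras = a.shape[0]  # tuple indexing works on wp.array; a.size(0) would raise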
@@ -675,23 +676,24 @@ def run_simulator(
         # Only update the camera if it hasn't been updated as part of scene_entities.update ...
         camera.update(dt=sim.get_physics_dt())

+        # camera outputs and intrinsics are wp.array; lift to torch for math + collection
+        intrinsics_torch = wp.to_torch(camera.data.intrinsic_matrices)
+
         for data_type in data_types:
             data_label = f"{label}_{cam_idx}_{data_type}"
+            output_torch = wp.to_torch(camera.data.output[data_type])

             if depth_predicate(data_type):  # is a depth image, want to create cloud
-                depth = camera.data.output[data_type]
+                depth = output_torch
                 depth_images[data_label + "_raw"] = depth
                 if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
-                    depth = orthogonalize_perspective_depth(
-                        camera.data.output[data_type], camera.data.intrinsic_matrices
-                    )
+                    depth = orthogonalize_perspective_depth(output_torch, intrinsics_torch)
                     depth_images[data_label + "_undistorted"] = depth

-                pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
+                pointcloud = unproject_depth(depth=depth, intrinsics=intrinsics_torch)
                 clouds[data_label] = pointcloud
             else:  # rgb image, just save it
-                image = camera.data.output[data_type]
-                images[data_label] = image
+                images[data_label] = output_torch

     # End timing for the step
     step_end_time = time.time()

scripts/demos/sensors/cameras.py

Lines changed: 15 additions & 8 deletions
@@ -44,6 +44,7 @@
 import matplotlib.pyplot as plt
 import numpy as np
 import torch
+import warp as wp

 import isaaclab.sim as sim_utils
 from isaaclab.assets import ArticulationCfg, AssetBaseCfg
@@ -239,8 +240,15 @@ def run_simulator(sim: sim_utils.SimulationContext, scene: InteractiveScene):
         # save every 10th image (for visualization purposes only)
         # note: saving images will slow down the simulation
         if count % 10 == 0:
+            # camera outputs are wp.array; lift to torch for image saving / matplotlib
+            camera_rgb = wp.to_torch(scene["camera"].data.output["rgb"])
+            tiled_rgb = wp.to_torch(scene["tiled_camera"].data.output["rgb"])
+            camera_depth = wp.to_torch(scene["camera"].data.output["distance_to_image_plane"])
+            tiled_depth = wp.to_torch(scene["tiled_camera"].data.output["distance_to_image_plane"])
+            raycast_depth = wp.to_torch(scene["raycast_camera"].data.output["distance_to_image_plane"])
+
             # compare generated RGB images across different cameras
-            rgb_images = [scene["camera"].data.output["rgb"][0, ..., :3], scene["tiled_camera"].data.output["rgb"][0]]
+            rgb_images = [camera_rgb[0, ..., :3], tiled_rgb[0]]
             save_images_grid(
                 rgb_images,
                 subtitles=["Camera"],
@@ -250,9 +258,9 @@ def run_simulator(sim: sim_utils.SimulationContext, scene: InteractiveScene):

             # compare generated Depth images across different cameras
             depth_images = [
-                scene["camera"].data.output["distance_to_image_plane"][0],
-                scene["tiled_camera"].data.output["distance_to_image_plane"][0, ..., 0],
-                scene["raycast_camera"].data.output["distance_to_image_plane"][0],
+                camera_depth[0],
+                tiled_depth[0, ..., 0],
+                raycast_depth[0],
             ]
             save_images_grid(
                 depth_images,
@@ -263,16 +271,15 @@ def run_simulator(sim: sim_utils.SimulationContext, scene: InteractiveScene):
             )

             # save all tiled RGB images
-            tiled_images = scene["tiled_camera"].data.output["rgb"]
             save_images_grid(
-                tiled_images,
-                subtitles=[f"Cam{i}" for i in range(tiled_images.shape[0])],
+                tiled_rgb,
+                subtitles=[f"Cam{i}" for i in range(tiled_rgb.shape[0])],
                 title="Tiled RGB Image",
                 filename=os.path.join(output_dir, "tiled_rgb", f"{count:04d}.jpg"),
             )

             # save all camera RGB images
-            cam_images = scene["camera"].data.output["rgb"][..., :3]
+            cam_images = camera_rgb[..., :3]
             save_images_grid(
                 cam_images,
                 subtitles=[f"Cam{i}" for i in range(cam_images.shape[0])],

scripts/tutorials/04_sensors/run_ray_caster_camera.py

Lines changed: 7 additions & 6 deletions
@@ -38,6 +38,7 @@
 import os

 import torch
+import warp as wp

 import omni.replicator.core as rep

@@ -144,17 +145,17 @@ def run_simulator(sim: sim_utils.SimulationContext, scene_entities: dict):
         rep_output["trigger_outputs"] = {"on_time": camera.frame[camera_index]}
         rep_writer.write(rep_output)

-        # Pointcloud in world frame
-        points_3d_cam = unproject_depth(
-            camera.data.output["distance_to_image_plane"], camera.data.intrinsic_matrices
-        )
+        # Pointcloud in world frame; convert wp.array camera outputs to torch for math ops
+        depth_torch = wp.to_torch(camera.data.output["distance_to_image_plane"])
+        intrinsics_torch = wp.to_torch(camera.data.intrinsic_matrices)
+        points_3d_cam = unproject_depth(depth_torch, intrinsics_torch)

         # Check methods are valid
         im_height, im_width = camera.image_shape
         # -- project points to (u, v, d)
-        reproj_points = project_points(points_3d_cam, camera.data.intrinsic_matrices)
+        reproj_points = project_points(points_3d_cam, intrinsics_torch)
         reproj_depths = reproj_points[..., -1].view(-1, im_width, im_height).transpose_(1, 2)
-        sim_depths = camera.data.output["distance_to_image_plane"].squeeze(-1)
+        sim_depths = depth_torch.squeeze(-1)
         torch.testing.assert_close(reproj_depths, sim_depths)

scripts/tutorials/04_sensors/run_usd_camera.py

Lines changed: 6 additions & 5 deletions
@@ -65,6 +65,7 @@

 import numpy as np
 import torch
+import warp as wp
 from isaaclab_physx.renderers import IsaacRtxRendererCfg

 import omni.replicator.core as rep
@@ -254,12 +255,12 @@ def run_simulator(sim: sim_utils.SimulationContext, scene_entities: dict):
             and args_cli.draw
             and "distance_to_image_plane" in camera.data.output.keys()
         ):
-            # Derive pointcloud from camera at camera_index
+            # Derive pointcloud from camera at camera_index; lift wp.array fields to torch
             pointcloud = create_pointcloud_from_depth(
-                intrinsic_matrix=camera.data.intrinsic_matrices[camera_index],
-                depth=camera.data.output["distance_to_image_plane"][camera_index],
-                position=camera.data.pos_w[camera_index],
-                orientation=camera.data.quat_w_ros[camera_index],
+                intrinsic_matrix=wp.to_torch(camera.data.intrinsic_matrices)[camera_index],
+                depth=wp.to_torch(camera.data.output["distance_to_image_plane"])[camera_index],
+                position=wp.to_torch(camera.data.pos_w)[camera_index],
+                orientation=wp.to_torch(camera.data.quat_w_ros)[camera_index],
                 device=sim.device,
             )

source/isaaclab/config/extension.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 [package]

 # Note: Semantic Versioning is used: https://semver.org/
-version = "4.6.22"
+version = "4.6.24"

 # Description
 title = "Isaac Lab framework for Robot Learning"

source/isaaclab/docs/CHANGELOG.rst

Lines changed: 61 additions & 0 deletions
@@ -1,6 +1,67 @@
 Changelog
 ---------

+4.6.24 (2026-04-30)
+~~~~~~~~~~~~~~~~~~~
+
+Fixed
+^^^^^
+
+* Fixed :func:`~isaaclab.envs.mdp.observations.image` (and the equivalent
+  ``image()`` helper in the Franka stack ``stack_ik_rel_blueprint`` config) so
+  that Torch tensor operations are applied via :func:`warp.to_torch` rather than
+  invoked directly on the new ``wp.array`` camera outputs.
+* Fixed downstream consumers (camera tutorials, ``demos/sensors/cameras.py``,
+  ``benchmarks/benchmark_cameras.py``, dexsuite ``vision_camera`` observation,
+  visualizer integration test, ``save_camera_output`` how-to and the camera
+  overview docs) to lift ``wp.array`` camera fields to torch tensors via
+  :func:`warp.to_torch` before performing Torch operations.
+* Fixed the camera and ray-caster camera test suites to use ``wp.array`` dtypes
+  (``wp.uint8``, ``wp.float32``, ``wp.int32``) and :func:`warp.to_torch` views
+  in assertions on ``CameraData.output``, ``CameraData.intrinsic_matrices``,
+  ``CameraData.pos_w``/``quat_w_*`` and ``Camera.frame``.
+* Fixed :class:`~isaaclab.renderers.NewtonWarpRenderer` to populate the ``rgb``
+  output buffer when both ``rgb`` and ``rgba`` are requested, restoring the
+  legacy "rgb mirrors rgba" behavior that broke when ``rgb`` and ``rgba``
+  became independent ``wp.array`` allocations.
+
+Changed
+^^^^^^^
+
+* Tightened :class:`~isaaclab.renderers.RenderBufferSpec` ``dtype`` annotation
+  from ``Any`` to ``type`` to document that all renderers must publish Warp
+  scalar dtype classes (e.g. ``warp.uint8``).
+* Removed the transitional ``torch.dtype → wp.dtype`` shim in
+  :meth:`~isaaclab.sensors.camera.CameraData.allocate` now that all in-tree
+  renderers publish ``wp`` dtypes via :class:`~isaaclab.renderers.RenderBufferSpec`.
+* Documented the transitional Torch input fallback on
+  :func:`~isaaclab.utils.math.convert_camera_frame_orientation_convention` and
+  consolidated the redundant ``wp ↔ torch`` round-trips in
+  :meth:`~isaaclab.sensors.camera.Camera.set_world_poses` and
+  :meth:`~isaaclab.sensors.camera.Camera.set_world_poses_from_view`.
+
+
+4.6.23 (2026-04-30)
+~~~~~~~~~~~~~~~~~~~
+
+Changed
+^^^^^^^
+
+* Changed :class:`~isaaclab.sensors.camera.CameraData` array fields and
+  :attr:`~isaaclab.sensors.camera.CameraData.output` buffers to expose
+  ``wp.array`` values instead of :class:`torch.Tensor` values. Use
+  :func:`warp.to_torch` where Torch tensor operations are required.
+* Changed :class:`~isaaclab.sensors.camera.Camera` pose, intrinsic, and frame
+  array APIs to accept or return ``wp.array`` values instead of
+  :class:`torch.Tensor` values. Existing Torch inputs are still accepted during
+  the transition; prefer ``wp.array`` at public call sites.
+* Changed :class:`~isaaclab.renderers.BaseRenderer` output and camera-update
+  APIs to exchange ``wp.array`` buffers with camera sensors.
+* Changed :func:`~isaaclab.utils.math.convert_camera_frame_orientation_convention`
+  to accept and return ``wp.array`` quaternion arrays. Use :func:`warp.to_torch`
+  where Torch tensor operations are required.
+
+
 4.6.22 (2026-04-27)
 ~~~~~~~~~~~~~~~~~~~

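Since the changelog leans on the zero-copy behavior of :func:`warp.to_torch`, a minimal sketch of the sharing semantics; it assumes only that the view is zero-copy as documented above:

    import warp as wp

    a = wp.zeros((2, 2), dtype=wp.float32, device="cpu")
    t = wp.to_torch(a)  # zero-copy view: shares memory with the warp buffer
    t += 1.0            # mutates `a` as well
    safe = wp.to_torch(a).clone()  # copy first when a camera buffer must be modified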
source/isaaclab/isaaclab/envs/mdp/observations.py

Lines changed: 4 additions & 3 deletions
@@ -14,6 +14,7 @@
 from typing import TYPE_CHECKING

 import torch
+import warp as wp

 import isaaclab.utils.math as math_utils
 from isaaclab.managers import SceneEntityCfg
@@ -402,12 +403,12 @@ def image(
     # extract the used quantities (to enable type-hinting)
     sensor: Camera | RayCasterCamera = env.scene.sensors[sensor_cfg.name]

-    # obtain the input image
-    images = sensor.data.output[data_type]
+    # obtain the input image; camera outputs are wp.array, lift to torch for tensor ops
+    images = wp.to_torch(sensor.data.output[data_type])

     # depth image conversion
     if (data_type == "distance_to_camera") and convert_perspective_to_orthogonal:
-        images = math_utils.orthogonalize_perspective_depth(images, sensor.data.intrinsic_matrices)
+        images = math_utils.orthogonalize_perspective_depth(images, wp.to_torch(sensor.data.intrinsic_matrices))

     # rgb/depth/normals image normalization
     if normalize: