### What change is being made
Update exporter tutorial. Fix link.
### Why this change is being made
N/A
### Tested
N/A
GitOrigin-RevId: 2d1bd16fb21e4b9a31256c27ad0d1ecf2cb57d52
docs/tutorial/exporter/exporter_tutorial.md
44 additions & 25 deletions
@@ -396,55 +396,74 @@ This traces all registered torch operations — observation computation, the act
 processing — and writes them into a single ONNX graph. The resulting file contains the full
 computation pipeline from raw inputs to commanded outputs, along with the metadata you registered.
 
----
+### Inspecting and understanding the generated ONNX files
+When exporting an environment and its actor, Exploy produces `policy.onnx` (or whatever filename
+is passed to `export_environment_as_onnx()`). That exported file can then be used for evaluation
+(see [Putting It All Together](#putting-it-all-together)) or for deployment in a controller
+(see [Controller Tutorial](../controller/controller_tutorial.md)).
+
+The export contains two execution paths by design: **Default** and **ProcessActions**. The
+**Default** path runs at the policy rate and includes observation computation plus actor inference,
+while the **ProcessActions** path runs at the simulation rate and captures the computational graph
+for action post-processing and command application that occurs at every simulation sub-step. This
+separation preserves the original control timing (policy step vs. simulation sub-step) in the
+exported model.
+
+Additionally, Exploy will produce debug computational graphs. These are only to be used for
+debugging and visual inspection.
+
+- `debug/policy_default.onnx`: The `Default` path of the computational graph.
+- `debug/policy_process_actions.onnx`: The `ProcessActions` path of the computational graph.
+- `debug/policy_optimized.onnx`: The optimized version of the exported ONNX model.
+
+
+> **Note:** `debug/policy_optimized.onnx` is generated when `optimize=True` is passed to
+> {py:class}`SessionWrapper <exploy.exporter.core.session_wrapper.SessionWrapper>`. While fully
+> functional, it requires the same ONNX Runtime version to be used in both the exporter and the
+> controller. This constraint depends on the user's setup.
+> For deployment in a controller, only the `policy.onnx` file should be used.
 
-## Visualizing the Exported ONNX Graph
+
+### Visualizing the Exported ONNX Graph
 
 You can inspect the exported ONNX file using [Netron](https://github.com/lutzroeder/netron), an
 open-source viewer for neural network models. The screenshots below show the computational graphs
-produced by the export steps in this tutorial.
-
-### Overview
+produced by the export steps in this tutorial. The [unit test](https://github.com/bdaiinstitute/exploy/blob/main/python/exploy/exporter/core/tests/test_export_environment.py) runs on three different environments:
+- an environment that computes observations and uses an MLP actor
+- an environment that uses a torch module to compute parts of its observations and uses an MLP actor
+- an environment that uses a torch module to compute parts of its observations and uses an RNN-based actor
 
-The exported ONNX file contains two subgraphs: a **Default** graph that computes observations and
+Each exported ONNX file contains two subgraphs: a **Default** graph that computes observations and
 actions at the policy rate, and a **ProcessActions** graph that maps raw actions to commanded
 outputs at the simulation rate.
 
-
 
-### Default graph (basic environment)
+#### Environment and MLP Actor
 
 This is the full default graph for the basic `Environment` from Step 1. You can see the named
-inputs (`foo`, `bar`, `baz`, `memory.actions.in`) flowing through the observation computation
-and into the actor network, which produces actions and the named outputs (`out`,
-`memory.actions.out`).
-
-
+inputs (`foo`), the group inputs (`bar`, `baz`), and the memory inputs (`memory.actions.in`) flowing
+through the observation computation and into the actor network, which produces actions and the named
+outputs (`out`, `memory.actions.out`). The ProcessActions subgraph traces only `process_actions()`
+and `apply_actions()`. It runs at the simulation time-step rate (i.e., once per decimation sub-step)
+and maps actions to commanded outputs.
 
-### ProcessActions subgraph
+
 
-The ProcessActions subgraph traces only `process_actions()` and `apply_actions()`. It runs at the
-simulation time-step rate (i.e., once per decimation sub-step) and maps actions to commanded
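
The two-rate structure described in the tutorial text of this diff (a **Default** path at the policy rate and a **ProcessActions** path at every simulation sub-step) can be sketched as a plain Python loop. This is only an illustration of the timing, not Exploy's implementation: `run_control_loop`, `toy_default`, `toy_process_actions`, and the `decimation` value of 4 are hypothetical names and numbers chosen for this example, and the two toy functions merely stand in for the exported subgraphs.

```python
from typing import Callable, List, Tuple


def run_control_loop(
    default_path: Callable[[float], float],
    process_actions_path: Callable[[float], float],
    observations: List[float],
    decimation: int,
) -> Tuple[List[float], List[float]]:
    """Run a toy two-rate loop and return (actions, commands)."""
    actions: List[float] = []
    commands: List[float] = []
    for obs in observations:
        # Policy rate: one Default evaluation per policy step.
        action = default_path(obs)
        actions.append(action)
        # Simulation rate: one ProcessActions evaluation per sub-step,
        # reusing the action computed at the last policy step.
        for _ in range(decimation):
            commands.append(process_actions_path(action))
    return actions, commands


# Hypothetical stand-ins for the two exported subgraphs (illustration only).
def toy_default(obs: float) -> float:
    return 2.0 * obs


def toy_process_actions(action: float) -> float:
    return action + 1.0


actions, commands = run_control_loop(
    toy_default, toy_process_actions, [0.5, 1.0], decimation=4
)
print(len(actions), len(commands))
```

With two policy steps and a decimation of 4, the Default stand-in runs twice and the ProcessActions stand-in runs eight times, which is the timing relationship the export is described as preserving.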