| 1 | +# PlantSimEngine Agent And Developer Guide |
| 2 | + |
| 3 | +This file is the maintainer-facing summary of how PlantSimEngine works internally. |
| 4 | +It is meant for humans and coding agents making changes to the package. |
| 5 | + |
| 6 | +PlantSimEngine is a Julia engine for composing process models on either: |
| 7 | + |
| 8 | +- a single shared status (`ModelMapping{SingleScale}` / legacy `ModelList`) |
| 9 | +- a multiscale MTG scene (`GraphSimulation`) |
| 10 | + |
| 11 | +The package is built around four ideas: |
| 12 | + |
| 13 | +1. Models declare `inputs_`, `outputs_`, and optionally `dep`. |
| 14 | +2. The engine compiles a dependency graph from those declarations. |
| 15 | +3. Runtime state is reference-based (`Status`, `RefVector`), so coupling is often aliasing, not copying. |
| 16 | +4. Multiscale and multirate configuration can change where an input comes from, how it is transported, and when it is sampled. |
| 17 | + |
| 18 | +## What The Package Supports |
| 19 | + |
| 20 | +- Single-scale process composition with automatic soft-dependency inference. |
| 21 | +- Hard dependencies declared explicitly and called manually from model code. |
| 22 | +- MTG-based multiscale simulations with cross-scale variable mappings. |
| 23 | +- Cross-scale scalar sharing through shared `Ref`s. |
| 24 | +- Cross-scale multi-node sharing through `RefVector`s. |
| 25 | +- Cross-scale writes, where a variable computed at one scale is materialized as an input at another scale. |
| 26 | +- Same-scale variable aliasing and renaming. |
| 27 | +- Cycle breaking through `PreviousTimeStep`. |
| 28 | +- Multi-rate execution through `ModelSpec`, `ClockSpec`, and temporal policies. |
| 29 | +- Explicit or inferred `InputBindings` between producers and consumers. |
| 30 | +- Meteo resampling/aggregation per model in multi-rate MTG runs. |
| 31 | +- Output routing (`:canonical` vs `:stream_only`) and online output export (`OutputRequest`). |
| 32 | +- Parallel single-scale execution when model traits allow it. |
| 33 | + |
| 34 | +## Core Runtime Objects |
| 35 | + |
| 36 | +### Processes and models |
| 37 | + |
| 38 | +- All models subtype `AbstractModel`. |
| 39 | +- `@process` creates an abstract process type such as `AbstractGrowthModel`. |
| 40 | +- Process identity comes from the abstract process type, not the concrete model name. |
| 41 | +- The model execution contract is: |
| 42 | + |
| 43 | +```julia |
| 44 | +PlantSimEngine.run!(model, models, status, meteo, constants, extra) |
| 45 | +``` |
| 46 | + |
| 47 | +- `inputs_(model)` and `outputs_(model)` are the authoritative declarations. |
| 48 | +- `variables(model)` is `merge(inputs_(model), outputs_(model))`. |
| 49 | +- Do not rely on a variable being both an input and an output under the same name: because `outputs_(model)` is the second argument to `merge`, its entry replaces the `inputs_(model)` entry of the same name. |
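
The `merge` caveat can be reproduced with plain `NamedTuple`s. This is a minimal sketch that assumes the declarations are `NamedTuple`s of default values; the variable names are hypothetical and no PlantSimEngine code is involved.

```julia
# Hypothetical declarations standing in for inputs_(model) and outputs_(model).
inputs = (lai = 1.0, biomass = 0.0)
outputs = (biomass = 10.0, growth = 0.0)

# Base.merge keeps the value from the right-most NamedTuple on a name clash,
# so the input default for `biomass` is silently replaced.
vars = merge(inputs, outputs)

println(vars)
```

The clash is silent, which is why a variable should not be declared on both sides under one name.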
| 50 | + |
| 51 | +### Status |
| 52 | + |
| 53 | +- `Status` is a wrapper around a `NamedTuple` of `Ref`s. |
| 54 | +- Reading a field dereferences it. Writing a field mutates the underlying `Ref`. |
| 55 | +- This aliasing behavior is intentional and is the basis of most coupling. |
| 56 | +- In single-scale runs, vector-valued user inputs are flattened to one timestep value and updated per timestep with `set_variables_at_timestep!`. |
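
The aliasing behavior described above can be illustrated with a toy container. This is a sketch under stated assumptions: `ToyStatus` is a hypothetical stand-in written for this example, not the package's `Status` type, and the field names are invented.

```julia
# Hypothetical stand-in for a reference-based status (not the package's Status).
struct ToyStatus{NT<:NamedTuple}
    refs::NT
end

# Reading a field dereferences the Ref stored under that name.
Base.getproperty(s::ToyStatus, name::Symbol) = getfield(s, :refs)[name][]

# Writing a field mutates the underlying Ref in place.
function Base.setproperty!(s::ToyStatus, name::Symbol, value)
    getfield(s, :refs)[name][] = value
    return value
end

# Two statuses that share one Ref are coupled by aliasing, not by copying.
shared = Ref(1.0)
producer = ToyStatus((biomass = shared,))
consumer = ToyStatus((biomass = shared, lai = Ref(0.5)))

producer.biomass = 2.5
println(consumer.biomass)  # the consumer sees the producer's write
```

No copy step runs between the write and the read, which is the property any change to status handling has to preserve.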
| 57 | + |
| 58 | +### RefVector |
| 59 | + |
| 60 | +- `RefVector` is an `AbstractVector` of `Base.RefValue`s. |
| 61 | +- It is used when one model input must see a vector of references coming from many statuses. |
| 62 | +- Reading a `RefVector` dereferences each underlying status cell. |
| 63 | +- Writing into a `RefVector` mutates the source statuses. |
| 64 | +- `RefVector` order follows MTG traversal order during initialization, not a semantic plant order. |
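
A toy vector of references shows the read and write behavior listed above. `ToyRefVector` is a hypothetical type written for this sketch, not the package's `RefVector`; it only assumes the standard `AbstractVector` interface from Julia Base.

```julia
# Hypothetical stand-in for a vector of references (not the package's RefVector).
struct ToyRefVector{T} <: AbstractVector{T}
    refs::Vector{Base.RefValue{T}}
end

Base.size(v::ToyRefVector) = size(v.refs)
Base.IndexStyle(::Type{<:ToyRefVector}) = IndexLinear()
# Reading an element dereferences the underlying cell.
Base.getindex(v::ToyRefVector, i::Int) = v.refs[i][]
# Writing an element mutates the source cell.
Base.setindex!(v::ToyRefVector, x, i::Int) = (v.refs[i][] = x)

# Two source cells, as if they lived in two leaf statuses.
leaf1 = Ref(1.0)
leaf2 = Ref(2.0)
areas = ToyRefVector([leaf1, leaf2])

total_before = sum(areas)  # reads through the references
areas[2] = 5.0             # writes back into leaf2
```

After the write, `leaf2[]` holds the new value, so a consumer that mutates its vector input is mutating the source statuses.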
| 65 | + |
| 66 | +### Mapping wrappers |
| 67 | + |
| 68 | +- `MultiScaleModel` wraps one model plus a multiscale mapping declaration. |
| 69 | +- `ModelSpec` wraps one model plus scenario-level runtime configuration: |
| 70 | + `multiscale`, `timestep`, `input_bindings`, `meteo_bindings`, `meteo_window`, `output_routing`, and `scope`. |
| 71 | +- `ModelMapping` is the normalized mapping container used by current entry points. |
| 72 | +- Legacy `ModelList` still exists, but it is compatibility plumbing and should not be treated as the main abstraction for new work. |
| 73 | + |
| 74 | +### Simulation wrappers |
| 75 | + |
| 76 | +- `DependencyGraph` holds root dependency nodes plus unresolved dependencies. |
| 77 | +- `GraphSimulation` holds the MTG, statuses, status templates, reverse mappings, dependency graph, models, model specs, outputs, and temporal state. |
| 78 | + |
| 79 | +## Dependency Graph Under The Hood |
| 80 | + |
| 81 | +### Hard dependencies |
| 82 | + |
| 83 | +- Hard dependencies are declared with `dep(::ModelType)`. |
| 84 | +- A hard dependency means: "this model directly calls another model from inside its own `run!` implementation." |
| 85 | +- Hard dependencies are represented by `HardDependencyNode`. |
| 86 | +- They are executed manually by the parent model. The runtime does not automatically recurse into hard dependencies. |
| 87 | +- Hard dependencies can be same-scale or explicitly multiscale. |
| 88 | + |
| 89 | +Important nuance: |
| 90 | + |
| 91 | +- A hard dependency does not become an independent soft-dependency node under the parent. |
| 92 | +- But it still matters for graph construction, because the graph compiler aggregates the root model's hard-dependency subtree when computing that root's effective inputs and outputs. |
| 93 | +- In multiscale graph building, if another model depends on a process that exists only as a nested hard dependency, the code resolves that dependency back to the master soft node that owns that hard subtree. |
| 94 | + |
| 95 | +So "hard dependencies do not directly participate in the soft graph" is true for execution structure, but false if interpreted as "their IO is irrelevant to graph compilation." |
| 96 | + |
| 97 | +### Soft dependencies |
| 98 | + |
| 99 | +- Soft dependencies are inferred by matching model inputs against outputs. |
| 100 | +- Matching is name-based after variable flattening, not based on a richer semantic contract. |
| 101 | +- Same-scale soft dependencies are built after hard-dependency trees are known. |
| 102 | +- A process cannot also list one of its hard dependencies as a soft dependency. |
| 103 | +- `PreviousTimeStep` variables are removed from current-step soft dependency inference. |
| 104 | +- Soft dependencies are represented by `SoftDependencyNode`. |
| 105 | +- A soft node may have multiple parents. |
| 106 | +- A node is considered runnable once all of its parent nodes have already run for the current traversal. |
| 107 | +- If no producer output matches an input, no soft edge is added. Soft-edge construction does not itself fail on missing producers. |
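
The name-based matching rule can be sketched in a few lines of plain Julia. The declarations below are hypothetical, and the `lagged` field is an assumption of this example that stands in for inputs marked with `PreviousTimeStep`; the real builder works on dependency nodes rather than on named tuples.

```julia
# Hypothetical declarations; `lagged` lists inputs treated as PreviousTimeStep.
decls = [
    (process = :light, inputs = (:lai,), outputs = (:apar,), lagged = (:lai,)),
    (process = :growth, inputs = (:apar, :temperature), outputs = (:biomass,), lagged = ()),
    (process = :lai, inputs = (:biomass,), outputs = (:lai,), lagged = ()),
]

function soft_edges(decls)
    edges = Pair{Symbol,Symbol}[]
    for consumer in decls, producer in decls
        producer.process == consumer.process && continue
        # Lagged inputs are dropped before matching, so they create no edge.
        current = filter(v -> !(v in consumer.lagged), consumer.inputs)
        # An edge exists when any current-step input name matches an output name.
        if any(v -> v in producer.outputs, current)
            push!(edges, producer.process => consumer.process)
        end
    end
    return edges
end

edges = soft_edges(decls)
println(edges)
```

In this toy setup `:temperature` has no producer and simply yields no edge, and the lagged `:lai` input keeps `:lai => :light` out of the graph.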
| 108 | + |
| 109 | +### Single-scale graph build |
| 110 | + |
| 111 | +Single-scale graph construction is: |
| 112 | + |
| 113 | +1. Build `HardDependencyNode`s for each declared process. |
| 114 | +2. Attach explicit hard-dependency children under their parents. |
| 115 | +3. Traverse each hard-dependency root and collect its effective inputs and outputs. |
| 116 | +4. Build one `SoftDependencyNode` per hard-dependency root. |
| 117 | +5. Infer parent and child links by matching inputs to outputs. |
| 118 | + |
| 119 | +### Multiscale graph build |
| 120 | + |
| 121 | +Multiscale graph construction is more involved: |
| 122 | + |
| 123 | +1. Normalize the user mapping into `ModelMapping`. |
| 124 | +2. Build per-scale hard-dependency graphs. |
| 125 | +3. Resolve multiscale hard dependencies declared across scales. |
| 126 | +4. Compute per-scale effective inputs and outputs for each hard-dependency root. |
| 127 | +5. Build one `SoftDependencyNode` per root process per scale. |
| 128 | +6. Compile mapped variables and reverse mappings. |
| 129 | +7. Infer same-scale soft dependencies. |
| 130 | +8. Infer cross-scale soft dependencies from mapped variables and reverse mappings. |
| 131 | +9. If a dependency points to a nested hard dependency, redirect it to the owning soft node. |
| 132 | +10. Check the final graph for cycles. |
| 133 | + |
| 134 | +### Cycle handling |
| 135 | + |
| 136 | +- The graph is expected to be acyclic. |
| 137 | +- The official way to break a same-step cycle is `PreviousTimeStep`. |
| 138 | +- `PreviousTimeStep` breaks cycles by suppressing current-step edge creation, not by adding special scheduler logic. |
| 139 | +- In multiscale runs, cycle detection happens after the cross-scale graph is assembled. |
| 140 | +- Single-scale `dep(...)` relies mostly on builder-time guards. Multiscale `dep(mapping)` also runs an explicit global cycle check on the final soft graph. |
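
The effect of suppressing a current-step edge can be checked with a generic topological ordering. This is a sketch of Kahn's algorithm on hypothetical process names, offered as an illustration; it is not the package's cycle checker or traversal code.

```julia
# Returns an order in which every node runs after all of its parents,
# or throws if the edges contain a cycle.
function run_order(nodes, edges)
    # Number of parents that have not run yet, per node.
    pending = Dict(n => count(e -> e.second == n, edges) for n in nodes)
    ready = [n for n in nodes if pending[n] == 0]
    order = Symbol[]
    while !isempty(ready)
        n = popfirst!(ready)
        push!(order, n)
        for e in edges
            e.first == n || continue
            pending[e.second] -= 1
            pending[e.second] == 0 && push!(ready, e.second)
        end
    end
    length(order) == length(nodes) || error("cyclic dependency graph")
    return order
end

nodes = [:light, :growth, :lai]
# Same-step graph: :light needs :lai, which needs :growth, which needs :light.
cyclic = [:light => :growth, :growth => :lai, :lai => :light]
# Same graph once :light reads :lai from the previous step: that edge is never created.
lagged = [:light => :growth, :growth => :lai]

println(run_order(nodes, lagged))
```

Calling `run_order(nodes, cyclic)` throws, while the lagged variant orders cleanly with no scheduler changes, which mirrors the point that the fix lives in edge creation.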
| 141 | + |
| 142 | +## Multiscale Mapping Model |
| 143 | + |
| 144 | +### Mapping modes |
| 145 | + |
| 146 | +PlantSimEngine distinguishes three mapping modes: |
| 147 | + |
| 148 | +- `SingleNodeMapping(scale)`: one scalar value is read from one source scale. |
| 149 | +- `MultiNodeMapping(scales)`: one input reads a vector of values from many source nodes. |
| 150 | +- `SelfNodeMapping()`: a source scale must expose a scalar reference to itself so other scales can share it. |
| 151 | + |
| 152 | +The runtime carrier is `MappedVar`, which stores: |
| 153 | + |
| 154 | +- the mapping mode |
| 155 | +- the local variable name |
| 156 | +- the source variable name |
| 157 | +- the resolved default value |
| 158 | + |
| 159 | +### Supported mapping forms |
| 160 | + |
| 161 | +These are the important user-level forms and what they become internally: |
| 162 | + |
| 163 | +| User form | Meaning | Runtime shape | |
| 164 | +| --- | --- | --- | |
| 165 | +| `:x => :Plant` | scalar read from one `:Plant` node | shared `Ref` | |
| 166 | +| `:x => (:Plant => :y)` | scalar read with renaming | shared `Ref` | |
| 167 | +| `:x => [:Leaf]` | vector read from all `:Leaf` nodes | `RefVector` | |
| 168 | +| `:x => [:Leaf, :Internode]` | vector read from several scales | `RefVector` | |
| 169 | +| `:x => [:Leaf => :a, :Internode => :b]` | vector read with per-scale renaming | `RefVector` | |
| 170 | +| `PreviousTimeStep(:x) => ...` | lagged mapping, excluded from same-step dependency build | lagged input | |
| 171 | +| `PreviousTimeStep(:x)` | pure cycle-breaking marker | local/default value | |
| 172 | +| `:x => (Symbol("") => :y)` | same-scale rename | `RefVariable` alias | |
| 173 | + |
| 174 | +### Mapping compilation pipeline |
| 175 | + |
| 176 | +`mapped_variables(...)` does not just mirror user syntax. It compiles it. |
| 177 | + |
| 178 | +The main passes are: |
| 179 | + |
| 180 | +1. Start from effective per-scale inputs and outputs collected from hard-dependency roots. |
| 181 | +2. Add variables that are outputs of one scale but must appear as inputs at another scale. |
| 182 | +3. Convert scalar cross-scale reads into self-mapped outputs on the source scale so one shared `Ref` exists. |
| 183 | +4. Resolve default values recursively back to the ultimate producer. |
| 184 | +5. Convert mapping descriptors into runtime carriers: |
| 185 | + - scalar mappings become shared `Ref`s |
| 186 | + - multi-node mappings become empty `RefVector`s |
| 187 | + - same-scale renames become `RefVariable` |
| 188 | + |
| 189 | +### Reverse mapping and status wiring |
| 190 | + |
| 191 | +- Reverse mapping is computed before the reference conversion pass. |
| 192 | +- Reverse mapping answers: "when a source node is initialized, which target scale/vector inputs should receive a reference to this source variable?" |
| 193 | +- Reverse mapping excludes scalar `SingleNodeMapping` edges when `all=false`, because scalar sharing is already handled by shared `Ref`s. |
| 194 | + |
| 195 | +During `init_node_status!`: |
| 196 | + |
| 197 | +1. A copy of the scale template is made. |
| 198 | +2. `:node => Ref(node)` is injected. |
| 199 | +3. Remaining uninitialized variables may be filled from MTG attributes. |
| 200 | +4. The template becomes a `Status`. |
| 201 | +5. The status is pushed into `statuses[scale]`. |
| 202 | +6. If this node feeds any downstream `RefVector`, its `Ref`s are pushed into those target vectors. |
| 203 | +7. The status is stored on the MTG node under `:plantsimengine_status`. |
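
Steps 4 to 6 above can be approximated with plain containers. Everything in this sketch is hypothetical (`init_node!`, `reverse_map`, `plant_leaf_areas`, and the variable names); it only illustrates how a reverse map lets a newly initialized source status push its `Ref`s into a downstream vector in initialization order.

```julia
# Hypothetical target vector of Refs read by a model at the :Plant scale.
plant_leaf_areas = Base.RefValue{Float64}[]

# Hypothetical reverse map: source scale => (variable, target vector) pairs.
reverse_map = Dict(:Leaf => [(:leaf_area, plant_leaf_areas)])

statuses = Dict(:Leaf => NamedTuple[])

# Toy version of the wiring: build the status, store it, then share its Refs.
function init_node!(scale, values)
    status = map(Ref, values)  # NamedTuple of Refs
    push!(statuses[scale], status)
    for (var, target) in get(reverse_map, scale, [])
        push!(target, status[var])
    end
    return status
end

# The loop order stands in for MTG traversal order.
for area in (0.2, 0.5, 0.1)
    init_node!(:Leaf, (leaf_area = area,))
end

# A later write to a leaf status is visible through the plant-scale vector.
statuses[:Leaf][2].leaf_area[] = 0.9
println([r[] for r in plant_leaf_areas])
```

The target vector ends up in initialization order, not in any biological order.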
| 204 | + |
| 205 | +### Copies vs references |
| 206 | + |
| 207 | +- MTG attribute initialization copies plain values into the status. |
| 208 | +- If the MTG attribute itself is already a `Ref`, that `Ref` is preserved. |
| 209 | +- The runtime cannot create a live reference directly into a dict-backed MTG attribute. |
| 210 | +- Cross-scale sharing is reference-based once the status exists. |
| 211 | + |
| 212 | +## Multi-Rate Runtime |
| 213 | + |
| 214 | +Multi-rate behavior is layered on top of the multiscale MTG runtime. |
| 215 | + |
| 216 | +### Timing and policies |
| 217 | + |
| 218 | +- `timespec(model)` defines the model's default clock. The default is `ClockSpec(1.0, 0.0)`. |
| 219 | +- `ModelSpec.timestep` can override runtime clock selection. |
| 220 | +- `output_policy(model)` declares per-output temporal policy defaults. |
| 221 | + |
| 222 | +Supported schedule policies are: |
| 223 | + |
| 224 | +- `HoldLast()`: use the latest available producer value. |
| 225 | +- `Interpolate()`: interpolate or hold/extrapolate producer streams. |
| 226 | +- `Integrate()`: reduce values over the consumer window; the default reducer is `SumReducer()`. |
| 227 | +- `Aggregate()`: reduce values over the consumer window; the default reducer is `MeanReducer()`. |
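
The arithmetic behind these policies can be shown on a toy stream. This sketch assumes an hourly producer sampled by a daily consumer; the helper functions are hypothetical stand-ins that mimic the described defaults and do not call the package's policy types.

```julia
using Statistics: mean

# Hypothetical hourly producer values falling inside one daily consumer window.
hourly = collect(1.0:24.0)

hold_last(window) = last(window)  # in the spirit of HoldLast()
integrate(window) = sum(window)   # in the spirit of Integrate() with a sum reducer
aggregate(window) = mean(window)  # in the spirit of Aggregate() with a mean reducer

# Linear interpolation between two producer samples, in the spirit of Interpolate().
lerp(t, t0, v0, t1, v1) = v0 + (v1 - v0) * (t - t0) / (t1 - t0)

println((hold_last(hourly), integrate(hourly), aggregate(hourly)))
println(lerp(1.5, 1.0, 10.0, 2.0, 20.0))
```

The same 24 samples give 24.0, 300.0, or 12.5 depending on the policy, so the choice belongs to the variable's meaning (state, flux, or average).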
| 228 | + |
| 229 | +### ModelSpec configuration surface |
| 230 | + |
| 231 | +`ModelSpec` is the configuration point for scenario-specific runtime behavior. |
| 232 | + |
| 233 | +It can define: |
| 234 | + |
| 235 | +- `multiscale`: mapping declaration |
| 236 | +- `timestep`: runtime clock |
| 237 | +- `input_bindings`: explicit producer selection for consumer inputs |
| 238 | +- `meteo_bindings`: per-model weather aggregation |
| 239 | +- `meteo_window`: weather window selection strategy |
| 240 | +- `output_routing`: `:canonical` or `:stream_only` |
| 241 | +- `scope`: `:global`, `:self`, `:plant`, `:scene`, `ScopeId`, or callable |
| 242 | + |
| 243 | +### Input binding inference |
| 244 | + |
| 245 | +- If explicit `InputBindings` are absent, the package tries to infer bindings from the dependency graph and mapping. |
| 246 | +- Unique same-scale producers win first. |
| 247 | +- Unique cross-scale producers are accepted when unambiguous. |
| 248 | +- Existing multiscale mapping hints can disambiguate some cross-scale cases. |
| 249 | +- Ambiguity is an error and must be resolved explicitly. |
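
The precedence described above can be sketched as a small decision function. `infer_producer` and the candidate records are hypothetical. The sketch leaves out mapping hints entirely and assumes that several same-scale candidates count as ambiguous, which the rules above do not spell out.

```julia
# Picks a producer for one consumer input, or throws when the choice is ambiguous.
# `producers` holds hypothetical (scale, process) candidates for that input.
function infer_producer(consumer_scale::Symbol, producers)
    same = filter(p -> p.scale == consumer_scale, producers)
    # A unique same-scale producer wins first.
    length(same) == 1 && return only(same)
    isempty(same) || error("ambiguous same-scale producers: bind explicitly")
    cross = filter(p -> p.scale != consumer_scale, producers)
    isempty(cross) && return nothing
    # A unique cross-scale producer is accepted.
    length(cross) == 1 && return only(cross)
    error("ambiguous cross-scale producers: bind explicitly")
end

candidates = [
    (scale = :Leaf, process = :photosynthesis),
    (scale = :Plant, process = :allocation),
]

println(infer_producer(:Leaf, candidates).process)
```

A consumer at a third scale, such as `infer_producer(:Soil, candidates)`, throws in this sketch because two cross-scale candidates remain, which is the case an explicit `InputBindings` entry is meant to resolve.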
| 250 | + |
| 251 | +### Runtime sequence in multi-rate MTG mode |
| 252 | + |
| 253 | +For each dependency node and each status at that node's scale: |
| 254 | + |
| 255 | +1. Decide whether the model should run at the current time according to its clock. |
| 256 | +2. Resolve consumer inputs from temporal state with explicit or inferred bindings. |
| 257 | +3. Sample or aggregate meteo for the model. |
| 258 | +4. Call the model's `run!`. |
| 259 | +5. Publish outputs back into temporal caches and streams. |
| 260 | +6. Materialize any requested online exports. |
| 261 | + |
| 262 | +Important consequences: |
| 263 | + |
| 264 | +- In non-multirate MTG runs, cross-scale coupling is mostly direct aliasing through shared refs. |
| 265 | +- In multirate MTG runs, temporal state can overwrite consumer inputs just before execution. |
| 266 | +- Multi-rate MTG runs are currently forced to sequential execution. |
| 267 | + |
| 268 | +## Configurations Developers Must Keep In Mind |
| 269 | + |
| 270 | +A variable seen by a model may be in any of these supported configurations: |
| 271 | + |
| 272 | +- Plain local status value initialized by the user. |
| 273 | +- Plain local status value initialized from MTG node attributes. |
| 274 | +- Output computed locally at the same scale. |
| 275 | +- Same-scale alias of another local variable through `RefVariable`. |
| 276 | +- Scalar value mapped from another scale through a shared `Ref`. |
| 277 | +- Vector of references mapped from one or many other scales through `RefVector`. |
| 278 | +- Output computed at one scale and written into another scale, which means it is injected as an input on the receiving scale during mapping compilation. |
| 279 | +- Value marked as `PreviousTimeStep`, which removes it from same-step dependency inference. |
| 280 | +- Input resolved from a hard dependency that is called manually inside another model. |
| 281 | +- Input resolved from temporal streams instead of directly from the current status value. |
| 282 | +- Input bound explicitly with `InputBindings`. |
| 283 | +- Input bound implicitly by inference from producers and mappings. |
| 284 | +- Input sampled with `HoldLast`, `Interpolate`, `Integrate`, or `Aggregate`. |
| 285 | +- Output published canonically into status state. |
| 286 | +- Output published as `:stream_only`, meaning it participates in temporal streams but not canonical output ownership. |
| 287 | +- Value partitioned by scope (`:global`, `:self`, `:plant`, `:scene`, or custom scope function). |
| 288 | + |
| 289 | +When changing dependency, mapping, or runtime code, assume all of these modes can exist in the same simulation. |
| 290 | + |
| 291 | +## Execution Semantics And Important Caveats |
| 292 | + |
| 293 | +- Soft-dependency order controls model order. MTG topology does not define execution order within a scale. |
| 294 | +- Within one scale, execution order follows the order of `statuses[scale]`, which comes from MTG traversal at initialization time. |
| 295 | +- `SingleNodeMapping` assumes the source node is unique at runtime. The mapping layer does not enforce uniqueness. |
| 296 | +- `RefVector` ordering is traversal order, not a guaranteed biological ordering. |
| 297 | +- Hard dependencies are manual calls. If model code stops calling them, the declared hard dependency no longer executes. |
| 298 | +- Hard dependencies still influence graph compilation through their effective inputs and outputs. |
| 299 | +- Multiscale redirection from nested hard dependencies back to the owning soft node is implemented with upward walking through parent links and a defensive depth guard. Treat that path as fragile. |
| 300 | +- MTG topology changes after `init_statuses` leave `statuses`, node attributes, and populated `RefVector`s stale. Reinitialize after topology changes. |
| 301 | +- Same-scale renaming does not create a graph-wide shared ref. It creates a per-status alias. |
| 302 | +- `parent_vars` is dependency metadata, not a full provenance graph, and in multiscale builds it can be overwritten when a node has both same-scale and cross-scale parents. |
| 303 | +- Duplicate canonical publishers for one `(scale, variable)` are invalid in multi-rate mode unless non-canonical producers are marked `:stream_only`. |
| 304 | +- User `extra` arguments are not allowed in MTG runs because `GraphSimulation` already occupies that slot. |
| 305 | +- String scale names still work in many places but are deprecated. Prefer `Symbol` scales. |
| 306 | +- `ModelList` is deprecated as the primary API. Prefer `ModelMapping`. |
| 307 | +- `run_node_multiscale!` currently uses `node.simulation_id[1]` as the visitation guard. Treat that code carefully if you touch traversal semantics. |
| 308 | +- Some variable collection helpers use set-like flattening, so collection order is not always stable. Do not attach semantics to incidental variable ordering. |
| 309 | + |
| 310 | +## High-Signal Files |
| 311 | + |
| 312 | +- `src/PlantSimEngine.jl`: module layout and exports. |
| 313 | +- `src/Abstract_model_structs.jl`: `AbstractModel` and `process`. |
| 314 | +- `src/processes/process_generation.jl`: `@process`. |
| 315 | +- `src/processes/models_inputs_outputs.jl`: model declarations and runtime traits. |
| 316 | +- `src/variables_wrappers.jl`: `UninitializedVar`, `PreviousTimeStep`, `RefVariable`. |
| 317 | +- `src/component_models/Status.jl`: reference-based status container. |
| 318 | +- `src/component_models/RefVector.jl`: vector of references. |
| 319 | +- `src/dependencies/*`: hard and soft dependency graph construction and traversal. |
| 320 | +- `src/mtg/MultiScaleModel.jl`: mapping syntax normalization. |
| 321 | +- `src/mtg/ModelSpec.jl`: runtime configuration wrapper. |
| 322 | +- `src/mtg/mapping/*`: mapping compilation, reverse mapping, initialization helpers. |
| 323 | +- `src/mtg/initialisation.jl`: status creation and MTG wiring. |
| 324 | +- `src/mtg/GraphSimulation.jl`: simulation wrapper. |
| 325 | +- `src/time/multirate.jl`: clocks, policies, temporal storage types. |
| 326 | +- `src/time/runtime/*`: input resolution, scopes, publishers, meteo sampling, output export. |
| 327 | +- `src/run.jl`: single-scale and multiscale execution. |
| 328 | + |
| 329 | +## Practical Rule For Future Changes |
| 330 | + |
| 331 | +If you change dependency, mapping, or runtime behavior, re-check all of these questions: |
| 332 | + |
| 333 | +1. Does it still work for both single-scale and MTG runs? |
| 334 | +2. Does it preserve aliasing semantics for `Status` and `RefVector`? |
| 335 | +3. Does it preserve the distinction between hard dependencies and soft dependencies? |
| 336 | +4. Does it still handle scalar mappings, vector mappings, same-scale aliasing, and cross-scale writes? |
| 337 | +5. Does it still behave correctly with `PreviousTimeStep`? |
| 338 | +6. Does it still work when input bindings are inferred instead of explicit? |
| 339 | +7. Does it still work in multi-rate mode with temporal policies and scoped streams? |
| 340 | +8. Does it remain correct if the producer is nested under a hard dependency? |