# Agent Runtime full documentation for LLMs
Source root: https://limecloud.github.io/agentruntime
Agent Runtime is a portable contract for observable, controllable agent execution.
# What is Agent Runtime?
Source: https://limecloud.github.io/agentruntime/en/what-is-agent-runtime
Agent Runtime defines how agent work is accepted, executed, observed, controlled, resumed, and audited. It is the layer below Agent UI and above concrete model providers, tool systems, context stores, artifact services, and host application storage.
Use Agent Runtime when a product needs stable semantics for:
- submitted user turns and agent tasks with attempts, progress, graph edges, and delivery state
- model routing, fallback, limits, and token/cost accounting
- streaming text, reasoning, and structured output
- tool calls, tool results, large output refs, and tool errors
- human approval, structured input, interruption, and resume
- queues, steering, long-running turns, and subagents
- context assembly, memory retrieval, compaction, and missing context
- artifact refs, evidence refs, replay cases, and review exports
Do not use it to define the visual interface, model provider API, connector protocol, business database schema, artifact file format, or evidence review policy. Those systems remain adjacent owners.
## Layer map
| Layer | Main question | Runtime facts |
| --- | --- | --- |
| `input` | What work was submitted and by whom? | session, thread, turn, draft, attachments, source channel, request ids. |
| `execution` | What is running and why? | turn status, task lifecycle, task attempts, model routing, tool calls, action requests, subagents. |
| `state` | What can be resumed or inspected later? | snapshots, thread read model, queue, pending requests, incidents, checkpoints. |
| `coordination` | What external systems were used? | tool inventory, context refs, artifact refs, evidence refs, policy decisions. |
| `observability` | Can the work be traced, replayed, reviewed, or audited? | trace ids, spans, timeline, evidence pack, replay case, verification summaries. |
The runtime may be embedded in a desktop app, hosted behind an HTTP API, run in a worker, or coordinate local and remote agents. The standard constrains facts and control semantics, not deployment shape.
# Specification
Source: https://limecloud.github.io/agentruntime/en/specification
The latest Agent Runtime draft is a portable draft standard for agent execution. Its core contract is the boundary between execution facts and their consumers, such as UI, replay, review, telemetry, workflow, and remote channels.
Agent Runtime owns execution facts. It does not own the visual surface, provider API, external tool protocol, artifact bytes, evidence verdict, memory source, or host account model.
## Profiles and conformance
Agent Runtime is specified in layers:
| Layer | Stability | Meaning |
| --- | --- | --- |
| Public core | Portable | Required identities, event envelope, control-plane semantics, snapshots, and ownership rules that can apply across products. |
| Recommended extensions | Portable but optional | Permission, sandbox, hooks, process, routing, task graph, subagent, job, channel, history, and evidence families used by rich agent runtimes. |
| Product profile | Implementation-specific | A stricter subset and payload shape selected by a product or host. Profiles may require ids, fields, fixtures, and validation beyond the public core. |
A profile MUST NOT redefine ownership. It may only make the public model stricter, add typed payload fields, or require specific event families. The [Lime AgentRuntime Profile](/en/profiles/lime) is the reference product profile for Lime's current runtime.
## Scope
Agent Runtime standardizes these implementation concerns:
1. Runtime identity and correlation ids.
2. Event classes and event envelope fields.
3. Control plane actions and required write boundaries.
4. Durable snapshots and read models.
5. Tool/context/model/policy orchestration facts.
6. Human-in-the-loop requests and queue/resume semantics.
7. Evidence, replay, and observability export boundaries.
8. Permission, sandbox, hooks, process execution, remote channel recovery, and peer task mapping.
9. Model routing, candidate sets, cost, quota, rate limit, and budget facts.
10. Agent task lifecycle, attempts, task graphs, subagent graphs, background jobs, large output storage, and session reconstruction.
11. Benchmark trial instrumentation for dataset/config/task correlation, trajectories, rewards, and comparison decisions.
Agent Runtime does **not** standardize a UI component model, model provider protocol, tool registry format, workflow language, vector store, artifact format, or observability backend.
## Pressure From Real Runtimes
Agent Runtime is not a wrapper around chat streaming. Real implementations show eleven facts that must be first-class:
1. Tool calls have schema, progress, partial output, permission gates, hooks, result refs, and failure categories.
2. Command execution has cwd, sandbox, network, stdin/stdout, exit code, output buffers, and long-running process state.
3. Permission decisions come from modes, rules, hooks, classifiers, humans, and host policy. Deny/ask rules must be able to override automatic allow.
4. Hooks are governance points and must write runtime facts, not create a side execution path.
5. Context compaction, rollback, and reconstruction need explicit boundaries.
6. Subagents need a parent-child graph, isolation, status, and recoverable child threads.
7. Tasks need objective, owner, status, attempts, dependencies, progress, output refs, and delivery state; todo lists are not enough.
8. Jobs need item status, attempts, assignment, and progress.
9. Remote channels need identity, native peer ids, resume cursors, permission bridges, and disconnect semantics.
10. Model routing needs task profiles, candidate sets, decisions, fallback, single-candidate, and no-candidate facts.
11. Cost, quota, rate limits, request telemetry, and evidence must join through stable correlation ids.
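Fact 3 in the list above says that deny/ask rules must be able to override an automatic allow. A minimal sketch of one way to combine decisions from several sources, assuming a simple strictness order (deny over ask over allow). The `Decision` enum and `combine_decisions` helper are hypothetical names used for illustration; the standard does not define them.

```python
from enum import Enum


class Decision(Enum):
    # Hypothetical decision values; a higher number means a stricter outcome.
    ALLOW = 0
    ASK = 1
    DENY = 2


def combine_decisions(decisions):
    """Return the strictest decision from modes, rules, hooks, humans, or host policy."""
    if not decisions:
        # Assumption: the absence of any rule is not treated as an allow.
        return Decision.ASK
    return max(decisions, key=lambda d: d.value)


# An automatic allow from a permissive mode is overridden by a deny rule.
result = combine_decisions([Decision.ALLOW, Decision.DENY])
print(result.name)  # DENY
```

A real runtime would also record which source produced each decision so the outcome can be explained later.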
## Execution architecture
```mermaid
flowchart TB
  Input[Client / channel / workflow input] --> Control[Runtime control plane]
  Control --> Session[Session + thread state]
  Control --> Queue[Queue / resume / interrupt]
  Control --> Loop[Execution loop]
  Loop --> Provider[Model provider adapter]
  Loop --> Tools[Tool / connector adapter]
  Loop --> Context[Context / memory / policy resolver]
  Loop --> Subagents[Subagent coordinator]
  Provider --> Events[Typed runtime events]
  Tools --> Events
  Context --> Events
  Subagents --> Events
  Loop --> Events
  Events --> Store[Durable event log + snapshots]
  Store --> ReadModel[Thread read model]
  Store --> Evidence[Evidence / replay / review export]
  Store --> UI[Agent UI projection]
```
The runtime may keep internal provider-native records, but external consumers SHOULD receive normalized runtime events and snapshots.
## Required identity model
| Identity | Meaning | Required relationship |
| --- | --- | --- |
| `runtime_id` | Runtime installation or service instance. | Stable enough for trace attribution. |
| `session_id` | Durable user-visible work container. | Owns one or more threads. |
| `thread_id` | Ordered execution context. | Belongs to one session. |
| `turn_id` | One submitted input cycle. | Belongs to one thread. |
| `task_id` | Unit of work with objective, lifecycle, attempts, relationships, and acceptance. | Belongs to a session, thread, or parent task. |
| `run_id` / `attempt_id` | One execution attempt for a task. | Belongs to one task and may bind a thread, worker, or job item. |
| `step_id` | Ordered runtime item, such as status, message, tool, artifact, or action. | Belongs to a turn, task, or run. |
| `tool_call_id` | One tool invocation. | Belongs to a step and may have result refs. |
| `action_id` | One pending human or policy decision. | Belongs to a turn, task, or tool call. |
| `subagent_id` | Child agent execution context. | Has parent session/thread/turn links. |
| `artifact_id` | Durable deliverable reference. | Owned by artifact service; referenced by runtime. |
| `evidence_id` | Trace, replay, verification, or review reference. | Owned by evidence system; referenced by runtime. |
A compatible implementation MUST NOT rely on a single message id to represent all runtime work.
## Event envelope
Every emitted event SHOULD include:
| Field | Requirement |
| --- | --- |
| `type` | Required event class. |
| `event_id` | Required unique event id. |
| `timestamp` | Required producer timestamp. |
| `sequence` | Monotonic within a stream when possible. |
| `schema_version` | Runtime event schema version. |
| `session_id`, `thread_id`, `turn_id` | Present whenever the event belongs to a thread or turn. |
| `task_id`, `run_id`, `attempt_id`, `step_id`, `tool_call_id`, `action_id`, `subagent_id` | Present when applicable. |
| `trace_id`, `span_id` | Present when telemetry is available. |
| `payload` | Typed event payload. |
| `refs` | Stable references to large or owned external facts. |
Large tool outputs, artifacts, evidence packs, and raw provider payloads SHOULD be referenced, not copied into every event.
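The envelope table above can be sketched as a small data type. This is an illustration only, assuming the snake_case field names from the table and showing a subset of the optional ids. The `EventEnvelope` class and the `missing_required_fields` helper are hypothetical names, not part of the standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EventEnvelope:
    # Fields marked "Required" in the table above.
    type: str
    event_id: str
    timestamp: str
    # Fields that are present when applicable or available.
    sequence: Optional[int] = None
    schema_version: Optional[str] = None
    session_id: Optional[str] = None
    thread_id: Optional[str] = None
    turn_id: Optional[str] = None
    tool_call_id: Optional[str] = None
    trace_id: Optional[str] = None
    span_id: Optional[str] = None
    payload: dict = field(default_factory=dict)
    # Stable references to large or externally owned facts.
    refs: dict = field(default_factory=dict)


def missing_required_fields(event: EventEnvelope) -> list:
    """List the required envelope fields that are empty."""
    required = ("type", "event_id", "timestamp")
    return [name for name in required if not getattr(event, name)]


# A large tool result is carried as a reference, not copied into the event.
event = EventEnvelope(
    type="tool.result",
    event_id="evt_01",
    timestamp="2026-05-08T02:30:00Z",
    refs={"output_ref": "ref:example"},  # hypothetical ref value
)
print(missing_required_fields(event))  # []
```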
## Standard event classes
| Class | Purpose |
| --- | --- |
| `session.created` / `session.updated` | Session metadata changed. |
| `thread.started` / `thread.updated` | Thread lifecycle or read-model relevant state changed. |
| `turn.submitted` / `turn.started` / `turn.completed` / `turn.failed` | User or system turn lifecycle. |
| `task.created` / `task.accepted` / `task.queued` / `task.started` / `task.updated` / `task.progress` / `task.waiting` / `task.blocked` / `task.paused` / `task.resumed` / `task.retrying` / `task.cancel_requested` / `task.cancelled` / `task.timed_out` / `task.failed` / `task.lost` / `task.completed` / `task.archived` | Agent task lifecycle, progress, waiting, retry, cancellation, loss, and terminal state. |
| `run.status` | Human-readable runtime status with phase, title, detail, checkpoints, and metadata. |
| `model.requested` / `model.delta` / `model.completed` / `model.failed` | Provider adapter lifecycle and text/structured output stream. |
| `reasoning.delta` / `reasoning.summary` | Reasoning or planning stream outside final text. |
| `tool.catalog.resolved` | Tool inventory or capability surface was selected for the turn. |
| `tool.started` / `tool.args` / `tool.progress` / `tool.result` / `tool.failed` | Tool invocation lifecycle. |
| `action.required` / `action.resolved` | Runtime paused for user, policy, or structured input decision. |
| `queue.changed` | Queued turns changed order, state, or policy. |
| `context.resolved` | Context, memory, knowledge, source, or policy refs selected for a turn. |
| `context.compaction.started` / `context.compaction.completed` / `context.compaction.failed` | Context compaction boundary lifecycle. |
| `artifact.changed` | Runtime observed or produced an artifact reference. |
| `evidence.changed` | Runtime observed or exported evidence/replay/review reference. |
| `subagent.spawned` / `subagent.status` / `subagent.input` / `subagent.completed` / `subagent.failed` / `subagent.closed` | Child agent coordination. |
| `limit.changed` | Cost, quota, rate limit, budget, or policy limit changed. |
| `snapshot.updated` | Durable snapshot or read model changed. |
| `runtime.warning` / `runtime.error` | Non-fatal warning or fatal runtime error. |
Implementations may add vendor-specific event types, but MUST keep the normalized classes available for portable consumers.
### Expanded Event Families
Real coding, desktop, and remote runtimes SHOULD also expose these event families:
| Family | Events | Purpose |
| --- | --- | --- |
| Permission | `permission.evaluated` / `permission.requested` / `permission.resolved` | Record how rules, modes, hooks, classifiers, humans, or host policy decided. |
| Sandbox | `sandbox.applied` / `sandbox.violation` | Record actual execution boundaries and violations. |
| Hook / policy | `hook.started` / `hook.completed` / `hook.failed` / `policy.changed` | Record governance inputs, outcomes, duration, and failure behavior. |
| Process | `process.started` / `process.output` / `process.input` / `process.completed` / `process.failed` / `process.terminated` | Record commands, PTY sessions, long-running processes, and output refs. |
| Routing | `task.profile.resolved` / `routing.candidates.resolved` / `routing.decided` / `routing.fallback.applied` / `routing.not_possible` / `routing.single_candidate` | Explain model candidates, selection, fallback, blocking, and single-candidate paths. |
| Task orchestration | `task.delegated` / `task.dependency.updated` / `task.attempt.started` / `task.attempt.completed` / `task.attempt.failed` | Record task graph edges, delegation, dependencies, and per-attempt execution history. |
| Cost / limits | `cost.estimated` / `cost.recorded` / `rate_limit.hit` / `quota.low` / `quota.blocked` | Make cost, limits, and quota runtime facts. |
| Channel | `channel.connected` / `channel.disconnected` / `channel.resumed` / `channel.message` / `channel.permission_forwarded` / `channel.permission_returned` | Record remote channels, recovery, and cross-channel approval. |
| Jobs | `job.created` / `job.started` / `job.progress` / `job.item.started` / `job.item.completed` / `job.item.failed` / `job.completed` / `job.failed` / `job.cancelled` | Record batch and background work. |
| Output | `output.spilled` / `output.truncated` / `output.redacted` / `output.expired` | Manage large output and auditable references. |
| History | `history.window.loaded` / `history.reconstructed` / `history.rollback.started` / `history.rollback.completed` / `snapshot.repaired` | Recover old sessions, compaction, and rollback. |
| Benchmark | `benchmark.dataset.resolved` / `benchmark.configuration.resolved` / `benchmark.trial.started` / `benchmark.trial.completed` / `benchmark.trial.failed` / `benchmark.reward.recorded` / `benchmark.comparison.completed` | Record hill-climbing trials, trajectories, rewards, and baseline/candidate decisions. |
## Control plane
A compatible runtime SHOULD expose these commands, regardless of transport:
| Command | Required input | Result |
| --- | --- | --- |
| `submit_turn` | `session_id`, `thread_id` or create policy, input parts, options, metadata. | Accepted turn or queued turn. |
| `interrupt_turn` | `session_id`, optional `thread_id` / `turn_id`, reason. | Interrupt accepted or no-op. |
| `resume_thread` | `session_id`, `thread_id`, optional resume token. | Resume attempt result. |
| `create_task` / `update_task` / `start_task` / `append_task_progress` | Task objective, scope, profile, constraints, assignee, or progress refs. | Task lifecycle and progress events. |
| `pause_task` / `resume_task` / `cancel_task` / `retry_task` | `task_id`, reason, optional propagation policy. | Task pause, resume, cancellation, or new attempt facts. |
| `complete_task` / `fail_task` / `list_tasks` / `get_task` | Task scope or terminal facts. | Durable task read model or terminal reconciliation events. |
| `link_tasks` / `unlink_tasks` | Parent, child, dependency, source, artifact, evidence, or subagent edge. | Task graph update event. |
| `respond_action` | `action_id`, decision, optional structured payload. | Action resolved event. |
| `remove_queued_turn` / `promote_queued_turn` | `queued_turn_id`, target session/thread. | Queue changed event. |
| `get_session` | `session_id`, history window or cursor. | Durable session snapshot. |
| `get_thread_read` | `session_id`, `thread_id`. | Thread read model. |
| `get_tool_inventory` | Scope, caller, policy, runtime mode. | Tool inventory snapshot. |
| `spawn_subagent` / `send_subagent_input` / `wait_subagents` / `resume_subagent` / `close_subagent` | Parent ids and child control payload. | Subagent lifecycle facts. |
| `export_evidence` / `export_replay` | Session/thread/turn/task scope. | Stable evidence or replay refs. |
| `evaluate_permission` / `resolve_permission` | Tool/process/action scope and decision payload. | Permission evaluated/resolved event. |
| `get_execution_environment` | Session/thread/turn scope. | Environment snapshot. |
| `write_process_stdin` / `terminate_process` | `process_id`, input, or reason. | Process input / terminated event. |
| `list_subagents` / `list_jobs` / `get_job` / `cancel_job` | Session/thread/job scope. | Subagent graph or job snapshot. |
| `reconnect_channel` / `ack_events` | Channel id, cursor, resume token. | Channel resumed or snapshot repair. |
| `export_review` | Session/thread/turn/task scope. | Review refs. |
| `start_benchmark_trial` / `record_benchmark_reward` / `export_benchmark_trial` / `compare_benchmark_runs` | Dataset, task, config, trial, reward, and comparison scope. | Benchmark trial events, reward refs, trajectories, and promotion/revert decisions. |
Commands that mutate state MUST write through the runtime or owning adjacent system. UI-only state cannot mutate runtime truth.
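A minimal sketch of the write boundary for `submit_turn`, assuming an in-memory runtime where every state change is recorded as an event. The `InMemoryRuntime` class, its attributes, and the generated id format are hypothetical; only the command name, the event types, and the accepted-or-queued result come from the tables above.

```python
import itertools


class InMemoryRuntime:
    """Hypothetical runtime: every state change is recorded as an event."""

    def __init__(self):
        self.events = []       # append-only event log
        self.active_turn = {}  # thread_id -> running turn_id
        self.queued = {}       # thread_id -> list of queued turn_ids
        self._counter = itertools.count(1)

    def _emit(self, event_type, **fields):
        event = {"type": event_type, "sequence": len(self.events) + 1, **fields}
        self.events.append(event)

    def submit_turn(self, session_id, thread_id, input_parts):
        """Accept a turn; queue it if the thread already has an active turn."""
        turn_id = f"turn_{next(self._counter)}"
        ids = {"session_id": session_id, "thread_id": thread_id, "turn_id": turn_id}
        self._emit("turn.submitted", payload={"input_parts": input_parts}, **ids)
        if thread_id in self.active_turn:
            self.queued.setdefault(thread_id, []).append(turn_id)
            self._emit("queue.changed", **ids)
            return {"turn_id": turn_id, "status": "queued"}
        self.active_turn[thread_id] = turn_id
        self._emit("turn.started", **ids)
        return {"turn_id": turn_id, "status": "accepted"}


runtime = InMemoryRuntime()
first = runtime.submit_turn("sess_1", "thread_1", ["hello"])
second = runtime.submit_turn("sess_1", "thread_1", ["follow-up"])
print(first["status"], second["status"])  # accepted queued
```

Because the caller only sees the returned result and the event log, a UI cannot change queue order or turn state without going through a command.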
## Durable snapshots and read models
The event stream is necessary but not sufficient. A compatible runtime SHOULD maintain:
- `session_snapshot`: shell, title, timestamps, threads, recent messages or steps, history cursor.
- `thread_read_model`: current status, active turn, pending requests, last outcome, incidents, queued turns, diagnostics.
- `tool_inventory_snapshot`: tools available for the current caller, policy, context, and mode.
- `queue_snapshot`: queued turn ids, order, source, policy, and resume state.
- `task_snapshot`: active, waiting, failed, lost, recent terminal tasks, task graph, current attempts, and delivery state.
- `context_boundary_snapshot`: selected refs, compaction summaries, context warnings, missing facts.
- `artifact_checkpoint_summary`: artifact refs, versions, previews, validation issue counts, diff refs.
- `evidence_summary`: trace ids, verification outcomes, replay refs, review refs, audit notes.
- `permission_sandbox_summary`: permission state, pending approvals, sandbox profile, and violation refs.
- `execution_environment_snapshot`: cwd, workspace roots, env refs, process limits, and active processes.
- `routing_limit_summary`: task profile, candidate count, routing decision, cost state, quota/rate-limit state.
- `subagent_job_summary`: child graph, job progress, assigned threads, and recoverability.
- `benchmark_summary`: dataset/task/config ids, trial status, reward refs, trajectory refs, aggregate delta, and promotion/revert decision.
- `channel_summary`: remote peers, resume cursors, last acknowledged sequence, and permission bridge state.
Read models may be compact. They must be honest: `unknown`, `unavailable`, `stale`, and `blocked` are better than inferred success.
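One way to keep a read model honest is to derive its status from the last observed fact and its age, never from an assumption of success. A minimal sketch, assuming a fact is a dict with hypothetical `status` and `observed_at` (seconds) keys and that a fixed staleness threshold is acceptable. The `read_model_status` helper and the `FINAL_STATUSES` set are illustrative choices, not defined by the standard.

```python
# Assumption: these statuses are treated as final and never go stale.
FINAL_STATUSES = {"completed", "failed", "cancelled", "timed_out"}


def read_model_status(last_fact, now, stale_after_seconds=60.0):
    """Derive a status for a read model without inferring success."""
    if last_fact is None:
        # No runtime fact exists at all.
        return "unknown"
    if last_fact["status"] in FINAL_STATUSES:
        return last_fact["status"]
    if now - last_fact["observed_at"] > stale_after_seconds:
        # The last known state is too old to present as current.
        return "stale"
    return last_fact["status"]


print(read_model_status(None, now=100.0))  # unknown
print(read_model_status({"status": "running", "observed_at": 0.0}, now=100.0))  # stale
```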
## Completion and failure semantics
A runtime SHOULD distinguish:
- `accepted`: runtime received the request.
- `queued`: work is waiting behind another turn or policy gate.
- `preparing`: context, model, tools, or policy are being resolved.
- `running`: the execution loop is active.
- `waiting_input`: user or external structured input is required.
- `waiting_permission`: human, policy, or host approval is required.
- `waiting_resource`: credential, quota, file, network, worker, or external system is unavailable.
- `blocked`: an action, credential, policy, context, dependency, tool, or quota is missing.
- `streaming`: model or tool output is being emitted.
- `retrying`: retry or fallback is active.
- `lost`: runtime cannot prove whether the worker is still alive.
- `timed_out`: time or inactivity budget stopped the work.
- `completed`: owner declared work complete and durable facts are reconciled.
- `failed`: work cannot continue without a new request or repair.
- `cancelled`: user, policy, or runtime interrupted the work.
- `stale`: known snapshot may not reflect current execution.
`success` from a provider or tool is not the same as completed agent work. Completion must be tied to runtime state and, when required, artifact or evidence facts.
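The distinction between provider success and completed work can be sketched as a simple check. This assumes the runtime knows which artifact or evidence refs a task requires and which have been reconciled; the `is_completed` helper and its parameters are hypothetical.

```python
def is_completed(provider_success, required_refs, reconciled_refs):
    """Provider success alone is not completion; required refs must be reconciled."""
    missing = set(required_refs) - set(reconciled_refs)
    return provider_success and not missing


# The provider call succeeded, but a required artifact has not been reconciled yet.
print(is_completed(True, ["artifact_1", "evidence_1"], ["evidence_1"]))  # False
print(is_completed(True, ["artifact_1"], ["artifact_1"]))  # True
```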
## Validation
A validator SHOULD check behavior, not only file presence:
- Events contain stable ids and can be replayed into a read model.
- Provider streams map to normalized model/text/reasoning events.
- Tool calls preserve input refs, result refs, errors, and policy decisions.
- Human actions pause execution and resume only through `respond_action`.
- Queue mutations survive restart and emit `queue.changed`.
- Task lifecycle survives restart, keeps prior attempts, and can recover parent/child and dependency edges.
- Old sessions hydrate through snapshots and cursor windows.
- Evidence/replay exports derive from the same runtime facts as UI and diagnostics.
- Missing facts are marked `unknown`, `unavailable`, `stale`, or `blocked` instead of inferred from prose.
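Two of the checks above (events replay into a read model, and human actions resume only through a resolution) can be sketched as a fold over the event log. This is a minimal illustration that assumes events are dicts with `type`, `sequence`, and an optional `action_id`. The `replay` function and the choice of `waiting_permission` as the paused status are assumptions.

```python
def replay(events):
    """Fold events into a small read model, rejecting an inconsistent log."""
    model = {"status": "unknown", "pending_actions": set(), "last_sequence": 0}
    for event in sorted(events, key=lambda e: e["sequence"]):
        kind = event["type"]
        if kind == "turn.started":
            model["status"] = "running"
        elif kind == "action.required":
            model["pending_actions"].add(event["action_id"])
            model["status"] = "waiting_permission"
        elif kind == "action.resolved":
            model["pending_actions"].discard(event["action_id"])
            if not model["pending_actions"]:
                model["status"] = "running"
        elif kind == "turn.completed":
            if model["pending_actions"]:
                # Completion must not silently erase an unresolved action.
                raise ValueError("turn completed with unresolved actions")
            model["status"] = "completed"
        model["last_sequence"] = event["sequence"]
    return model


log = [
    {"type": "turn.started", "sequence": 1},
    {"type": "action.required", "sequence": 2, "action_id": "act_1"},
    {"type": "action.resolved", "sequence": 3, "action_id": "act_1"},
    {"type": "turn.completed", "sequence": 4},
]
print(replay(log)["status"])  # completed
```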
# Runtime model
Source: https://limecloud.github.io/agentruntime/en/concepts/runtime-model
Agent Runtime uses a small identity hierarchy so long-running work can be resumed, inspected, delegated, and audited.
```text
runtime
  session
    task
      run
    thread
      turn
        step
          tool_call | action_request | artifact_ref | evidence_ref
    subagent
```
## Identities
- `session`: durable user-visible container. It may map to a conversation, workspace task, remote channel thread, or workflow job.
- `task`: durable unit of work with objective, lifecycle, attempts, relationships, acceptance, and recovery state. It may belong to a thread, span multiple turns, or run in the background.
- `run`: one execution attempt for a task. Retries, resumes, and alternate worker executions should create new runs instead of overwriting task history.
- `thread`: ordered execution context inside a session. A session can contain multiple threads when work branches, delegates, or runs in parallel.
- `turn`: one submitted input cycle. It starts when work is accepted or queued and ends when completed, failed, or cancelled.
- `step`: ordered runtime item such as status, text, reasoning, tool call, approval request, artifact, warning, or evidence link.
- `subagent`: child runtime context with parent ids and its own lifecycle.
- `artifact_ref` and `evidence_ref`: stable references to owned systems, not copied content.
## Ownership rule
The runtime owns identity, status, sequencing, queue state, and action lifecycle. Adjacent systems own their own facts:
| Fact | Owner |
| --- | --- |
| Model output chunks and provider errors | Provider adapter, normalized by runtime. |
| Tool schema and external execution | Tool or connector system, orchestrated by runtime. |
| Memory, search, and knowledge facts | Context system, selected by runtime. |
| Artifact bytes and versions | Artifact service, referenced by runtime. |
| Verification and review verdicts | Evidence system, exported by runtime. |
| Visible UI state | Agent UI or host product, never runtime truth. |
## Correlation
A compatible runtime SHOULD carry correlation ids through every boundary:
- `trace_id` and `span_id` for telemetry.
- `request_id` for transport or API request correlation.
- `turn_id` and `tool_call_id` for tool and provider calls.
- `task_id` and `run_id` for task lifecycle, retries, and background work.
- `action_id` for human decisions.
- `artifact_id` and `evidence_id` for durable refs.
If correlation is unavailable, the runtime should mark the gap. It should not invent a false join.
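A minimal sketch of joining cost records to tool events by `tool_call_id` while marking gaps instead of guessing. The record shapes, the `cost_status` field, and the `join_cost_to_tool_calls` helper are hypothetical.

```python
def join_cost_to_tool_calls(tool_events, cost_records):
    """Attach cost to tool events by tool_call_id; mark missing joins explicitly."""
    costs = {
        record["tool_call_id"]: record["amount"]
        for record in cost_records
        if record.get("tool_call_id")
    }
    joined = []
    for event in tool_events:
        call_id = event.get("tool_call_id")
        if call_id in costs:
            joined.append({**event, "cost": costs[call_id], "cost_status": "joined"})
        else:
            # No stable id to join on: record the gap rather than invent a match.
            joined.append({**event, "cost": None, "cost_status": "unavailable"})
    return joined


rows = join_cost_to_tool_calls(
    [{"tool_call_id": "tool_1"}, {"tool_call_id": "tool_2"}],
    [{"tool_call_id": "tool_1", "amount": 0.02}],
)
print([row["cost_status"] for row in rows])  # ['joined', 'unavailable']
```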
## Public model and product profiles
The public runtime model defines ownership and portable ids. A product profile may make this stricter for one implementation. For example, the Lime profile requires every current runtime event to carry `runtimeId`, `sessionId`, `eventId`, `timestamp`, `schemaVersion`, `sequence`, and a typed `payload`, and then adds scope-specific ids for turns, tools, actions, tasks, subagents, and evidence.
Profiles are conformance layers, not separate standards. They must preserve the same owner map so UI, evidence, context, policy, tool, and artifact systems do not become competing runtime truth sources.
# Runtime event stream
Source: https://limecloud.github.io/agentruntime/en/contracts/runtime-event-stream
The runtime event stream is the canonical stream of execution facts. It can be delivered through Server-Sent Events, WebSocket, JSON-RPC notifications, local process events, a message bus, or an embedded API. The transport is not the standard; the normalized event envelope is.
## Envelope
```json
{
  "type": "tool.started",
  "eventId": "evt_01",
  "schemaVersion": "0.1.0",
  "timestamp": "2026-05-08T02:30:00Z",
  "sequence": 42,
  "sessionId": "sess_123",
  "threadId": "thread_123",
  "turnId": "turn_123",
  "taskId": "task_123",
  "runId": "run_123",
  "stepId": "step_123",
  "toolCallId": "tool_123",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "payload": {
    "toolName": "read_file",
    "safeArgs": { "path": "README.md" },
    "policy": "allowed"
  }
}
```
## Ordering
Events SHOULD be ordered by `sequence` within a stream. Consumers MUST tolerate reconnection, duplicate delivery, and snapshot repair. A later snapshot can correct earlier events, but it must not silently erase an unresolved action or failed tool call.
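A consumer that tolerates duplicate delivery can be sketched as follows, assuming events carry the `eventId` and `sequence` fields from the envelope example. The `StreamConsumer` class is a hypothetical illustration and omits snapshot repair.

```python
class StreamConsumer:
    """Hypothetical consumer that applies each event at most once."""

    def __init__(self):
        self.seen_event_ids = set()
        self.last_sequence = 0
        self.applied = []

    def handle(self, event):
        """Apply an event once; return False when it is a duplicate."""
        if event["eventId"] in self.seen_event_ids:
            return False
        self.seen_event_ids.add(event["eventId"])
        # Track the highest sequence seen so a reconnect can resume from it.
        self.last_sequence = max(self.last_sequence, event["sequence"])
        self.applied.append(event)
        return True


consumer = StreamConsumer()
consumer.handle({"eventId": "evt_01", "sequence": 42})
duplicate = consumer.handle({"eventId": "evt_01", "sequence": 42})
print(duplicate, consumer.last_sequence)  # False 42
```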
## Large data
The stream should carry previews and refs for large data:
- tool arguments and results can use `input_ref`, `output_ref`, `preview`, and `metadata`.
- artifacts can use `artifact_id`, `read_ref`, `version_id`, `preview_ref`, and `diff_ref`.
- evidence can use `evidence_id`, `pack_ref`, `trace_ref`, `replay_ref`, and `review_ref`.
- benchmark trials can use `trajectory_ref`, `reward_ref`, `reward_details_ref`, `dataset_id`, `configuration_id`, and `comparison_ref`.
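A minimal sketch of turning a large tool output into a preview plus a reference, using the `preview`, `output_ref`, and `metadata` names from the list above. The `to_preview_and_ref` helper, the dict-backed store, the `ref:` scheme, and the 200-character threshold are assumptions for illustration.

```python
import hashlib


def to_preview_and_ref(output, store, max_inline=200):
    """Keep small outputs inline; store large ones and return a preview and ref."""
    if len(output) <= max_inline:
        return {"preview": output}
    data = output.encode("utf-8")
    # Hypothetical content-addressed ref scheme.
    output_ref = "ref:" + hashlib.sha256(data).hexdigest()[:16]
    store[output_ref] = output
    return {
        "preview": output[:max_inline],
        "output_ref": output_ref,
        "metadata": {"bytes": len(data), "truncated": True},
    }


store = {}
result = to_preview_and_ref("x" * 500, store)
print(len(result["preview"]), result["metadata"]["bytes"])  # 200 500
```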
## Item and Process Lifecycle
Runtime SHOULD support both coarse lifecycle and item lifecycle:
- `turn.*` describes one input cycle.
- `task.*` describes objective lifecycle, waiting, progress, retry, cancellation, terminal state, and task graph updates.
- `item.*` describes ordered agent message, reasoning, tool call, command, file change, web search, todo, and error items.
- `process.*` describes commands, PTY sessions, and long-running processes.
- `hook.*`, `permission.*`, and `sandbox.*` describe governance.
SDKs can consume `item.*` while audit and GUI consumers can read deeper runtime facts.
## Routing, Limits, and Remote Events
Model routing and remote channels also belong to the stream:
- `task.profile.resolved`, `routing.candidates.resolved`, `routing.decided`.
- `task.attempt.started`, `task.attempt.completed`, and `task.attempt.failed` for per-run execution history.
- `cost.estimated`, `cost.recorded`, `rate_limit.hit`, `quota.low`, `quota.blocked`.
- `channel.connected`, `channel.resumed`, `channel.permission_forwarded`.
- `benchmark.dataset.resolved`, `benchmark.trial.started`, `benchmark.reward.recorded`, and `benchmark.comparison.completed`.
These events may be telemetry-only or read-model-only, but they must be joinable by evidence, replay, review, and benchmark exports.
## Provider adaptation
Provider-native streams differ. A runtime adapter SHOULD map them into:
- `model.requested` when a provider call starts.
- `model.delta` for text, structured output, or provider content chunks.
- `reasoning.delta` or `reasoning.summary` when reasoning is exposed separately.
- `tool.started` / `tool.args` when the provider asks for a tool call.
- `model.completed` with usage and stop reason.
- `model.failed` for provider errors.
The runtime should preserve raw provider refs for debugging but expose normalized facts to portable consumers.
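The mapping above can be sketched as a generator over provider chunks. The chunk shape (`kind`, `text`, `name`, `args`, `usage`, `stop_reason`, `message`) is a made-up provider format for illustration and does not correspond to any real provider API; `normalize_provider_stream` is a hypothetical helper.

```python
def normalize_provider_stream(chunks):
    """Map hypothetical provider-native chunks to normalized runtime events."""
    yield {"type": "model.requested"}
    for chunk in chunks:
        kind = chunk["kind"]
        if kind == "text":
            yield {"type": "model.delta", "payload": {"text": chunk["text"]}}
        elif kind == "thinking":
            yield {"type": "reasoning.delta", "payload": {"text": chunk["text"]}}
        elif kind == "tool_call":
            yield {"type": "tool.started", "payload": {"toolName": chunk["name"]}}
            yield {"type": "tool.args", "payload": {"args": chunk["args"]}}
        elif kind == "done":
            yield {
                "type": "model.completed",
                "payload": {"usage": chunk["usage"], "stopReason": chunk["stop_reason"]},
            }
        elif kind == "error":
            yield {"type": "model.failed", "payload": {"message": chunk["message"]}}
            return  # nothing is emitted after a provider failure


chunks = [
    {"kind": "text", "text": "Reading the file."},
    {"kind": "tool_call", "name": "read_file", "args": {"path": "README.md"}},
    {"kind": "done", "usage": {"output_tokens": 12}, "stop_reason": "tool_use"},
]
print([event["type"] for event in normalize_provider_stream(chunks)])
```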
## Recovery
A client that reconnects SHOULD request a snapshot, then resume from the last seen event sequence if supported. If the runtime cannot provide replay from a sequence, it should emit `snapshot.updated` and mark the stream's recovery mode.
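A minimal sketch of the reconnect flow, assuming the client is given a snapshot getter and, when the runtime supports it, a replay function that returns events after a sequence. The `recover_stream` helper, its parameters, and the `recovery_mode` values are hypothetical.

```python
def recover_stream(get_snapshot, replay_from=None, last_seen_sequence=0):
    """Fetch a snapshot, then replay missed events when the runtime supports it."""
    snapshot = get_snapshot()
    if replay_from is not None:
        missed = [
            event
            for event in replay_from(last_seen_sequence)
            if event["sequence"] > last_seen_sequence
        ]
        return {"recovery_mode": "replay", "snapshot": snapshot, "events": missed}
    # No replay available: surface the snapshot as the only source of repair.
    return {
        "recovery_mode": "snapshot_only",
        "snapshot": snapshot,
        "events": [{"type": "snapshot.updated", "payload": snapshot}],
    }


log = [{"type": "tool.started", "sequence": 41}, {"type": "tool.result", "sequence": 42}]
result = recover_stream(
    get_snapshot=lambda: {"status": "running"},
    replay_from=lambda seq: log,
    last_seen_sequence=41,
)
print(result["recovery_mode"], [e["sequence"] for e in result["events"]])  # replay [42]
```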
# Control plane
Source: https://limecloud.github.io/agentruntime/en/contracts/control-plane
The control plane is the write boundary for runtime state. It may be implemented as HTTP, JSON-RPC, local commands, a worker API, or in-process calls. Names can vary, but semantics should remain stable.
## Commands
| Command | Semantics |
| --- | --- |
| `submit_turn` | Accept user or system input, create or select session/thread, and start or queue a turn. |
| `interrupt_turn` | Request cancellation of current work and clear or preserve queued work according to policy. |
| `resume_thread` | Continue a thread after restart, queue pause, provider continuation, or blocked state. |
| `create_task` / `update_task` | Create or update an agent task with objective, profile, constraints, owner, acceptance, and idempotency. |
| `start_task` / `append_task_progress` | Start a task run or append phase, counters, progress summaries, delivery state, or output refs. |
| `pause_task` / `resume_task` / `cancel_task` / `retry_task` | Mutate task execution state while preserving attempts and graph edges. |
| `complete_task` / `fail_task` | Reconcile terminal task state with artifacts, evidence, delivery, and errors. |
| `list_tasks` / `get_task` / `link_tasks` / `unlink_tasks` | Read task state and update parent, dependency, source, artifact, evidence, or subagent edges. |
| `respond_action` | Resolve a pending human or policy request. |
| `remove_queued_turn` | Remove a queued turn by id. |
| `promote_queued_turn` | Move a queued turn ahead according to policy. |
| `get_session` | Return shell, recent history, thread summaries, and cursor metadata. |
| `get_thread_read` | Return current thread status, pending requests, last outcome, incidents, diagnostics, and queue state. |
| `get_tool_inventory` | Return tools available under current scope and policy. |
| `spawn_subagent` | Create a child runtime context with parent links and isolation rules. |
| `send_subagent_input` | Send structured or text input to a child context. |
| `wait_subagents` | Wait for one or more child contexts. |
| `close_subagent` | Ask a child context to stop and release resources. |
| `export_evidence` | Export evidence pack from runtime facts. |
| `export_replay` | Export replay case from the same facts. |
| `evaluate_permission` / `resolve_permission` | Let host policy, hooks, or approval systems participate in permission decisions. |
| `get_execution_environment` | Return cwd, workspace roots, sandbox, network, and process limits. |
| `write_process_stdin` / `terminate_process` | Interact with a long-running process or PTY session. |
| `list_subagents` | Return parent-child graph and child thread state. |
| `create_job` / `get_job` / `cancel_job` | Manage durable background or batch work. |
| `reconnect_channel` / `ack_events` | Recover remote channels and acknowledge events. |
| `export_review` | Export review template or audit refs from the same facts. |
| `start_benchmark_trial` | Bind dataset, task, configuration, Harbor job/trial, sandbox, and timeout to a runtime run. |
| `record_benchmark_reward` | Attach reward, reward details, verifier status, failure category, and criterion summary to a trial. |
| `export_benchmark_trial` | Export trajectory, runtime transcript, artifacts, reward refs, artifact manifest, and Agent QC refs. |
| `compare_benchmark_runs` | Record baseline/candidate deltas, cost, evidence completeness, and promotion or revert decision. |
## Idempotency
Mutating commands SHOULD accept an idempotency key or caller-provided ids. Retrying `submit_turn` must not create duplicate turns when the caller provides a stable `turn_id`.
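One way to honor this rule is to key stored turns by the caller-provided `turn_id`. The sketch below is illustrative only: `TurnStore`, its in-memory dictionary, and the returned `(turn, created)` pair are assumptions, not part of the contract.

```python
from typing import Dict, Tuple


class TurnStore:
    """In-memory stand-in for runtime turn storage (hypothetical)."""

    def __init__(self) -> None:
        self._turns: Dict[str, Dict] = {}

    def submit_turn(self, turn_id: str, payload: str) -> Tuple[Dict, bool]:
        """Create the turn once; a retry with the same turn_id returns the original."""
        existing = self._turns.get(turn_id)
        if existing is not None:
            return existing, False
        turn = {"turn_id": turn_id, "input": payload, "status": "queued"}
        self._turns[turn_id] = turn
        return turn, True

    def count(self) -> int:
        return len(self._turns)
```

A retried call returns the stored turn unchanged, so a client can safely resend after a timeout without checking whether the first request landed.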
## Action requests
An `action.required` event pauses execution when a decision is needed. It must include:
- stable `action_id`
- `action_type`
- scope ids
- prompt or structured schema
- available decisions
- policy and timeout metadata when applicable
The runtime may continue unrelated tasks, but it must not treat an unresolved action as approved.
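The "unresolved is not approved" rule can be made explicit in code. This is a sketch under assumptions: the `ActionRequest` class, the `respond_action` helper signature, and the decision value `"approve"` are hypothetical; the field names follow the list above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ActionRequest:
    action_id: str
    action_type: str
    scope_ids: Dict[str, str]
    prompt: str
    available_decisions: Tuple[str, ...]
    decision: Optional[str] = None  # None means still unresolved


def respond_action(action: ActionRequest, decision: str) -> None:
    """Resolve a pending action with one of its declared decisions."""
    if action.decision is not None:
        raise ValueError("action already resolved")
    if decision not in action.available_decisions:
        raise ValueError(f"unsupported decision: {decision}")
    action.decision = decision


def is_approved(action: ActionRequest) -> bool:
    """Only an explicit approval counts; a pending action is never approved."""
    return action.decision == "approve"
```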
## Queue and resume
Queue state is runtime-owned. A busy thread can accept new input as queued work only if policy allows it. Queue snapshots must survive restart. Resume should be explicit when the runtime cannot prove that background work is already active.
## Tasks
Task commands MUST write runtime facts. A task retry should create a new run or attempt instead of overwriting the previous attempt. Cancellation should record intent first, then propagate to active child tasks, jobs, processes, or subagents according to policy.
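A minimal sketch of both rules, assuming tasks are plain dictionaries with `attempts` and `children` lists. The run id format and the `task.cancel_requested` fact name are hypothetical.

```python
from typing import Dict, List

TERMINAL = {"cancelled", "timed_out", "failed", "completed", "archived"}


def retry_task(task: Dict) -> str:
    """Append a new attempt; earlier attempts stay untouched."""
    run_id = f"{task['task_id']}-run-{len(task['attempts']) + 1}"
    task["attempts"].append({"run_id": run_id, "status": "running"})
    task["current_run_id"] = run_id
    task["status"] = "retrying"
    return run_id


def cancel_task(task: Dict, facts: List[Dict]) -> None:
    """Record cancellation intent first, then propagate to active children."""
    facts.append({"type": "task.cancel_requested", "task_id": task["task_id"]})
    task["status"] = "cancelling"
    for child in task.get("children", []):
        if child["status"] not in TERMINAL:
            cancel_task(child, facts)
```

Writing the intent fact before touching any child means a crash mid-propagation still leaves an auditable record of what was requested.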
## Process and channel control
`write_process_stdin` and `terminate_process` MUST target a runtime-known `process_id`. If the process cannot be recovered, return `unavailable` and emit a repair or warning event.
`reconnect_channel` SHOULD receive channel id, last acknowledged sequence, and resume token. The runtime should return a snapshot first, then replay events where possible.
## Jobs
Job control SHOULD distinguish job status from job item status. Cancelling a job does not cancel completed items; retrying an item must not create duplicate output.
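The distinction between job status and item status might look like the following. The dictionary layout, the item status names, and the `run` callback are assumptions for illustration.

```python
from typing import Callable, Dict, Optional

TERMINAL_ITEM_STATUSES = {"completed", "failed", "cancelled"}


def cancel_job(job: Dict) -> None:
    """Cancel the job and its unfinished items; finished items keep their status."""
    job["status"] = "cancelled"
    for item in job["items"]:
        if item["status"] not in TERMINAL_ITEM_STATUSES:
            item["status"] = "cancelled"


def retry_item(job: Dict, item_id: str, run: Callable[[str], str]) -> Optional[str]:
    """Re-run an item unless it already completed, so output is never duplicated."""
    item = next(i for i in job["items"] if i["item_id"] == item_id)
    if item["status"] == "completed":
        return item["output_ref"]
    item["output_ref"] = run(item_id)
    item["status"] = "completed"
    return item["output_ref"]
```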
## Benchmark control
Benchmark commands do not replace `export_evidence`. They give hill-climbing runs stable ids and keep baseline/candidate comparisons honest. If an implementation omits these explicit commands, equivalent data must be exported through `export_evidence` or `export_replay`.
# Tool and context
Source: https://limecloud.github.io/agentruntime/en/contracts/tool-context-capabilities
Agent Runtime coordinates tools and context without owning every external capability.
## Tool inventory
A `tool.catalog.resolved` event or inventory response SHOULD include:
| Field | Meaning |
| --- | --- |
| `tool_name` | Stable tool name for the turn. |
| `description` | Safe user-facing summary. |
| `input_schema` | JSON Schema or equivalent schema ref. |
| `capabilities` | Read, write, network, browser, filesystem, shell, artifact, or custom flags. |
| `policy` | Allowed, ask, denied, sandboxed, or unavailable. |
| `runtime_owner` | Local, MCP server, hosted connector, provider-native, workflow, or subagent. |
| `metadata_ref` | Optional ref for large or private metadata. |
## Tool invocation
Tool lifecycle events should preserve:
- `tool_call_id`
- safe arguments or argument ref
- policy decision and approval links
- progress and partial output
- result ref, preview, images, artifacts, or evidence refs
- error category, retryability, and recovery advice
- concurrency safety, read-only/destructive flags, and interrupt behavior
- pre/post hook outcomes and permission/sandbox refs
Tool results should not be flattened into final assistant text.
## Context assembly
A runtime SHOULD emit `context.resolved` when it selects important context:
- memory refs
- knowledge/source refs
- workspace or file refs
- browser/session refs
- policy facts
- project or system instruction refs
- context omissions or missing facts
## Compaction
Context compaction is a runtime boundary. It should emit start/completed/failed events with trigger, summary preview, affected turns, and downstream continuation refs. Compaction must not erase unresolved action requests, active incidents, or evidence links.
## Model routing
Model routing and fallback should be visible as runtime facts:
- requested provider and model
- capability requirements
- candidate set
- selected candidate
- fallback chain
- budget and rate-limit state
- decision reason
This lets UI, replay, and review explain why a runtime behaved the way it did.
## Concurrency and interrupt
Tool inventory SHOULD mark `is_read_only`, `is_concurrency_safe`, `is_destructive`, and `interrupt_behavior`.
The runtime MAY run consecutive read-only tools concurrently, but SHOULD serialize write or destructive tools. Cancelled tools should emit explicit cancelled or interrupted facts instead of silently dropping results.
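A scheduler could apply this rule by grouping consecutive parallel-safe calls into batches. The sketch below assumes a call is parallel-safe only when both `is_read_only` and `is_concurrency_safe` are true; that conjunction and the batch shape are choices made for the example, not requirements of the contract.

```python
from typing import Dict, List


def plan_batches(calls: List[Dict]) -> List[Dict]:
    """Group consecutive parallel-safe calls; every other call runs alone, in order."""
    batches: List[Dict] = []
    for call in calls:
        parallel_ok = bool(call.get("is_read_only") and call.get("is_concurrency_safe"))
        if parallel_ok and batches and batches[-1]["parallel"]:
            # Extend the current run of read-only calls.
            batches[-1]["calls"].append(call["tool_call_id"])
        else:
            # A write or destructive call, or the first call of a new read-only run.
            batches.append({"parallel": parallel_ok, "calls": [call["tool_call_id"]]})
    return batches
```

Because only adjacent calls are grouped, a write between two reads still acts as an ordering barrier.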
# State snapshots
Source: https://limecloud.github.io/agentruntime/en/contracts/state-snapshots
Snapshots let consumers recover without replaying every event from the beginning. They are also the bridge for old sessions, inactive tabs, remote clients, and evidence exporters.
## Session snapshot
A session snapshot SHOULD include:
- session id, title, timestamps, workspace or account scope
- active thread ids and pinned thread ids
- recent history window with cursor
- thread summaries
- task summaries and active task graph refs
- child subagent summaries
- latest evidence and artifact refs
- stale or truncated flags
## Thread read model
A thread read model SHOULD include:
- `thread_id`
- `status`
- `active_turn_id`
- pending action requests
- last outcome
- active incidents
- queued turns
- active, waiting, blocked, failed, lost, and recent terminal tasks
- latest compaction boundary
- diagnostics summary
- tool/artifact/evidence summaries
- model routing and limit state
- permission state, sandbox profile, and pending approvals
- active processes, output refs, and execution environment
- task graph, subagent graph, job progress, and remote channel state
- telemetry correlation summary
This read model is the current answer to “what is happening now?” Consumers should read it from the runtime rather than reconstructing it on their own from UI state.
## Diagnostics
Diagnostics can include warnings, failed tools, failed commands, pending requests, stalled turns, stalled tasks, lost workers, interrupt state, quota blocks, and context gaps. A missing diagnostic is not the same as a healthy state; mark unsupported facts as `unavailable`.
## History windows
Old sessions should load progressively:
1. session shell and thread summaries
2. recent history window
3. active thread read model
4. queue and pending requests
5. tool, artifact, evidence, and older history on demand
Bounded history and cursors are part of the runtime contract because they define whether clients can restore long-running work safely.
## Snapshot honesty
Snapshots SHOULD prefer explicit status over optimistic inference:
- `unknown`: runtime lacks enough facts.
- `unavailable`: implementation or environment does not support the fact.
- `not_applicable`: the signal does not apply to this thread.
- `stale`: facts may not be current.
- `blocked`: permission, credential, quota, network, context, or human action is required.
Evidence, review, and UI consumers should use these statuses instead of filling in success defaults.
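A reader-side helper can enforce this by never returning a bare value. The sketch is illustrative: the fact layout (`value`, `observed_at_s`), the freshness threshold, and the `known` status used for a fresh fact are assumptions; `unknown`, `unavailable`, and `stale` come from the list above.

```python
from typing import Dict, Set


def read_fact(facts: Dict, name: str, supported: Set[str], max_age_s: float, now_s: float) -> Dict:
    """Return a fact with an explicit status instead of an optimistic default."""
    if name not in supported:
        return {"status": "unavailable"}  # implementation cannot produce this fact
    fact = facts.get(name)
    if fact is None:
        return {"status": "unknown"}      # supported, but nothing recorded yet
    if now_s - fact["observed_at_s"] > max_age_s:
        return {"status": "stale", "value": fact["value"]}
    return {"status": "known", "value": fact["value"]}  # "known" is a hypothetical label
```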
# Evidence and replay
Source: https://limecloud.github.io/agentruntime/en/contracts/evidence-replay
Agent Runtime should make work auditable. Evidence and replay exports must be derived from the same runtime facts that drive UI and diagnostics.
## Evidence pack
An evidence pack SHOULD include:
- runtime summary
- event or timeline summary
- thread read model
- tool calls and failed calls
- artifacts and artifact refs
- context refs and compaction boundaries
- provider routing, permission, sandbox, hook, process, and limit events
- verification outcomes when available
- review or audit notes when available
Evidence systems own verdicts. Runtime owns export scope, correlation ids, and references.
## Replay case
A replay case SHOULD include:
- input payload and selected options
- context and tool inventory refs
- expected state or behavior assertions
- evidence links
- grader or review instructions when available
Replay must not depend on UI screenshots or final prose alone.
## Benchmark export
When evidence is exported for Agent QC benchmark or hill climbing, the pack SHOULD also include:
- dataset id, dataset version, task id, trial id, and configuration id;
- Harbor job/trial ref, task.toml ref, and artifact manifest ref;
- baseline/candidate role and the single changed variable when known;
- trajectory ref, runtime transcript ref, reward ref, and reward details ref;
- aggregate comparison refs when the export belongs to a candidate decision;
- P0 Agent QC report ref used to detect regressions;
- timeout, cost, token, cache, and cleanup metrics when available.
Benchmark export is not a verdict. Agent QC owns promotion, revert, blocked, or needs-review decisions.
## Observability
A compatible runtime SHOULD map execution into trace concepts:
- session/thread/turn as trace attributes
- model call, tool call, context retrieval, artifact write, and export as spans
- warnings and decisions as span events or logs
- token usage, latency, retries, queue wait, and tool duration as metrics
Trace ids should appear in runtime events and evidence exports when available.
## Signal applicability
Evidence SHOULD distinguish `exported`, `not_applicable`, `unsupported`, and `missing_correlation`.
`known_gaps` should only describe signals that apply to the current scope but were not exported. Do not turn every future capability into a gap for every thread.
# Permission and sandbox
Source: https://limecloud.github.io/agentruntime/en/contracts/permission-and-sandbox
Permission is not a dialog. Permission is an auditable runtime decision across tools, processes, network, filesystem, host policy, and human approval.
A compatible runtime SHOULD expose three facts for every constrained action:
1. `permission_state`: current mode, rules, policy sources, and whether user interaction is allowed.
2. `permission_decision`: why this action is allowed, denied, asked, unavailable, or sandboxed.
3. `sandbox_profile`: the actual filesystem, network, environment, process, and platform boundary used at execution time.
## Permission modes
The standard does not require exact names, but implementations SHOULD express these semantics: `default`, `untrusted`, `on_request`, `on_failure`, `never`, `plan`, `bypass`, and `auto`.
Every mode must preserve destructive-action facts. A destructive action SHOULD carry `destructive=true`, impact scope, rollback advice, and decision source.
## Permission decision
A permission decision SHOULD include `decision`, `decision_source`, `decision_reason`, `rule_refs`, `updated_input_ref`, `approval_action_id`, `expires_at`, and `scope`.
A hook-provided allow is a decision source, not a universal override. Explicit deny or ask rules SHOULD still apply.
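One way to keep a hook allow from overriding rules is to pick the most restrictive decision across all sources. The precedence order (deny over ask over allow) and the `ask` default when nothing matched are assumptions of this sketch; the text above only requires that explicit deny or ask rules still apply.

```python
from typing import Dict, List, Optional

# Lower number wins: the most restrictive decision takes precedence (assumed order).
PRECEDENCE = {"deny": 0, "ask": 1, "allow": 2}


def evaluate_permission(rule_decisions: List[str], hook_decision: Optional[str] = None) -> Dict:
    """Combine rule and hook decisions without letting a hook allow override rules."""
    candidates = [(decision, "rule") for decision in rule_decisions]
    if hook_decision is not None:
        candidates.append((hook_decision, "hook"))
    if not candidates:
        # No rule or hook matched; asking is an assumed safe default.
        return {"decision": "ask", "decision_source": "default"}
    decision, source = min(candidates, key=lambda c: PRECEDENCE[c[0]])
    return {"decision": decision, "decision_source": source}
```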
## Sandbox profile
A sandbox profile SHOULD include `mode`, `cwd`, `read_roots`, `write_roots`, `network`, `environment_ref`, `process_limits`, and `violation_refs`.
The runtime MUST NOT show a sandboxed state in a client while omitting the actual sandbox facts from tool, process, and evidence records.
## Event classes
| Event | When |
| --- | --- |
| `permission.evaluated` | Rules, mode, hooks, host policy, or classifiers produced a decision. |
| `permission.requested` | Runtime needs a human or host decision. |
| `permission.resolved` | The action was allowed, denied, timed out, or cancelled. |
| `sandbox.applied` | A tool or process received its execution sandbox. |
| `sandbox.violation` | Filesystem, network, or permission boundary was triggered. |
## Anti-patterns
- Inferring approval from final prose.
- Recording only the dialog result, not the rule and sandbox facts.
- Waiting for user input in a non-interactive mode.
- Reporting sandbox denial as a generic tool failure.
# Hooks and policy
Source: https://limecloud.github.io/agentruntime/en/contracts/hooks-and-policy
Hooks are runtime governance points. They can add context, block input, modify tool arguments, participate in permission decisions, emit audit facts, or run stop checks.
Hooks must not become a second execution path. Hook outcomes must be written back as runtime facts consumed by events, snapshots, evidence, and review.
## Hook points
Compatible runtimes SHOULD support equivalent semantics for `session_start`, `user_prompt_submit`, `pre_tool_use`, `permission_request`, `post_tool_use`, `post_tool_failure`, `post_sampling`, and `stop`.
## Hook input
Hook input SHOULD contain stable scoped identifiers, cwd, workspace scope, permission mode, sandbox summary, safe tool input, transcript refs, routing refs, and policy refs. It MUST NOT include secrets, raw private files, unfiltered environment variables, or unaudited client state by default.
## Hook output
Hook output SHOULD normalize to `continue`, `block`, `allow`, `deny`, `ask`, `updated_input_ref`, `additional_context_refs`, `updated_tool_output_ref`, `suppress_output`, and `audit_refs`.
## Events
| Event | Payload |
| --- | --- |
| `hook.started` | Hook point, handler ref, scope ids, timeout, preview summary. |
| `hook.completed` | Status, duration, decision, added context refs, updated refs. |
| `hook.failed` | Error category, retryability, fail-open or fail-closed behavior. |
| `policy.changed` | Host, workspace, or session policy changed. |
High-risk hooks SHOULD fail closed. Read-only context hooks MAY fail open, but must emit a warning.
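The fail-closed and fail-open paths could be sketched as a wrapper around a hook handler. The `run_hook` signature, the `behavior` field, and the `warning` event name are hypothetical; the `hook.*` event names and the `block` and `continue` outcomes come from the sections above.

```python
from typing import Callable, Dict, List, Tuple


def run_hook(handler: Callable[[Dict], str], hook_input: Dict, high_risk: bool) -> Tuple[str, List[Dict]]:
    """Run a hook and turn failures into explicit fail-closed or fail-open facts."""
    events: List[Dict] = [{"type": "hook.started"}]
    try:
        decision = handler(hook_input)
    except Exception as exc:
        behavior = "fail_closed" if high_risk else "fail_open"
        events.append({"type": "hook.failed", "error": str(exc), "behavior": behavior})
        if high_risk:
            return "block", events
        # Failing open is allowed for low-risk hooks, but never silently.
        events.append({"type": "warning", "reason": "hook failed open"})
        return "continue", events
    events.append({"type": "hook.completed", "decision": decision})
    return decision, events
```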
# Execution environment
Source: https://limecloud.github.io/agentruntime/en/contracts/execution-environment
Real runtimes execute local commands, remote commands, PTY sessions, browser actions, and long-running background processes. The standard does not mandate a specific runner, but execution environment facts must be observable, recoverable, and auditable.
## Environment snapshot
Each turn SHOULD form an `execution_environment` snapshot with `workspace_id`, `cwd`, `additional_roots`, `shell`, `os`, `arch`, `env_ref`, `sandbox_profile`, `network_profile`, and `resource_limits`.
## Process lifecycle
Commands and long-running processes SHOULD use their own lifecycle:
| Event | When |
| --- | --- |
| `process.started` | Command, cwd, sandbox, TTY, and process id are known. |
| `process.output` | stdout, stderr, or binary chunk emitted. |
| `process.input` | Runtime writes to stdin. |
| `process.completed` | Exit code, duration, status, and output refs are known. |
| `process.failed` | Spawn, sandbox, permission, timeout, or transport failure. |
| `process.terminated` | User, policy, parent cancellation, or shutdown ended the process. |
Large output SHOULD be stored by reference.
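Storing large output by reference might look like this. The inline size limit, the content-addressed `sha256:` ref format, and the in-memory `store` are assumptions for illustration.

```python
import hashlib
from typing import Dict

INLINE_LIMIT = 1024  # bytes; hypothetical threshold for inlining output


def process_output_event(process_id: str, stream: str, data: bytes, store: Dict[str, bytes]) -> Dict:
    """Build a process.output event, moving large chunks behind a ref."""
    event = {"type": "process.output", "process_id": process_id, "stream": stream}
    if len(data) <= INLINE_LIMIT:
        event["data"] = data.decode("utf-8", errors="replace")
        return event
    # Content-addressed ref: identical chunks share one stored copy.
    ref = "sha256:" + hashlib.sha256(data).hexdigest()
    store[ref] = data
    event["output_ref"] = ref
    event["size"] = len(data)
    return event
```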
## Command policy
Before execution, the runtime SHOULD record a safe command summary, parsed subcommands, destructive flag, permission decision, sandbox profile, cwd, path validation, network decision, and justification.
## Recovery
After restart, the runtime SHOULD report which processes may still exist, which outputs are durable, which refs are unrecoverable, and which actions remain unresolved.
# Model routing and limits
Source: https://limecloud.github.io/agentruntime/en/contracts/model-routing-limits
Model selection is not a client dropdown. It is an explainable runtime decision across task requirements, candidate capabilities, user locks, host policy, cost, limits, and provider state.
Compatible runtimes SHOULD expose `task_profile`, `candidate_model_set`, and `routing_decision`.
## Task profile
A task profile SHOULD include `kind`, `source`, `required_capabilities`, `latency_target`, `budget_class`, `fallback_policy`, and `settings_source`.
## Candidate model set
A candidate set is not the full registry. It describes the candidates actually available for this turn, including candidate count, capability and cost metadata, excluded candidates with reasons, hard constraints, preferences, single-candidate status, credentials, quota, and network gaps.
## Routing decision
A routing decision SHOULD include `routing_mode`, `decision_source`, `decision_reason`, selected and requested provider/model, `candidate_count`, `fallback_chain`, `capability_gap`, `requires_user_override`, and `limit_state_snapshot`.
A single candidate is a first-class path, not a failure. It means the runtime has no room to optimize, but it still owes capability, cost, and limit facts.
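A routing function can treat the single-candidate case as its own outcome rather than an error. In this sketch the candidate shape (`model`, `cost`, `excluded_reason`) and the cheapest-candidate fallback are assumptions; the event names are taken from the events listed for this contract.

```python
from typing import Dict, List


def decide_route(requested_model: str, candidates: List[Dict]) -> Dict:
    """Pick a model and name the routing event that explains the choice."""
    usable = [c for c in candidates if c.get("excluded_reason") is None]
    base = {"requested_model": requested_model, "candidate_count": len(usable)}
    if not usable:
        return {**base, "event": "routing.not_possible", "selected_model": None}
    if len(usable) == 1:
        # Nothing to optimize, but the decision is still recorded as a fact.
        return {**base, "event": "routing.single_candidate", "selected_model": usable[0]["model"]}
    for candidate in usable:
        if candidate["model"] == requested_model:
            return {**base, "event": "routing.decided", "selected_model": requested_model}
    # Requested model is not usable: fall back to the cheapest usable one (assumed policy).
    cheapest = min(usable, key=lambda c: c["cost"])
    return {**base, "event": "routing.fallback.applied", "selected_model": cheapest["model"]}
```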
## Events
`task.profile.resolved`, `routing.candidates.resolved`, `routing.decided`, `routing.fallback.applied`, `routing.not_possible`, `routing.single_candidate`, `cost.estimated`, `cost.recorded`, `rate_limit.hit`, `quota.low`, and `quota.blocked` SHOULD flow into read models, evidence, and telemetry.
# Agent task
Source: https://limecloud.github.io/agentruntime/en/contracts/agent-task
An `agent task` is the runtime-owned unit of work that gives an agent an objective, scope, lifecycle, progress, relationships, and delivery semantics.
A task is not a checklist item, not a chat message, not a model request, and not only a background job. It is the durable execution object that can span turns, start runs, spawn subagents, wait for input, produce artifacts, and be audited after recovery.
## Design pressure
Real runtimes show the same task pressure in different forms:
- Terminal agents keep foreground work, local shell work, remote agent work, scheduled work, and backgrounded work under one task surface.
- Gateway and scheduler agents need durable jobs, delivery state, per-run output, checkpoint/resume, missed-run handling, and failure notification.
- Typed coding runtimes need thread goals, todo lists, plan items, turn status, job items, approval state, and spawn edges to join through stable ids.
- Desktop runtimes need task profiles, automation jobs, subagent turns, execution summaries, evidence export, and UI projection to read from the same fact chain.
- Durable workflow systems show why workflow id, run id, task queue, child work, signals, cancellation, retries, and history cannot be left as prose.
The portable contract therefore needs a task model above individual tool calls and below host-product workflows.
## Task is not job, run, step, or todo
| Object | Meaning | Runtime rule |
| --- | --- | --- |
| `task` | Semantic objective with lifecycle, owner, relationships, constraints, and acceptance. | Stable across retries, turns, backgrounding, and recovery. |
| `run` | One execution attempt for a task. | New run for retry, resume-after-crash, or alternate worker execution. |
| `step` | Ordered item inside a run or turn. | Model, reasoning, tool, process, approval, artifact, warning, or evidence item. |
| `job` | Durable batch or scheduled dispatcher. | May create or own tasks, but should not replace task semantics. |
| `todo` / `plan item` | Agent-visible checklist. | Useful progress hint, not authoritative lifecycle. |
A compatible runtime SHOULD expose all five concepts when they exist instead of flattening them into one `message` or one `task` string.
## Task record
A task SHOULD include these fields:
| Field | Purpose |
| --- | --- |
| `task_id` | Stable task id. |
| `parent_task_id` / `root_task_id` | Task graph linkage. |
| `session_id` / `thread_id` / `turn_id` | Conversation or execution context linkage when applicable. |
| `title` / `objective` | Human-readable work statement. |
| `task_kind` / `task_family` | Portable classification and coarse grouping. |
| `visibility` | `foreground`, `background`, `internal`, or `hidden`. |
| `status` | Normalized lifecycle status. |
| `priority` | Scheduling hint, not a completion guarantee. |
| `requested_by` / `owner` / `assignee` | User, agent, workflow, channel, or worker attribution. |
| `scope` | Workspace, project, thread, account, environment, or host boundary refs. |
| `constraints` | Permission, sandbox, network, model, tool, cost, time, and output constraints. |
| `task_profile` | Capability, latency, budget, fallback, and continuity profile. |
| `acceptance` | Completion criteria or refs. |
| `progress` | Percent, phase, current step, summary, counters, or output refs. |
| `current_run_id` | Active run, if any. |
| `attempts` | Prior and active runs. |
| `relationships` | Dependencies, blocks, child tasks, source tasks, spawned agents, artifacts. |
| `artifacts` / `evidence_refs` | Produced or consumed refs. |
| `last_error` / `status_reason` | Structured failure, block, or wait explanation. |
| `created_at` / `updated_at` / `started_at` / `ended_at` | Lifecycle timestamps. |
## Status model
A portable runtime SHOULD support these normalized task statuses:
| Status | Meaning |
| --- | --- |
| `draft` | Defined but not yet accepted by runtime. |
| `accepted` | Runtime accepted the task and assigned identity. |
| `queued` | Waiting for scheduler, queue, dependency, or worker capacity. |
| `preparing` | Resolving context, tools, model, policy, or environment. |
| `running` | Active execution is making progress. |
| `waiting_input` | Waiting for user or external structured input. |
| `waiting_permission` | Waiting for human, policy, or host approval. |
| `waiting_resource` | Waiting for credential, quota, file, network, worker, or external system. |
| `blocked` | Cannot proceed until a named blocker is resolved. |
| `paused` | Intentionally paused and resumable. |
| `retrying` | A retry or fallback run is being prepared or active. |
| `cancelling` | Cancellation requested; cleanup is in progress. |
| `cancelled` | Stopped by user, policy, or runtime. |
| `timed_out` | Stopped because a time or inactivity limit fired. |
| `failed` | Terminal failure with error facts. |
| `lost` | Runtime cannot prove whether the worker is still alive. |
| `completed` | Acceptance criteria are satisfied and facts are reconciled. |
| `archived` | No longer active in scheduling, but retained for history. |
| `stale` | Snapshot may not reflect current execution. |
| `unknown` | Runtime lacks enough facts to assert a status. |
Implementation-native states MAY be preserved in `native_status`, but portable consumers SHOULD receive the normalized status.
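A small mapping layer is enough to satisfy this. The native status names below are invented for the example; unmapped native values fall back to the portable `unknown` status rather than being guessed.

```python
from typing import Dict

# Hypothetical native states mapped onto the portable statuses from the table above.
NATIVE_TO_PORTABLE = {
    "pending": "queued",
    "in_progress": "running",
    "awaiting_approval": "waiting_permission",
    "done": "completed",
    "error": "failed",
}


def normalize_status(native_status: str) -> Dict[str, str]:
    """Expose the portable status while preserving the implementation-native one."""
    return {
        "status": NATIVE_TO_PORTABLE.get(native_status, "unknown"),
        "native_status": native_status,
    }
```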
## Attempts and runs
A task SHOULD keep attempts rather than replacing history on retry.