This document describes the complete input contract used by AsyncFlow to run a simulation, the design rationale behind it, and the validation guarantees enforced by the Pydantic layer. At the end you’ll find an end-to-end YAML example you can run as-is.
The entry point is:

class SimulationPayload(BaseModel):
    """Full input structure to perform a simulation"""
    rqs_input: RqsGenerator
    topology_graph: TopologyGraph
    sim_settings: SimulationSettings

Everything the engine needs is captured by these three components:
- `rqs_input`: workload model (how traffic is generated).
- `topology_graph`: system under test as a directed graph (nodes & edges).
- `sim_settings`: global simulation controls and which metrics to collect.

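To make the three-part shape concrete, the payload can be pictured as a plain mapping before it is parsed. The sketch below is only an illustration: the key names and enum values come from the models and lists in this document, while every id and number is a made-up assumption rather than an AsyncFlow default.

```python
# Hypothetical minimal payload as a plain dict (the shape a YAML file would load into).
# All ids and numeric values are illustrative assumptions.
latency = {"mean": 0.003, "distribution": "exponential"}

payload = {
    "rqs_input": {
        "id": "rqs-1",
        "avg_active_users": {"mean": 100},
        "avg_request_per_minute_per_user": {"mean": 20},
        "user_sampling_window": 60,
    },
    "topology_graph": {
        "nodes": {
            "client": {"id": "client-1"},
            "servers": [
                {
                    "id": "srv-1",
                    "server_resources": {"cpu_cores": 1, "ram_mb": 1024},
                    "endpoints": [
                        {
                            "endpoint_name": "/predict",
                            "steps": [
                                {"kind": "initial_parsing",
                                 "step_operation": {"cpu_time": 0.001}},
                                {"kind": "io_wait",
                                 "step_operation": {"io_waiting_time": 0.1}},
                            ],
                        }
                    ],
                }
            ],
        },
        "edges": [
            # The generator id appears only as a source, never as a target.
            {"id": "gen-to-client", "source": "rqs-1", "target": "client-1",
             "latency": latency},
            {"id": "client-to-server", "source": "client-1", "target": "srv-1",
             "latency": latency},
            {"id": "server-to-client", "source": "srv-1", "target": "client-1",
             "latency": latency},
        ],
    },
    "sim_settings": {"total_simulation_time": 300, "sample_period_s": 0.01},
}

top_level_keys = sorted(payload)
```

Each top-level key can be swapped independently, for example replacing `rqs_input` with a heavier workload while keeping the same `topology_graph`.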
- Workload (traffic intensity & arrival process) is independent from topology (architecture) and simulation control (duration & metrics).
- You can reuse the same topology with different workloads (or vice versa) without touching unrelated parts.
- Inputs are typed and validated before the engine starts.
- Validation catches type errors, dangling references, illegal step definitions, and inconsistent graphs.
- Once a payload parses, the runtime code can remain lean (no defensive checks scattered everywhere).
- The smallest unit is a `Step` (one resource-bound operation).
- Steps compose into an `Endpoint` (ordered workflow).
- Endpoints live on a `Server` node with finite resources.
- Nodes and Edges form a `TopologyGraph`.
- A closed set of Enums eliminates magic strings.
Purpose: Defines the stochastic traffic generator that produces request arrivals.

class RqsGenerator(BaseModel):
    id: str
    type: SystemNodes = SystemNodes.GENERATOR
    avg_active_users: RVConfig
    avg_request_per_minute_per_user: RVConfig
    user_sampling_window: int = Field(
        default=TimeDefaults.USER_SAMPLING_WINDOW,
        ge=TimeDefaults.MIN_USER_SAMPLING_WINDOW,
        le=TimeDefaults.MAX_USER_SAMPLING_WINDOW,
    )

class RVConfig(BaseModel):
    mean: float
    distribution: Distribution = Distribution.POISSON
    variance: float | None = None

Validators & guarantees
- `mean` is numeric and coerced to `float`. (Non-numeric → `ValueError`.)
- If `distribution ∈ {NORMAL, LOG_NORMAL}` and `variance is None`, then `variance := mean`.
- Workload-specific constraints:
  - `avg_request_per_minute_per_user.distribution` must be `POISSON`.
  - `avg_active_users.distribution` must be `POISSON` or `NORMAL`.
- `user_sampling_window` is an integer in seconds, bounded to `[1, 120]`.

Why these constraints? They match the currently implemented samplers (Poisson–Poisson and Normal–Poisson).
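The rules above can be sketched in plain Python. This is not the Pydantic validator itself: `RVSketch` and `check_workload` are hypothetical stand-ins that use only the standard library, and rejecting `bool` as a mean is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RVSketch:
    """Plain-Python stand-in for RVConfig (the real model is a Pydantic BaseModel)."""
    mean: float
    distribution: str = "poisson"
    variance: Optional[float] = None

    def __post_init__(self) -> None:
        # Non-numeric means are rejected; excluding bool is an assumption of this sketch.
        if isinstance(self.mean, bool) or not isinstance(self.mean, (int, float)):
            raise ValueError("mean must be a number (int or float)")
        self.mean = float(self.mean)
        # NORMAL / LOG_NORMAL without an explicit variance fall back to variance := mean.
        if self.distribution in {"normal", "log_normal"} and self.variance is None:
            self.variance = self.mean


def check_workload(users: RVSketch, rpm: RVSketch, window: int) -> None:
    """Raise ValueError if a workload-specific constraint is violated."""
    if rpm.distribution != "poisson":
        raise ValueError("avg_request_per_minute_per_user must be Poisson")
    if users.distribution not in {"poisson", "normal"}:
        raise ValueError("avg_active_users must be Poisson or Normal")
    if not 1 <= window <= 120:
        raise ValueError("user_sampling_window must be in [1, 120] seconds")


users = RVSketch(mean=50, distribution="normal")    # variance is filled in as 50.0
check_workload(users, RVSketch(mean=20), window=60)  # passes silently
```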
Purpose: Describes the architecture as a directed graph. Nodes are macro-components (client, server, optional load balancer); edges are network links with latency models.

class TopologyGraph(BaseModel):
    nodes: TopologyNodes
    edges: list[Edge]

class TopologyNodes(BaseModel):
    servers: list[Server]
    client: Client
    load_balancer: LoadBalancer | None = None
    # also: model_config = ConfigDict(extra="forbid")

class Client(BaseModel):
    id: str
    type: SystemNodes = SystemNodes.CLIENT
    # validator: type must equal SystemNodes.CLIENT

class NodesResources(BaseModel):
    cpu_cores: PositiveInt = Field(NodesResourcesDefaults.CPU_CORES,
                                   ge=NodesResourcesDefaults.MINIMUM_CPU_CORES)
    db_connection_pool: PositiveInt | None = Field(NodesResourcesDefaults.DB_CONNECTION_POOL)
    ram_mb: PositiveInt = Field(NodesResourcesDefaults.RAM_MB,
                                ge=NodesResourcesDefaults.MINIMUM_RAM_MB)

Each attribute maps directly to a SimPy primitive (core tokens, RAM container, optional DB pool).


class Step(BaseModel):
    kind: EndpointStepIO | EndpointStepCPU | EndpointStepRAM
    step_operation: dict[StepOperation, PositiveFloat | PositiveInt]

Coherence validator
- `step_operation` must contain exactly one key.
- Valid pairings:
  - CPU step → `{ cpu_time: PositiveFloat }`
  - RAM step → `{ necessary_ram: PositiveInt | PositiveFloat }`
  - I/O step → `{ io_waiting_time: PositiveFloat }`
- Any mismatch (e.g., RAM step with `cpu_time`) → `ValueError`.

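The coherence rule above can be sketched as a small lookup from step kind to its single permitted operation key. The functions `expected_key` and `check_step` are hypothetical names for this illustration; the kind and key strings are the enum values listed in this document.

```python
# Enum values copied from the reference lists in this document.
CPU_KINDS = {"initial_parsing", "cpu_bound_operation"}
RAM_KINDS = {"ram"}
IO_KINDS = {"io_task_spawn", "io_llm", "io_wait", "io_db", "io_cache"}


def expected_key(kind: str) -> str:
    """Map a step kind to the only operation key it may carry."""
    if kind in CPU_KINDS:
        return "cpu_time"
    if kind in RAM_KINDS:
        return "necessary_ram"
    if kind in IO_KINDS:
        return "io_waiting_time"
    raise ValueError(f"unknown step kind: {kind!r}")


def check_step(kind: str, step_operation: dict) -> None:
    """Raise ValueError unless the step has exactly one key that matches its kind."""
    if len(step_operation) != 1:
        raise ValueError("step_operation must contain exactly one key")
    key, value = next(iter(step_operation.items()))
    if key != expected_key(kind):
        raise ValueError(f"{kind!r} step cannot use operation key {key!r}")
    if value <= 0:
        raise ValueError("step operation values must be strictly positive")


check_step("io_db", {"io_waiting_time": 0.02})  # valid pairing, passes silently
```

A RAM step carrying `cpu_time`, or any step carrying two keys, raises `ValueError` in this sketch.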

class Endpoint(BaseModel):
    endpoint_name: str
    steps: list[Step]

    @field_validator("endpoint_name", mode="before")
    @classmethod
    def name_to_lower(cls, v):
        return v.lower()

Canonical lowercase names avoid accidental duplicates by case.


class Server(BaseModel):
    id: str
    type: SystemNodes = SystemNodes.SERVER
    server_resources: NodesResources
    endpoints: list[Endpoint]
    # validator: type must equal SystemNodes.SERVER

class LoadBalancer(BaseModel):
    id: str
    type: SystemNodes = SystemNodes.LOAD_BALANCER
    algorithms: LbAlgorithmsName = LbAlgorithmsName.ROUND_ROBIN
    server_covered: set[str] = Field(default_factory=set)
    # validator: type must equal SystemNodes.LOAD_BALANCER

class Edge(BaseModel):
    id: str
    source: str  # may be an external entrypoint (e.g., generator id)
    target: str  # MUST be a declared node id
    latency: RVConfig
    edge_type: SystemEdges = SystemEdges.NETWORK_CONNECTION
    dropout_rate: float = Field(NetworkParameters.DROPOUT_RATE,
                                ge=NetworkParameters.MIN_DROPOUT_RATE,
                                le=NetworkParameters.MAX_DROPOUT_RATE)
    # validator: source != target
    # validator on latency: mean > 0, variance >= 0 if provided

Note: The former `probability` field has been removed. Fan-out is controlled at the load balancer via `algorithms` (e.g., round-robin, least-connections). Non-LB nodes are not allowed to have multiple outgoing edges (see graph-level validators below).

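As a rough picture of what round-robin fan-out means, the sketch below rotates through a hypothetical `server_covered` set. How AsyncFlow implements the algorithm internally is not shown in this document, so treat this purely as an illustration of the idea; the server ids are invented.

```python
from itertools import cycle

# Hypothetical covered servers. A set has no guaranteed iteration order,
# so the ids are sorted to make the rotation reproducible.
server_covered = {"srv-2", "srv-1", "srv-3"}
rotation = cycle(sorted(server_covered))

# Route five consecutive requests: each one goes to the next server in turn.
picks = [next(rotation) for _ in range(5)]
```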
The TopologyGraph class performs several global checks:
- Unique edge IDs
  - Duplicate edge ids → `ValueError`.
- Referential integrity
  - Every `target` must be a declared node (`client`, any `server`, optional `load_balancer`).
  - External IDs (e.g., generator id) are allowed only as sources and must never appear as a target anywhere.
- Load balancer integrity (if present)
  - `server_covered ⊆ declared server ids`.
  - There must be an outgoing edge from the LB to every covered server; missing links → `ValueError`.
- Fan-out restriction
  - Among declared nodes, only the load balancer may have multiple outgoing edges.
  - Edges originating from non-declared external sources (e.g., generator) are ignored by this check.
  - Violations list the offending source ids.
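The four global checks can be sketched over plain dictionaries. `check_graph` is a hypothetical helper written for this illustration, not the validator shipped with AsyncFlow; it assumes edges are dicts with `id`, `source`, and `target` keys and that the caller passes the declared node ids explicitly.

```python
from collections import Counter


def check_graph(node_ids, edges, lb_id=None, server_covered=frozenset(),
                server_ids=frozenset()) -> None:
    """Raise ValueError on the first graph-level rule that fails.

    node_ids: ids of all declared nodes (client, servers, optional LB).
    edges: list of dicts with "id", "source" and "target" keys.
    """
    # 1. Unique edge ids.
    dupes = sorted(eid for eid, n in Counter(e["id"] for e in edges).items() if n > 1)
    if dupes:
        raise ValueError(f"duplicate edge ids: {dupes}")

    # 2. Referential integrity: every target is a declared node.
    #    This also keeps external ids (e.g. the generator) out of target position.
    for e in edges:
        if e["target"] not in node_ids:
            raise ValueError(f"edge {e['id']!r} targets undeclared node {e['target']!r}")

    # 3. Load balancer integrity.
    if lb_id is not None:
        unknown = set(server_covered) - set(server_ids)
        if unknown:
            raise ValueError(f"server_covered has unknown servers: {sorted(unknown)}")
        linked = {e["target"] for e in edges if e["source"] == lb_id}
        missing = set(server_covered) - linked
        if missing:
            raise ValueError(f"no LB edge to covered servers: {sorted(missing)}")

    # 4. Fan-out: only the LB may have more than one outgoing edge.
    #    Sources that are not declared nodes (e.g. the generator) are skipped.
    out_degree = Counter(e["source"] for e in edges if e["source"] in node_ids)
    offenders = sorted(s for s, n in out_degree.items() if n > 1 and s != lb_id)
    if offenders:
        raise ValueError(f"multiple outgoing edges from non-LB nodes: {offenders}")


# Usage with invented ids: generator -> client -> LB -> two servers.
declared = {"client-1", "lb-1", "srv-1", "srv-2"}
servers = {"srv-1", "srv-2"}
edges = [
    {"id": "e1", "source": "rqs-1", "target": "client-1"},
    {"id": "e2", "source": "client-1", "target": "lb-1"},
    {"id": "e3", "source": "lb-1", "target": "srv-1"},
    {"id": "e4", "source": "lb-1", "target": "srv-2"},
]
check_graph(declared, edges, lb_id="lb-1", server_covered=servers, server_ids=servers)
```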

class SimulationSettings(BaseModel):
    total_simulation_time: int = Field(
        default=TimeDefaults.SIMULATION_TIME,
        ge=TimeDefaults.MIN_SIMULATION_TIME,
    )
    enabled_sample_metrics: set[SampledMetricName] = Field(default_factory=...)
    enabled_event_metrics: set[EventMetricName] = Field(default_factory=...)
    sample_period_s: float = Field(
        default=SamplePeriods.STANDARD_TIME,
        ge=SamplePeriods.MINIMUM_TIME,
        le=SamplePeriods.MAXIMUM_TIME,
    )

What it controls
- Clock: `total_simulation_time` in seconds (default 3600, min 5).
- Sampling cadence: `sample_period_s` in seconds (default 0.01; bounds `[0.001, 0.1]`).
- Metric selection: default sets include:
  - Sampled (time-series): `ready_queue_len`, `event_loop_io_sleep`, `ram_in_use`, `edge_concurrent_connection`.
  - Event (per-request): `rqs_clock`.

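The clock and the sampling cadence together determine how long each sampled time series becomes. A quick back-of-the-envelope calculation with the documented defaults, assuming the engine records one sample per period for each sampled series (an assumption of this sketch, not a statement about the engine):

```python
total_simulation_time = 3600  # seconds (documented default)
sample_period_s = 0.01        # seconds (documented default)

# round() guards against floating-point error in the division.
samples_per_series = round(total_simulation_time / sample_period_s)
```

With the defaults this comes to 360,000 points per series, which is worth keeping in mind before lowering `sample_period_s` toward its 0.001 lower bound.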
Distributions: poisson, normal, log_normal, exponential, uniform
Node types: generator, server, client, load_balancer (fixed by models)
Edge types: network_connection
LB algorithms: round_robin, least_connection
Step kinds:
- CPU: initial_parsing, cpu_bound_operation
- RAM: ram
- I/O: io_task_spawn, io_llm, io_wait, io_db, io_cache

Step operation keys: cpu_time, io_waiting_time, necessary_ram
Sampled metrics: ready_queue_len, event_loop_io_sleep, ram_in_use, edge_concurrent_connection
Event metrics: rqs_clock (and llm_cost reserved)

Units & conventions
- Time: seconds (`cpu_time`, `io_waiting_time`, latencies, `total_simulation_time`, `sample_period_s`, `user_sampling_window`)
- RAM: megabytes (`ram_mb`, `necessary_ram`)
- Rates: requests/minute (`avg_request_per_minute_per_user.mean`)
- Probabilities: `[0.0, 1.0]` (`dropout_rate`)
- IDs: strings; must be unique within their category
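Because rates are given per minute while every time quantity is in seconds, it can help to convert the workload means into an expected aggregate arrival rate. The numbers below are illustrative, and the formula is only the product of the two means; it says nothing about how the samplers draw individual arrivals.

```python
avg_active_users = 100                # mean of avg_active_users (illustrative)
avg_request_per_minute_per_user = 20  # mean, in requests/minute (illustrative)

# Expected aggregate arrival rate in requests/second, the simulation's time unit.
requests_per_second = avg_active_users * avg_request_per_minute_per_user / 60
```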
- `mean` is numeric (`int|float`) and coerced to `float`.
- If `distribution ∈ {NORMAL, LOG_NORMAL}` and `variance is None` → `variance := mean`.
- `avg_request_per_minute_per_user.distribution == POISSON`.
- `avg_active_users.distribution ∈ {POISSON, NORMAL}`.
- `user_sampling_window ∈ [1, 120]` seconds.
- `type` fields default to the correct enum (`generator`) and are strongly typed.
- `endpoint_name` is normalized to lowercase.
- Each `Step` has exactly one `step_operation` key.
- `Step.kind` and `step_operation` key must match:
  - CPU ↔ `cpu_time`
  - RAM ↔ `necessary_ram`
  - I/O ↔ `io_waiting_time`
- All step operation values are strictly positive.
- `Client.type == client`, `Server.type == server`, `LoadBalancer.type == load_balancer` (enforced).
- `NodesResources` obey lower bounds: `cpu_cores ≥ 1`, `ram_mb ≥ 256`.
- `TopologyNodes` contains unique ids across `client`, `servers[]`, and (optional) `load_balancer`. Duplicates → `ValueError`.
- `TopologyNodes` forbids unknown fields (`extra="forbid"`).
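The uniqueness rule for node ids amounts to counting ids across all node categories. A minimal sketch, where `duplicate_node_ids` is a hypothetical helper and the ids are invented:

```python
from collections import Counter


def duplicate_node_ids(client_id, server_ids, lb_id=None):
    """Return the sorted ids that occur more than once across client, servers and LB."""
    ids = [client_id, *server_ids]
    if lb_id is not None:
        ids.append(lb_id)
    return sorted(i for i, n in Counter(ids).items() if n > 1)


dupes = duplicate_node_ids("client-1", ["srv-1", "srv-1"], lb_id="lb-1")
```

A validator built on this idea would raise `ValueError` whenever the returned list is non-empty.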
- No self-loops: `source != target`.
- Latency sanity: `latency.mean > 0`; if `variance` is provided, `variance ≥ 0`. Error messages reference the edge id.
- `dropout_rate ∈ [0, 1]`.
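The per-edge rules can be sketched as one function over a plain dict. `check_edge` is a hypothetical name, and the sketch skips the `dropout_rate` check when the key is absent rather than guessing the model's default value.

```python
def check_edge(edge: dict) -> None:
    """Raise ValueError (mentioning the edge id) if an edge-level rule fails."""
    eid = edge["id"]
    # No self-loops.
    if edge["source"] == edge["target"]:
        raise ValueError(f"edge {eid!r}: source and target must differ")
    # Latency sanity: strictly positive mean, non-negative variance when given.
    latency = edge["latency"]
    if latency["mean"] <= 0:
        raise ValueError(f"edge {eid!r}: latency mean must be > 0")
    variance = latency.get("variance")
    if variance is not None and variance < 0:
        raise ValueError(f"edge {eid!r}: latency variance must be >= 0")
    # dropout_rate is a probability; only checked when present in this sketch.
    dropout = edge.get("dropout_rate")
    if dropout is not None and not 0.0 <= dropout <= 1.0:
        raise ValueError(f"edge {eid!r}: dropout_rate must be in [0, 1]")


check_edge({
    "id": "client-to-server",
    "source": "client-1",
    "target": "srv-1",
    "latency": {"mean": 0.003, "variance": 0.0},
    "dropout_rate": 0.01,
})  # passes silently
```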
- Edge ids are unique.
- Targets are always declared node ids.
- External ids (e.g., generator) are allowed only as sources; they must never appear as targets.
- Load balancer integrity:
  - `server_covered` is a subset of declared servers.
  - Every covered server has a corresponding edge from the LB (LB → srv). Missing links → `ValueError`.
- Fan-out restriction: among declared nodes, only the LB can have multiple outgoing edges. Offenders are listed.
If your payload passes validation, the engine can wire and run the simulation deterministically with consistent semantics.