Prebuild/feat/autonomous agents merge#1588
Conversation
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
- Remove unused os import from main.py - Fix import ordering in health.py and tasks.py (ruff I001) - Add ruff as dev dependency to pyproject.toml - Add uv.lock for reproducible installs - Rewrite README.md with full documentation Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
- Fix duplicate TaskRun bug: fire_webhook_task now delegates entirely to
_execute_task which creates the single canonical run record and returns it
- Fix A2A protocol mismatch: change method tasks/send → message/send, parts
kind type → kind, add messageId/contextId per Google A2A spec
- Forward task.agent and task.llm_provider through invoke_agent to supervisor
metadata so the supervisor routes to the correct sub-agent
- Move import json from inside function bodies to module level (a2a_client.py,
webhooks.py)
- Replace assert isinstance with explicit isinstance checks + HTTPException/log
- Use collections.deque(maxlen=500) for O(1) bounded run history
- Fix open CORS default ["*"] → [] (security)
- Add IntervalTrigger model_validator requiring at least one positive field
- Use Field(default_factory=dict) for mutable dict defaults in models
- Remove unused WebhookTrigger.path field (route is always /hooks/{task_id})
- Remove duplicate ruff from [project.optional-dependencies].dev in pyproject.toml
- Fix Dockerfile: COPY uv.lock, remove unused uv venv .venv line
Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Agent-Logs-Url: https://github.com/A-makarim/ai-platform-engineering/sessions/03dc3b5a-c94f-4f81-bf5f-531161938700
Co-authored-by: A-makarim <114302821+A-makarim@users.noreply.github.com>
- IntervalTrigger validator: check each field individually for positive values rather than summing (so seconds=-60, minutes=1 is correctly rejected) - Log effective_llm_provider alongside agent in a2a_client invoke log Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Agent-Logs-Url: https://github.com/A-makarim/ai-platform-engineering/sessions/03dc3b5a-c94f-4f81-bf5f-531161938700 Co-authored-by: A-makarim <114302821+A-makarim@users.noreply.github.com>
…2A client Add a third extraction fallback that checks the task history for the last agent message, matching the pattern in a2a_remote_agent_connect.py. Without this, valid supervisor responses carried in history were treated as failures. Also add params.configuration with blocking:true and acceptedOutputModes to ensure the supervisor returns a completed result in a predictable shape. Signed-off-by: A-makarim <$(git config user.email)> Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Replace `pip install uv` + `uv pip install --system -e .` with the repo's established pattern: copy uv from ghcr.io/astral-sh/uv:latest and run `uv sync --locked --no-dev`, which actually enforces the lock file and keeps the install consistent with every other service Dockerfile in the repo. Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Agent-Logs-Url: https://github.com/A-makarim/ai-platform-engineering/sessions/6cc475fd-57ea-4f3b-be49-d66b586733f2 Co-authored-by: A-makarim <114302821+A-makarim@users.noreply.github.com>
fix(autonomous-agents): address critical bugs and review feedback
…ggers Narrow the trigger `type` fields from plain `TriggerType` defaults to `Literal[TriggerType.*]` on all three trigger models. Pydantic v2 requires a `Literal`-typed discriminator field for `Field(..., discriminator="type")` to resolve correctly at parse time; without it the union falls back to slow left-to-right probing and can silently mis-classify trigger payloads. Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
…finition Cover CronTrigger, IntervalTrigger, WebhookTrigger, and TaskDefinition construction and field defaults. Aligns with the removal of WebhookTrigger.path (dropped in the Copilot bug-fix pass) and the Literal discriminator types added to all trigger models. Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Without this, pytest cannot resolve the autonomous_agents package because the source lives under src/ (src layout). Adding pythonpath puts src/ on sys.path so both test runs and IDE import resolution work correctly. Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Adds a living checklist that captures the phased plan for evolving the
autonomous_agents service into a production-ready, UI-integrated feature.
Each item carries an ID (IMP-NN), status, rationale ("why it matters"),
suggested approach, and the files it would touch — so any one of them
can be picked up independently without re-deriving the design context.
Phases tracked:
- Phase 1: hardening (persistence, retries, OTel, CORS, secrets, etc.)
- Phase 2: production readiness + UI integration (the north star)
- Phase 3: scale & resilience (jobstore, leader election, multi-replica)
No code changes; documentation only.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Adds scripts/run_supervisor_local.ps1 — a self-contained PowerShell
helper to bring up the CAIPE supervisor in single-node mode on Windows,
purely for end-to-end testing of the autonomous_agents service against
a live supervisor (no Docker, no MongoDB, no other infra required).
The script encapsulates every Windows-specific workaround needed to run
the supervisor natively, so we never have to patch tracked upstream
files outside the autonomous_agents folder:
1. Sets PYTHONUTF8=1 / PYTHONIOENCODING=utf-8 so prompts.py and the
supervisor's connectivity table can read/print UTF-8 content
(emojis, box-drawing) without hitting cp1252 encode/decode errors.
2. Sets PYTHONPATH to the repo root before changing directory, then
cds into charts/ai-platform-engineering/data so the supervisor's
relative-path load of prompt_config.yaml resolves to the real
config (the repo-root prompt_config.yaml is a Docker-mount stub).
3. Bootstraps a .pth file inside the active venv exposing every
ai_platform_engineering/agents/* sub-package, so the single-node
supervisor's eager imports of agent classes succeed without us
having to install each agent as an editable dependency or modify
the root pyproject.toml.
Scope is intentionally limited to the autonomous_agents folder — this
is a developer convenience, not part of the public surface of the
feature.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Introduces a small async Protocol-based abstraction for persisting
TaskRun records, with a default in-memory implementation that mirrors
the legacy deque(maxlen=500) behaviour from scheduler.py.
This is the first slice of IMP-01 (persist run history to MongoDB).
It deliberately ships the abstraction *before* either implementation
is wired into the scheduler so each step lands as a small, reviewable
commit and the scheduler swap (later) becomes a pure refactor.
Protocol surface (services/run_store.py):
- record(run) upsert by run_id; same call site for
RUNNING -> SUCCESS|FAILED transitions
- list_by_task(task_id, limit=100) newest first
- list_all(limit=500) newest first
InMemoryRunStore:
- Bounded by maxlen (default 500), FIFO eviction
- dict + deque pair: O(1) insert/upsert, O(n) filter
- Update path never triggers eviction (unlike a naive append)
- Asyncio-safe under a single-loop driver (FastAPI + APScheduler)
Test coverage (tests/test_run_store.py, 11 tests):
- Protocol conformance (runtime_checkable)
- Newest-first ordering for list_all and list_by_task
- Upsert semantics
- Filtering and limit honoring (including 0 / negative)
- Eviction order
- Eviction does not fire on updates to existing runs
No call sites are modified yet; the new module is introduced behind
its Protocol and will be wired into the scheduler in a follow-up
commit on this branch.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Adds the MongoDB-backed RunStore implementation announced by the
Protocol introduced in the previous commit. Like InMemoryRunStore it
is fully self-contained and not yet wired into the scheduler — the
swap is deferred to a later commit on this branch so each step
remains small and reviewable.
Implementation (services/run_store.py):
- MongoRunStore takes a pre-built motor client (caller-owned
lifecycle, easy to test by injecting AsyncMongoMockClient).
- record() uses replace_one(upsert=True) keyed by the run's
pinned _id so RUNNING -> SUCCESS|FAILED transitions update in
place rather than producing duplicate documents.
- list_all / list_by_task return newest-first, capped by `limit`,
using cursor sort + limit (no in-memory slicing).
- ensure_indexes() is idempotent and creates:
* unique index on `run_id`
* compound index on `(task_id ASC, started_at DESC)` to
support both list_by_task and list_all without a scan.
- Schema is intentionally identical to TaskRun.model_dump()
output so future model fields (prompt, agent, llm_provider,
duration_ms, etc.) flow through automatically.
Test coverage (tests/test_mongo_run_store.py, 13 tests using
mongomock_motor.AsyncMongoMockClient — no real MongoDB required):
- Constructor input validation (empty db / collection name).
- Protocol conformance (runtime_checkable).
- Default collection name pinned to "autonomous_runs".
- ensure_indexes() idempotency and resulting key specs.
- Newest-first ordering for list_all and list_by_task using
explicitly spaced timestamps (avoids BSON's 1ms sort ambiguity
in tight insert loops).
- Upsert-in-place semantics (no duplicate docs after re-record).
- Limit honoring (including 0 / negative).
- Collection isolation (two stores on the same client see only
their own data).
Dependencies (pyproject.toml + uv.lock regenerated):
- motor==3.7.1 (runtime — async MongoDB driver, brings pymongo)
- mongomock-motor==0.0.36 (dev — in-memory mock for tests)
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Wires the new RunStore implementations to configuration without yet
swapping the scheduler over. After this commit the service still
behaves exactly as it does today; it just learns *how* to construct
the right store for a given environment.
Settings additions (config.py):
- mongodb_uri (env: MONGODB_URI) — optional
- mongodb_database (env: MONGODB_DATABASE) — optional
- mongodb_collection (env: MONGODB_COLLECTION) — defaults to
"autonomous_runs" so the operator only needs
URI + DATABASE for the common case
- run_history_maxlen (env: RUN_HISTORY_MAXLEN) — bound for the
in-memory fallback, defaults to 500
Factory (services/run_store.create_run_store):
- Returns MongoRunStore iff *both* mongodb_uri and mongodb_database
are provided; otherwise InMemoryRunStore.
- Partial Mongo config (URI without DATABASE or vice versa) is
treated as "Mongo not configured" — silently engaging Mongo on
half-config would mask typical env-var typos and write to the
wrong place; silently falling back to in-memory on half-config
would lose history. Either misbehaviour is operationally worse
than the current "explicit both-or-neither" rule.
- No network I/O at construction time (motor's AsyncIOMotorClient
is lazy), so the factory is safe to call from tests and from
the FastAPI lifespan startup hook.
- Settings are passed *explicitly* (not pulled from get_settings()
inside the factory). This keeps create_run_store reusable
outside the FastAPI app context and keeps the unit tests free
of monkeypatching.
Test coverage (tests/test_run_store_factory.py, 8 tests):
- In-memory fallback when no Mongo settings, when only URI is set,
when only DATABASE is set, and when URI is the empty string.
- Mongo store selection when both settings are provided.
- in_memory_maxlen and mongodb_collection are forwarded correctly.
- Each call returns a fresh instance (no accidental memoisation).
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Replaces the scheduler's hard-coded ``deque(maxlen=500)`` with the
RunStore abstraction added in the preceding commits. After this
commit the service uses MongoDB for run history when MONGODB_URI +
MONGODB_DATABASE are set, and the legacy in-memory behaviour
(bounded by RUN_HISTORY_MAXLEN, default 500) when they are not —
*identical* to today's behaviour for any developer who hasn't
opted into Mongo.
Combines what was originally planned as two commits because
``get_run_history()`` is sync but ``RunStore.list_*`` is async; a
sync->async transition can't be split cleanly without leaving the
codebase in a non-working intermediate state for one commit.
scheduler.py:
- Drops the module-level ``_run_history`` deque and the ``deque``
import.
- Adds ``_run_store: RunStore | None`` plus ``get_run_store()``
(lazy InMemoryRunStore default) and ``set_run_store(store)``
for the lifespan to inject the configured store.
- ``_execute_task`` now awaits ``store.record(run)`` twice — once
when the run starts (RUNNING) and once when it finishes
(SUCCESS|FAILED). Because RunStore.record is upsert-by-run_id
this updates the same entry rather than creating duplicates.
routes/tasks.py:
- ``/tasks/{id}/runs`` and ``/runs`` await store reads instead of
iterating the in-memory deque. The 404 fallback for
``/tasks/{id}/runs`` (only 404 if BOTH unknown task AND no
historical runs) is preserved verbatim — useful for inspecting
runs of tasks whose definition was removed from config.yaml.
main.py:
- The lifespan startup hook builds the appropriate store via
``create_run_store(...)``, calls ``ensure_indexes()`` on it
when it's a MongoRunStore, logs which backend is active, then
injects it into the scheduler module via ``set_run_store()``.
Test coverage (tests/test_scheduler_run_store.py, 6 tests; mocks
``invoke_agent`` so no live supervisor is needed):
- ``get_run_store`` lazy default + injection via ``set_run_store``.
- ``_execute_task`` records exactly one entry on success (upsert
not duplicate) with the terminal SUCCESS state.
- Same on failure with the error message captured.
- The RUNNING state is visible to the store *while* invoke_agent
is in flight (not only after completion) — this is the value of
recording twice.
- The TaskRun returned by ``_execute_task`` is the same object
as the one in the store; webhook callers depend on this for
synchronous response payloads.
Behavioural impact:
- Operators who set MONGODB_URI + MONGODB_DATABASE now get
persistent, unbounded run history with proper indexes.
- Operators who don't see no change.
- The /api/v1/runs and /api/v1/tasks/{id}/runs JSON shape is
unchanged (still ``list[TaskRun]``).
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Closes IMP-01 (Persist run history to MongoDB) by:
- Adding a "Run History Persistence" section to the README that
explains the two backends (in-memory default vs MongoDB),
when each is selected, the fallback rule for partial Mongo
config, the schema, the indexes, and the startup log lines
operators can grep for.
- Listing the four new env vars (`MONGODB_URI`,
`MONGODB_DATABASE`, `MONGODB_COLLECTION`, `RUN_HISTORY_MAXLEN`)
in the existing Environment Variables table.
- Removing the IMP-01 entry from the active queue in
IMPROVEMENTS.md and recording it under `## Done` with a brief
summary of what landed (touched files, tests, tooling) so the
audit trail survives even after the entry is eventually
deleted.
No code changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Removes the unused ``import pytest`` that was tripping the project's Ruff F401 check (and, transitively, I001 for the now-misordered import block). The tests in this module use only plain ``assert`` statements and Pydantic constructors, so ``pytest`` was never needed as a name in the module namespace. Pre-existing baseline warning surfaced by CI on PR cnoe-io#3; fixing it unblocks the linter check for the rest of the IMP-01 follow-up commits on this branch. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor
A flaky RunStore (e.g. transient MongoDB outage, network blip) used to abort the scheduled job entirely because the very first ``await store.record(run)`` ran *outside* any try/except. Worse, since the same coroutine is awaited synchronously by the webhook router, a Mongo hiccup would surface to external callers as an HTTP 500 — turning observability infrastructure into a single point of failure for the agent execution path. Wrap both record() calls (start-of-run and finally-block) in a new ``_record_safely`` helper that logs at ERROR but never re-raises. The task itself remains the source of truth for whether work happened; persistence is best-effort observability. Test coverage: - A pathological RunStore that raises on every record() no longer prevents the task from completing successfully. - Both failed record() attempts are still logged at ERROR so operators can react. - The TaskRun returned from _execute_task remains fully populated even when the terminal record() blows up (the webhook router echoes this object back to HTTP clients). Codex review feedback on PR cnoe-io#3 (P1). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor
…{id}/runs
Pre-IMP-01 the in-memory deque retained up to 500 runs across all
tasks and ``GET /tasks/{task_id}/runs`` returned every match for a
given task. After moving to the ``RunStore`` abstraction, the
router started calling ``list_by_task(task_id)`` with no explicit
``limit``, so it picked up the protocol's default of 100 — silently
truncating history for any task with more past runs.
Pass an explicit ``limit=_MAX_TASK_RUNS`` (500, matching the legacy
in-memory cap) so behaviour is identical regardless of which
RunStore implementation is active. The constant lives in the
router module so the limit is visible at the API boundary, and
trivially raisable if/when the UI asks for deeper history.
Test coverage (new ``tests/test_tasks_route.py``):
- Asserts the router calls ``list_by_task`` with the explicit cap,
not the protocol default — direct regression test.
- Confirms a stored task with >100 runs round-trips fully.
- Locks in the existing 404 behaviour for genuinely unknown tasks.
- Locks in that runs for tasks removed from config.yaml are still
inspectable.
- Covers ``/runs`` (list_all) for parity, asserting it uses the
500-default when called without params.
Copilot review feedback on PR cnoe-io#3 (P2).
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
The compound ``(task_id ASC, started_at DESC)`` index supports the
per-task query just fine, but Mongo will not walk a compound index
for a sort unless the query also constrains the leading prefix. So
``GET /runs`` (``find({}).sort([("started_at", -1)])``) was falling
back to a full collection scan plus an in-memory sort — a latent
hot path for the upcoming UI integration that surfaces recent runs.
Add a dedicated single-field ``started_at DESC`` index in
``ensure_indexes`` to back the global listing query, and update the
docstring + README so operators see an accurate index inventory.
The cost is one extra B-tree per collection (small — runs are tiny
documents) for an unbounded improvement in worst-case latency on
collections of any meaningful size.
Test coverage:
- ``test_ensure_indexes_is_idempotent`` extended to assert the new
index is present alongside the existing two; running
``ensure_indexes`` twice in a row remains a no-op.
Codex review feedback on PR cnoe-io#3 (P2).
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
pymongo (and therefore motor) defaults to ``tz_aware=False``: it
strips tzinfo from BSON dates on read and returns naive
``datetime`` objects. The write path uses
``datetime.now(timezone.utc)``, so before this fix every TaskRun
round-tripped through MongoRunStore as:
write: started_at = 2026-04-18T10:00:00+00:00 (tz-aware)
read: started_at = 2026-04-18T10:00:00 (naive)
The asymmetry is a latent footgun:
- Comparing a fresh in-memory TaskRun against one read from
Mongo (e.g. picking the latest of two candidates) raises
``TypeError: can't compare offset-naive and offset-aware
datetimes``.
- JSON serialisation drops the trailing ``+00:00`` suffix, so
the API response shape silently differs depending on whether a
run is hot-from-the-scheduler or fetched from storage.
- When a non-UTC operator looks at the data through any tool that
re-attaches a local tz, the timestamps are misinterpreted.
Build the motor client with ``tz_aware=True, tzinfo=timezone.utc``
in ``create_run_store``. UTC is pinned explicitly so a future host
in a non-UTC tz cannot accidentally turn stored timestamps into
local time.
Test coverage:
- New ``test_mongo_client_is_constructed_with_utc_tzinfo`` patches
``AsyncIOMotorClient`` and asserts both kwargs are passed.
Codex review feedback on PR cnoe-io#3.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
…otor import Two cleanups Codex flagged on PR cnoe-io#3 — both no-ops for runtime behaviour, but they remove guarantees that mislead future readers about the schema and dependency model. 1. Redundant run_id unique index ``MongoRunStore.record`` pins ``_id = run_id``, and Mongo's automatic ``_id_`` index already enforces uniqueness on that field. The explicit ``create_index("run_id", unique=True)`` call duplicated that guarantee at the cost of an extra B-tree on every write. Drop it; uniqueness is preserved by the _id index. README and docstring updated to reflect the new index inventory and call out *why* run_id needs no dedicated index. 2. Misleading "motor optional" comment The original local-import comment claimed motor is optional at import time, but motor is a hard runtime dependency declared in pyproject.toml. The deferred import is still useful — it keeps the protocol/in-memory branches free of motor's import cost and isolates any motor incompatibility to environments that actually try to use Mongo — but the rationale is import-time layering, not optionality. Reworded to say so. Test coverage: - ``test_ensure_indexes_does_not_create_redundant_run_id_index`` — explicit regression: a future developer adding the index back trips this immediately. - ``test_id_field_enforces_run_id_uniqueness`` — proves the _id-based uniqueness still holds after the dedicated index is gone (two ``record()`` calls with the same run_id collapse into one document via upsert). - Existing index idempotency test rewritten for the 2-index inventory. Codex review feedback on PR cnoe-io#3. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor
…agents-mongo-store feat(autonomous-agents): persist run history to MongoDB (IMP-01)
…onential backoff
The A2A client previously hard-coded a 300s timeout and zero retries, so a
single 502 from a restarting supervisor failed the whole run permanently.
Wrap the call in tenacity.AsyncRetrying with wait_exponential_jitter and
classify failures so we only retry the ones replay can actually fix:
* httpx.TransportError -> retry (no response was produced)
* httpx.HTTPStatusError 5xx -> retry (supervisor unhealthy)
* httpx.HTTPStatusError 4xx -> propagate (caller-fault, replay wastes
LLM quota without changing the outcome)
* anything else -> propagate (don't mask real bugs)
Total attempts per call = 1 + A2A_MAX_RETRIES. Each retry is logged at
WARNING via tenacity.before_sleep_log so retries stay visible to operators.
New Settings fields, all validated to reject non-positive / inf / NaN:
- A2A_TIMEOUT_SECONDS (default 300)
- A2A_MAX_RETRIES (default 3, 0 disables retries)
- A2A_RETRY_BACKOFF_INITIAL_SECONDS (default 1.0)
- A2A_RETRY_BACKOFF_MAX_SECONDS (default 30.0, caps the backoff)
Coverage: 16 new tests across test_a2a_client.py and test_config.py for
the retry classifier, attempt budget exhaustion, the no-retry-on-4xx
guarantee, the A2A error-envelope path, and Settings validation bounds.
The httpx layer is mocked at _post_once so tests are fast and have no
network dependency.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Service-wide A2A retry/timeout settings are a sensible global default but not always the right policy per task. A nightly synthesis prompt may legitimately need a larger timeout than a 30-second status check; an expensive "best-effort, do not burn quota" task may want max_retries=0 even when the rest of the system is happy to retry. Add two optional fields to TaskDefinition: - timeout_seconds: float | None (must be > 0 when set) - max_retries: int | None (must be >= 0 when set; 0 means "no retries") When None (the default), the scheduler falls back to the global A2A_TIMEOUT_SECONDS / A2A_MAX_RETRIES from Settings. The scheduler now forwards both values into invoke_agent so the existing per-call override plumbing in a2a_client picks them up unchanged. Coverage: 6 new tests in test_models.py covering the default-None behaviour, accepting valid overrides, max_retries=0 being explicitly allowed (it is a real "no retry" signal, not a bug), and the validators rejecting negative max_retries and non-positive timeouts. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor
…ides; cut IMP-02
README:
- Add four new env-var rows: A2A_TIMEOUT_SECONDS, A2A_MAX_RETRIES,
A2A_RETRY_BACKOFF_INITIAL_SECONDS, A2A_RETRY_BACKOFF_MAX_SECONDS.
- Show the optional timeout_seconds / max_retries fields in the
sample task config.yaml so operators see them in context.
- New "Supervisor call reliability" section with the failure
classification table (transport + 5xx retried; 4xx propagated; bug
types propagated) so it is unambiguous what gets retried and why.
IMPROVEMENTS.md:
- Cut IMP-02 from Phase 1 and move the audit entry to ## Done with
the shipping branch and the concrete list of what landed (deps,
Settings, models, scheduler wiring, tests, docs).
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
Pydantic's gt=0 constraint accepts float('inf'), and PyYAML happily parses
.inf / .nan straight from config.yaml. Either would silently propagate
into httpx and break the per-attempt timeout at runtime.
Adds the same finiteness guard to TaskDefinition.timeout_seconds that
Settings.a2a_timeout_seconds already had, plus a parametric test
covering inf, -inf, and nan. Per-task overrides and the global default
now share the same validation contract.
Addresses Copilot review on PR cnoe-io#4.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
Made-with: Cursor
…les in dynamics agents. Tidy up comments and class structure. Remove redundant codes
Standardized the structure of all test files to align with the format used in dynamic agents. This includes tidying up comments, class structures, and removing redundant code to enhance readability and maintainability. Signed-off-by: Ted Tang nuotangidle7@gmail.com
…iments Revert the test isolation conftest and Settings.__init__ rewrites that attempted to fix CORS-related test failures. Production config.py is restored to the pre-experiment state. Also clean up stale test files superseded by merged versions in the earlier test-cleanup pass (test_scheduler_*, test_tasks_crud_route) and remove the unused run_supervisor_local.ps1 dev script. Signed-off-by: tneverlandz7 <nuotangidle7@gmail.com>
The page previously described a different prototype (WebSocket-based WDM bot). Rewrite it to document the current in-process inbound bridge: endpoint at POST /api/v1/hooks/webex/events on the autonomous-agents service (port 8002), required and optional env vars, local ngrok setup, end-to-end verification steps, and the failure-mode contract. Signed-off-by: tneverlandz7 <nuotangidle7@gmail.com>
… service Removed the standalone `webex_bot` service and integrated its functionality directly into the `autonomous-agents` service. This change simplifies the architecture by eliminating the need for cross-process communication and HMAC verification, as the dispatcher now operates in-process. Updated relevant documentation and configuration to reflect this integration. Signed-off-by: Your Name <your.email@example.com>
…g, extended breaker
…functions namesand comments
…merge' into prebuild/feat/autonomous-agents-merge Signed-off-by: Thun78 <kitichartbcc@gmail.com>
…t, chat duplicate bug
Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>
58c4446 to
19a681d
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 19a681ddac
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| history = await get_run_store().list_by_task(task_id, limit=_MAX_TASK_RUNS) | ||
| if history: | ||
| return history |
There was a problem hiding this comment.
Gate run-history reads by task ownership
When the UI proxy forwards any authenticated user to /tasks/{id}/runs, this endpoint returns list_by_task before loading the task or calling _assert_task_access. In the per-user ownership model, a logged-in user who knows or guesses another task id can read that task's run history, including prompts, response previews, errors, and captured events; the /runs endpoint below exposes the same data across all tasks. Please apply the same caller/ownership check used by get/update/delete/trigger before returning history.
Useful? React with 👍 / 👎.
| caller_email, _ = _get_caller(request) | ||
| if caller_email and task.owner_id is None: | ||
| task = task.model_copy(update={"owner_id": caller_email}) |
There was a problem hiding this comment.
Stamp new tasks with the authenticated owner
For proxied requests caller_email is set, but this only stamps owner_id when the client omitted it. A non-admin can POST a task with owner_id set to another user's email, causing the task to be stored under that user's ownership and appear in their task list while the creator avoids ownership/audit attribution. Since ownership is the authorization boundary, authenticated creates should ignore/reject client-supplied owner ids and always set it from the trusted header unless this is a deliberate admin-only path.
Useful? React with 👍 / 👎.
suwhang-cisco
left a comment
There was a problem hiding this comment.
Code looks good thanks! A few comments / questions -
- Could you please add two new CI files for the new autonomous agent image like these two?
- I see there is a new env var
ENABLE_AUTONOMOUS_AGENTS, but is there a way where we can enable it but only allow certain user groups / admin to use have access to these autonomous agents?
There was a problem hiding this comment.
Could we move this file into build/: https://github.com/cnoe-io/ai-platform-engineering/tree/main/build where other dockerfiles live?
Description
This PR introduces the autonomous task feature set and the supporting platform changes needed to run it end to end.
The main change is a new autonomous task workflow that lets users create scheduled or webhook-triggered tasks, route those runs through the existing supervisor/dynamic-agent A2A path, and review the resulting execution history from the chat UI. The work includes task persistence, scheduler reload behavior, webhook security, per-task chat context, preflight acknowledgement, and UI flows for creating, editing, deleting, filtering, and continuing autonomous task conversations.
It also includes the related integration work needed for the feature to operate in the current platform shape: supervisor tools for autonomous task management, GitHub webhook setup helpers, dynamic-agent chat timeline rendering fixes, MongoDB-backed task storage, deployment/env wiring, Helm/prebuild workflow updates, and CI fixes found while preparing the merge branch.
Notable areas included in this branch:
Type of Change
Pre-release Helm Charts (Optional)
This branch includes chart and prebuild workflow changes. Prebuild chart publishing has been exercised from the fork branch.
Checklist