Prebuild/feat/autonomous agents merge by A-makarim · Pull Request #1588 · cnoe-io/ai-platform-engineering

A-makarim · 2026-05-27T15:40:00Z

Description

This PR introduces the autonomous task feature set and the supporting platform changes needed to run it end to end.

The main change is a new autonomous task workflow that lets users create scheduled or webhook-triggered tasks, route those runs through the existing supervisor/dynamic-agent A2A path, and review the resulting execution history from the chat UI. The work includes task persistence, scheduler reload behavior, webhook security, per-task chat context, preflight acknowledgement, and UI flows for creating, editing, deleting, filtering, and continuing autonomous task conversations.

It also includes the related integration work needed for the feature to operate in the current platform shape: supervisor tools for autonomous task management, GitHub webhook setup helpers, dynamic-agent chat timeline rendering fixes, MongoDB-backed task storage, deployment/env wiring, Helm/prebuild workflow updates, and CI fixes found while preparing the merge branch.

Notable areas included in this branch:

Autonomous task CRUD, scheduling, hot reload, and MongoDB-backed task storage.
Webhook-triggered autonomous runs with signing, replay-window protection, ping handling, and GitHub webhook management tools.
Supervisor/deep-agent integration so autonomous runs and follow-up chat share the normal A2A execution pipeline.
UI support for an Autonomous tab, task management, task-linked chat threads, AUTO filtering/badges, replayed timelines, and scheduled-run updates.
Tests around task models, preflight handling, webhook management, chat rendering, and related UI behavior.
Deployment and configuration updates for autonomous agent public/internal URLs and prebuild Helm chart publication.

Type of Change

Pre-release Helm Charts (Optional)

This branch includes chart and prebuild workflow changes. Prebuild chart publishing has been exercised from the fork branch.

Checklist

I have read the contributing guidelines
Existing issues have been referenced (where applicable)
I have verified this change is not present in other open pull requests
Functionality is documented
All code style checks pass
New code contribution is covered by automated tests
All new and existing tests pass

Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

- Remove unused os import from main.py - Fix import ordering in health.py and tasks.py (ruff I001) - Add ruff as dev dependency to pyproject.toml - Add uv.lock for reproducible installs - Rewrite README.md with full documentation Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

- Fix duplicate TaskRun bug: fire_webhook_task now delegates entirely to _execute_task which creates the single canonical run record and returns it - Fix A2A protocol mismatch: change method tasks/send → message/send, parts kind type → kind, add messageId/contextId per Google A2A spec - Forward task.agent and task.llm_provider through invoke_agent to supervisor metadata so the supervisor routes to the correct sub-agent - Move import json from inside function bodies to module level (a2a_client.py, webhooks.py) - Replace assert isinstance with explicit isinstance checks + HTTPException/log - Use collections.deque(maxlen=500) for O(1) bounded run history - Fix open CORS default ["*"] → [] (security) - Add IntervalTrigger model_validator requiring at least one positive field - Use Field(default_factory=dict) for mutable dict defaults in models - Remove unused WebhookTrigger.path field (route is always /hooks/{task_id}) - Remove duplicate ruff from [project.optional-dependencies].dev in pyproject.toml - Fix Dockerfile: COPY uv.lock, remove unused uv venv .venv line Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Agent-Logs-Url: https://github.com/A-makarim/ai-platform-engineering/sessions/03dc3b5a-c94f-4f81-bf5f-531161938700 Co-authored-by: A-makarim <114302821+A-makarim@users.noreply.github.com>

- IntervalTrigger validator: check each field individually for positive values rather than summing (so seconds=-60, minutes=1 is correctly rejected) - Log effective_llm_provider alongside agent in a2a_client invoke log Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Agent-Logs-Url: https://github.com/A-makarim/ai-platform-engineering/sessions/03dc3b5a-c94f-4f81-bf5f-531161938700 Co-authored-by: A-makarim <114302821+A-makarim@users.noreply.github.com>

…2A client Add a third extraction fallback that checks the task history for the last agent message, matching the pattern in a2a_remote_agent_connect.py. Without this, valid supervisor responses carried in history were treated as failures. Also add params.configuration with blocking:true and acceptedOutputModes to ensure the supervisor returns a completed result in a predictable shape. Signed-off-by: A-makarim <$(git config user.email)> Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

Replace `pip install uv` + `uv pip install --system -e .` with the repo's established pattern: copy uv from ghcr.io/astral-sh/uv:latest and run `uv sync --locked --no-dev`, which actually enforces the lock file and keeps the install consistent with every other service Dockerfile in the repo. Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Agent-Logs-Url: https://github.com/A-makarim/ai-platform-engineering/sessions/6cc475fd-57ea-4f3b-be49-d66b586733f2 Co-authored-by: A-makarim <114302821+A-makarim@users.noreply.github.com>

fix(autonomous-agents): address critical bugs and review feedback

…ggers Narrow the trigger `type` fields from plain `TriggerType` defaults to `Literal[TriggerType.*]` on all three trigger models. Pydantic v2 requires a `Literal`-typed discriminator field for `Field(..., discriminator="type")` to resolve correctly at parse time; without it the union falls back to slow left-to-right probing and can silently mis-classify trigger payloads. Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

…finition Cover CronTrigger, IntervalTrigger, WebhookTrigger, and TaskDefinition construction and field defaults. Aligns with the removal of WebhookTrigger.path (dropped in the Copilot bug-fix pass) and the Literal discriminator types added to all trigger models. Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

Without this, pytest cannot resolve the autonomous_agents package because the source lives under src/ (src layout). Adding pythonpath puts src/ on sys.path so both test runs and IDE import resolution work correctly. Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

Adds a living checklist that captures the phased plan for evolving the autonomous_agents service into a production-ready, UI-integrated feature. Each item carries an ID (IMP-NN), status, rationale ("why it matters"), suggested approach, and the files it would touch — so any one of them can be picked up independently without re-deriving the design context. Phases tracked: - Phase 1: hardening (persistence, retries, OTel, CORS, secrets, etc.) - Phase 2: production readiness + UI integration (the north star) - Phase 3: scale & resilience (jobstore, leader election, multi-replica) No code changes; documentation only. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Adds scripts/run_supervisor_local.ps1 — a self-contained PowerShell helper to bring up the CAIPE supervisor in single-node mode on Windows, purely for end-to-end testing of the autonomous_agents service against a live supervisor (no Docker, no MongoDB, no other infra required). The script encapsulates every Windows-specific workaround needed to run the supervisor natively, so we never have to patch tracked upstream files outside the autonomous_agents folder: 1. Sets PYTHONUTF8=1 / PYTHONIOENCODING=utf-8 so prompts.py and the supervisor's connectivity table can read/print UTF-8 content (emojis, box-drawing) without hitting cp1252 encode/decode errors. 2. Sets PYTHONPATH to the repo root before changing directory, then cds into charts/ai-platform-engineering/data so the supervisor's relative-path load of prompt_config.yaml resolves to the real config (the repo-root prompt_config.yaml is a Docker-mount stub). 3. Bootstraps a .pth file inside the active venv exposing every ai_platform_engineering/agents/* sub-package, so the single-node supervisor's eager imports of agent classes succeed without us having to install each agent as an editable dependency or modify the root pyproject.toml. Scope is intentionally limited to the autonomous_agents folder — this is a developer convenience, not part of the public surface of the feature. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Introduces a small async Protocol-based abstraction for persisting TaskRun records, with a default in-memory implementation that mirrors the legacy deque(maxlen=500) behaviour from scheduler.py. This is the first slice of IMP-01 (persist run history to MongoDB). It deliberately ships the abstraction *before* either implementation is wired into the scheduler so each step lands as a small, reviewable commit and the scheduler swap (later) becomes a pure refactor. Protocol surface (services/run_store.py): - record(run) upsert by run_id; same call site for RUNNING -> SUCCESS|FAILED transitions - list_by_task(task_id, limit=100) newest first - list_all(limit=500) newest first InMemoryRunStore: - Bounded by maxlen (default 500), FIFO eviction - dict + deque pair: O(1) insert/upsert, O(n) filter - Update path never triggers eviction (unlike a naive append) - Asyncio-safe under a single-loop driver (FastAPI + APScheduler) Test coverage (tests/test_run_store.py, 11 tests): - Protocol conformance (runtime_checkable) - Newest-first ordering for list_all and list_by_task - Upsert semantics - Filtering and limit honoring (including 0 / negative) - Eviction order - Eviction does not fire on updates to existing runs No call sites are modified yet; the new module is introduced behind its Protocol and will be wired into the scheduler in a follow-up commit on this branch. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Adds the MongoDB-backed RunStore implementation announced by the Protocol introduced in the previous commit. Like InMemoryRunStore it is fully self-contained and not yet wired into the scheduler — the swap is deferred to a later commit on this branch so each step remains small and reviewable. Implementation (services/run_store.py): - MongoRunStore takes a pre-built motor client (caller-owned lifecycle, easy to test by injecting AsyncMongoMockClient). - record() uses replace_one(upsert=True) keyed by the run's pinned _id so RUNNING -> SUCCESS|FAILED transitions update in place rather than producing duplicate documents. - list_all / list_by_task return newest-first, capped by `limit`, using cursor sort + limit (no in-memory slicing). - ensure_indexes() is idempotent and creates: * unique index on `run_id` * compound index on `(task_id ASC, started_at DESC)` to support both list_by_task and list_all without a scan. - Schema is intentionally identical to TaskRun.model_dump() output so future model fields (prompt, agent, llm_provider, duration_ms, etc.) flow through automatically. Test coverage (tests/test_mongo_run_store.py, 13 tests using mongomock_motor.AsyncMongoMockClient — no real MongoDB required): - Constructor input validation (empty db / collection name). - Protocol conformance (runtime_checkable). - Default collection name pinned to "autonomous_runs". - ensure_indexes() idempotency and resulting key specs. - Newest-first ordering for list_all and list_by_task using explicitly spaced timestamps (avoids BSON's 1ms sort ambiguity in tight insert loops). - Upsert-in-place semantics (no duplicate docs after re-record). - Limit honoring (including 0 / negative). - Collection isolation (two stores on the same client see only their own data). Dependencies (pyproject.toml + uv.lock regenerated): - motor==3.7.1 (runtime — async MongoDB driver, brings pymongo) - mongomock-motor==0.0.36 (dev — in-memory mock for tests) Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Wires the new RunStore implementations to configuration without yet swapping the scheduler over. After this commit the service still behaves exactly as it does today; it just learns *how* to construct the right store for a given environment. Settings additions (config.py): - mongodb_uri (env: MONGODB_URI) — optional - mongodb_database (env: MONGODB_DATABASE) — optional - mongodb_collection (env: MONGODB_COLLECTION) — defaults to "autonomous_runs" so the operator only needs URI + DATABASE for the common case - run_history_maxlen (env: RUN_HISTORY_MAXLEN) — bound for the in-memory fallback, defaults to 500 Factory (services/run_store.create_run_store): - Returns MongoRunStore iff *both* mongodb_uri and mongodb_database are provided; otherwise InMemoryRunStore. - Partial Mongo config (URI without DATABASE or vice versa) is treated as "Mongo not configured" — silently engaging Mongo on half-config would mask typical env-var typos and write to the wrong place; silently falling back to in-memory on half-config would lose history. Either misbehaviour is operationally worse than the current "explicit both-or-neither" rule. - No network I/O at construction time (motor's AsyncIOMotorClient is lazy), so the factory is safe to call from tests and from the FastAPI lifespan startup hook. - Settings are passed *explicitly* (not pulled from get_settings() inside the factory). This keeps create_run_store reusable outside the FastAPI app context and keeps the unit tests free of monkeypatching. Test coverage (tests/test_run_store_factory.py, 8 tests): - In-memory fallback when no Mongo settings, when only URI is set, when only DATABASE is set, and when URI is the empty string. - Mongo store selection when both settings are provided. - in_memory_maxlen and mongodb_collection are forwarded correctly. - Each call returns a fresh instance (no accidental memoisation). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Replaces the scheduler's hard-coded ``deque(maxlen=500)`` with the RunStore abstraction added in the preceding commits. After this commit the service uses MongoDB for run history when MONGODB_URI + MONGODB_DATABASE are set, and the legacy in-memory behaviour (bounded by RUN_HISTORY_MAXLEN, default 500) when they are not — *identical* to today's behaviour for any developer who hasn't opted into Mongo. Combines what was originally planned as two commits because ``get_run_history()`` is sync but ``RunStore.list_*`` is async; a sync->async transition can't be split cleanly without leaving the codebase in a non-working intermediate state for one commit. scheduler.py: - Drops the module-level ``_run_history`` deque and the ``deque`` import. - Adds ``_run_store: RunStore | None`` plus ``get_run_store()`` (lazy InMemoryRunStore default) and ``set_run_store(store)`` for the lifespan to inject the configured store. - ``_execute_task`` now awaits ``store.record(run)`` twice — once when the run starts (RUNNING) and once when it finishes (SUCCESS|FAILED). Because RunStore.record is upsert-by-run_id this updates the same entry rather than creating duplicates. routes/tasks.py: - ``/tasks/{id}/runs`` and ``/runs`` await store reads instead of iterating the in-memory deque. The 404 fallback for ``/tasks/{id}/runs`` (only 404 if BOTH unknown task AND no historical runs) is preserved verbatim — useful for inspecting runs of tasks whose definition was removed from config.yaml. main.py: - The lifespan startup hook builds the appropriate store via ``create_run_store(...)``, calls ``ensure_indexes()`` on it when it's a MongoRunStore, logs which backend is active, then injects it into the scheduler module via ``set_run_store()``. Test coverage (tests/test_scheduler_run_store.py, 6 tests; mocks ``invoke_agent`` so no live supervisor is needed): - ``get_run_store`` lazy default + injection via ``set_run_store``. - ``_execute_task`` records exactly one entry on success (upsert not duplicate) with the terminal SUCCESS state. - Same on failure with the error message captured. - The RUNNING state is visible to the store *while* invoke_agent is in flight (not only after completion) — this is the value of recording twice. - The TaskRun returned by ``_execute_task`` is the same object as the one in the store; webhook callers depend on this for synchronous response payloads. Behavioural impact: - Operators who set MONGODB_URI + MONGODB_DATABASE now get persistent, unbounded run history with proper indexes. - Operators who don't see no change. - The /api/v1/runs and /api/v1/tasks/{id}/runs JSON shape is unchanged (still ``list[TaskRun]``). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Closes IMP-01 (Persist run history to MongoDB) by: - Adding a "Run History Persistence" section to the README that explains the two backends (in-memory default vs MongoDB), when each is selected, the fallback rule for partial Mongo config, the schema, the indexes, and the startup log lines operators can grep for. - Listing the four new env vars (`MONGODB_URI`, `MONGODB_DATABASE`, `MONGODB_COLLECTION`, `RUN_HISTORY_MAXLEN`) in the existing Environment Variables table. - Removing the IMP-01 entry from the active queue in IMPROVEMENTS.md and recording it under `## Done` with a brief summary of what landed (touched files, tests, tooling) so the audit trail survives even after the entry is eventually deleted. No code changes. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Removes the unused ``import pytest`` that was tripping the project's Ruff F401 check (and, transitively, I001 for the now-misordered import block). The tests in this module use only plain ``assert`` statements and Pydantic constructors, so ``pytest`` was never needed as a name in the module namespace. Pre-existing baseline warning surfaced by CI on PR cnoe-io#3; fixing it unblocks the linter check for the rest of the IMP-01 follow-up commits on this branch. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

A flaky RunStore (e.g. transient MongoDB outage, network blip) used to abort the scheduled job entirely because the very first ``await store.record(run)`` ran *outside* any try/except. Worse, since the same coroutine is awaited synchronously by the webhook router, a Mongo hiccup would surface to external callers as an HTTP 500 — turning observability infrastructure into a single point of failure for the agent execution path. Wrap both record() calls (start-of-run and finally-block) in a new ``_record_safely`` helper that logs at ERROR but never re-raises. The task itself remains the source of truth for whether work happened; persistence is best-effort observability. Test coverage: - A pathological RunStore that raises on every record() no longer prevents the task from completing successfully. - Both failed record() attempts are still logged at ERROR so operators can react. - The TaskRun returned from _execute_task remains fully populated even when the terminal record() blows up (the webhook router echoes this object back to HTTP clients). Codex review feedback on PR cnoe-io#3 (P1). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

…{id}/runs Pre-IMP-01 the in-memory deque retained up to 500 runs across all tasks and ``GET /tasks/{task_id}/runs`` returned every match for a given task. After moving to the ``RunStore`` abstraction, the router started calling ``list_by_task(task_id)`` with no explicit ``limit``, so it picked up the protocol's default of 100 — silently truncating history for any task with more past runs. Pass an explicit ``limit=_MAX_TASK_RUNS`` (500, matching the legacy in-memory cap) so behaviour is identical regardless of which RunStore implementation is active. The constant lives in the router module so the limit is visible at the API boundary, and trivially raisable if/when the UI asks for deeper history. Test coverage (new ``tests/test_tasks_route.py``): - Asserts the router calls ``list_by_task`` with the explicit cap, not the protocol default — direct regression test. - Confirms a stored task with >100 runs round-trips fully. - Locks in the existing 404 behaviour for genuinely unknown tasks. - Locks in that runs for tasks removed from config.yaml are still inspectable. - Covers ``/runs`` (list_all) for parity, asserting it uses the 500-default when called without params. Copilot review feedback on PR cnoe-io#3 (P2). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

The compound ``(task_id ASC, started_at DESC)`` index supports the per-task query just fine, but Mongo will not walk a compound index for a sort unless the query also constrains the leading prefix. So ``GET /runs`` (``find({}).sort([("started_at", -1)])``) was falling back to a full collection scan plus an in-memory sort — a latent hot path for the upcoming UI integration that surfaces recent runs. Add a dedicated single-field ``started_at DESC`` index in ``ensure_indexes`` to back the global listing query, and update the docstring + README so operators see an accurate index inventory. The cost is one extra B-tree per collection (small — runs are tiny documents) for an unbounded improvement in worst-case latency on collections of any meaningful size. Test coverage: - ``test_ensure_indexes_is_idempotent`` extended to assert the new index is present alongside the existing two; running ``ensure_indexes`` twice in a row remains a no-op. Codex review feedback on PR cnoe-io#3 (P2). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

pymongo (and therefore motor) defaults to ``tz_aware=False``: it strips tzinfo from BSON dates on read and returns naive ``datetime`` objects. The write path uses ``datetime.now(timezone.utc)``, so before this fix every TaskRun round-tripped through MongoRunStore as: write: started_at = 2026-04-18T10:00:00+00:00 (tz-aware) read: started_at = 2026-04-18T10:00:00 (naive) The asymmetry is a latent footgun: - Comparing a fresh in-memory TaskRun against one read from Mongo (e.g. picking the latest of two candidates) raises ``TypeError: can't compare offset-naive and offset-aware datetimes``. - JSON serialisation drops the trailing ``+00:00`` suffix, so the API response shape silently differs depending on whether a run is hot-from-the-scheduler or fetched from storage. - When a non-UTC operator looks at the data through any tool that re-attaches a local tz, the timestamps are misinterpreted. Build the motor client with ``tz_aware=True, tzinfo=timezone.utc`` in ``create_run_store``. UTC is pinned explicitly so a future host in a non-UTC tz cannot accidentally turn stored timestamps into local time. Test coverage: - New ``test_mongo_client_is_constructed_with_utc_tzinfo`` patches ``AsyncIOMotorClient`` and asserts both kwargs are passed. Codex review feedback on PR cnoe-io#3. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

…otor import Two cleanups Codex flagged on PR cnoe-io#3 — both no-ops for runtime behaviour, but they remove guarantees that mislead future readers about the schema and dependency model. 1. Redundant run_id unique index ``MongoRunStore.record`` pins ``_id = run_id``, and Mongo's automatic ``_id_`` index already enforces uniqueness on that field. The explicit ``create_index("run_id", unique=True)`` call duplicated that guarantee at the cost of an extra B-tree on every write. Drop it; uniqueness is preserved by the _id index. README and docstring updated to reflect the new index inventory and call out *why* run_id needs no dedicated index. 2. Misleading "motor optional" comment The original local-import comment claimed motor is optional at import time, but motor is a hard runtime dependency declared in pyproject.toml. The deferred import is still useful — it keeps the protocol/in-memory branches free of motor's import cost and isolates any motor incompatibility to environments that actually try to use Mongo — but the rationale is import-time layering, not optionality. Reworded to say so. Test coverage: - ``test_ensure_indexes_does_not_create_redundant_run_id_index`` — explicit regression: a future developer adding the index back trips this immediately. - ``test_id_field_enforces_run_id_uniqueness`` — proves the _id-based uniqueness still holds after the dedicated index is gone (two ``record()`` calls with the same run_id collapse into one document via upsert). - Existing index idempotency test rewritten for the 2-index inventory. Codex review feedback on PR cnoe-io#3. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

…agents-mongo-store feat(autonomous-agents): persist run history to MongoDB (IMP-01)

…onential backoff The A2A client previously hard-coded a 300s timeout and zero retries, so a single 502 from a restarting supervisor failed the whole run permanently. Wrap the call in tenacity.AsyncRetrying with wait_exponential_jitter and classify failures so we only retry the ones replay can actually fix: * httpx.TransportError -> retry (no response was produced) * httpx.HTTPStatusError 5xx -> retry (supervisor unhealthy) * httpx.HTTPStatusError 4xx -> propagate (caller-fault, replay wastes LLM quota without changing the outcome) * anything else -> propagate (don't mask real bugs) Total attempts per call = 1 + A2A_MAX_RETRIES. Each retry is logged at WARNING via tenacity.before_sleep_log so retries stay visible to operators. New Settings fields, all validated to reject non-positive / inf / NaN: - A2A_TIMEOUT_SECONDS (default 300) - A2A_MAX_RETRIES (default 3, 0 disables retries) - A2A_RETRY_BACKOFF_INITIAL_SECONDS (default 1.0) - A2A_RETRY_BACKOFF_MAX_SECONDS (default 30.0, caps the backoff) Coverage: 16 new tests across test_a2a_client.py and test_config.py for the retry classifier, attempt budget exhaustion, the no-retry-on-4xx guarantee, the A2A error-envelope path, and Settings validation bounds. The httpx layer is mocked at _post_once so tests are fast and have no network dependency. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Service-wide A2A retry/timeout settings are a sensible global default but not always the right policy per task. A nightly synthesis prompt may legitimately need a larger timeout than a 30-second status check; an expensive "best-effort, do not burn quota" task may want max_retries=0 even when the rest of the system is happy to retry. Add two optional fields to TaskDefinition: - timeout_seconds: float | None (must be > 0 when set) - max_retries: int | None (must be >= 0 when set; 0 means "no retries") When None (the default), the scheduler falls back to the global A2A_TIMEOUT_SECONDS / A2A_MAX_RETRIES from Settings. The scheduler now forwards both values into invoke_agent so the existing per-call override plumbing in a2a_client picks them up unchanged. Coverage: 6 new tests in test_models.py covering the default-None behaviour, accepting valid overrides, max_retries=0 being explicitly allowed (it is a real "no retry" signal, not a bug), and the validators rejecting negative max_retries and non-positive timeouts. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

…ides; cut IMP-02 README: - Add four new env-var rows: A2A_TIMEOUT_SECONDS, A2A_MAX_RETRIES, A2A_RETRY_BACKOFF_INITIAL_SECONDS, A2A_RETRY_BACKOFF_MAX_SECONDS. - Show the optional timeout_seconds / max_retries fields in the sample task config.yaml so operators see them in context. - New "Supervisor call reliability" section with the failure classification table (transport + 5xx retried; 4xx propagated; bug types propagated) so it is unambiguous what gets retried and why. IMPROVEMENTS.md: - Cut IMP-02 from Phase 1 and move the audit entry to ## Done with the shipping branch and the concrete list of what landed (deps, Settings, models, scheduler wiring, tests, docs). Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

Pydantic's gt=0 constraint accepts float('inf'), and PyYAML happily parses .inf / .nan straight from config.yaml. Either would silently propagate into httpx and break the per-attempt timeout at runtime. Adds the same finiteness guard to TaskDefinition.timeout_seconds that Settings.a2a_timeout_seconds already had, plus a parametric test covering inf, -inf, and nan. Per-task overrides and the global default now share the same validation contract. Addresses Copilot review on PR cnoe-io#4. Assisted-by: Claude:claude-opus-4.7 Signed-off-by: A-makarim <syedmakarim0.2@gmail.com> Made-with: Cursor

…les in dynamics agents. Tidy up comments and class structure. Remove redundant codes

Standardized the structure of all test files to align with the format used in dynamic agents. This includes tidying up comments, class structures, and removing redundant code to enhance readability and maintainability. Signed-off-by: Ted Tang nuotangidle7@gmail.com

…s_origins

…terference

…iments Revert the test isolation conftest and Settings.__init__ rewrites that attempted to fix CORS-related test failures. Production config.py is restored to the pre-experiment state. Also clean up stale test files superseded by merged versions in the earlier test-cleanup pass (test_scheduler_*, test_tasks_crud_route) and remove the unused run_supervisor_local.ps1 dev script. Signed-off-by: tneverlandz7 <nuotangidle7@gmail.com>

The page previously described a different prototype (WebSocket-based WDM bot). Rewrite it to document the current in-process inbound bridge: endpoint at POST /api/v1/hooks/webex/events on the autonomous-agents service (port 8002), required and optional env vars, local ngrok setup, end-to-end verification steps, and the failure-mode contract. Signed-off-by: tneverlandz7 <nuotangidle7@gmail.com>

… service Removed the standalone `webex_bot` service and integrated its functionality directly into the `autonomous-agents` service. This change simplifies the architecture by eliminating the need for cross-process communication and HMAC verification, as the dispatcher now operates in-process. Updated relevant documentation and configuration to reflect this integration. Signed-off-by: Your Name <your.email@example.com>

…utes

…g, extended breaker

…functions namesand comments

…merge' into prebuild/feat/autonomous-agents-merge Signed-off-by: Thun78 <kitichartbcc@gmail.com>

…t, chat duplicate bug

Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 19a681ddac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-27T15:56:38Z

+    history = await get_run_store().list_by_task(task_id, limit=_MAX_TASK_RUNS)
+    if history:
+        return history


Gate run-history reads by task ownership

When the UI proxy forwards any authenticated user to /tasks/{id}/runs, this endpoint returns list_by_task before loading the task or calling _assert_task_access. In the per-user ownership model, a logged-in user who knows or guesses another task id can read that task's run history, including prompts, response previews, errors, and captured events; the /runs endpoint below exposes the same data across all tasks. Please apply the same caller/ownership check used by get/update/delete/trigger before returning history.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-27T15:56:39Z

+    caller_email, _ = _get_caller(request)
+    if caller_email and task.owner_id is None:
+        task = task.model_copy(update={"owner_id": caller_email})


Stamp new tasks with the authenticated owner

For proxied requests caller_email is set, but this only stamps owner_id when the client omitted it. A non-admin can POST a task with owner_id set to another user's email, causing the task to be stored under that user's ownership and appear in their task list while the creator avoids ownership/audit attribution. Since ownership is the authorization boundary, authenticated creates should ignore/reject client-supplied owner ids and always set it from the trusted header unless this is a deliberate admin-only path.

Useful? React with 👍 / 👎.

suwhang-cisco

Code looks good thanks! A few comments / questions -

Could you please add two new CI files for the new autonomous agent image like these two?
1. https://github.com/cnoe-io/ai-platform-engineering/blob/main/.github/workflows/ci-dynamic-agents.yml
2. https://github.com/cnoe-io/ai-platform-engineering/blob/main/.github/workflows/prebuild-dynamic-agents.yml
I see there is a new env var ENABLE_AUTONOMOUS_AGENTS, but is there a way where we can enable it but only allow certain user groups / admin to use have access to these autonomous agents?

suwhang-cisco · 2026-05-28T11:01:42Z

Could we move this file into build/: https://github.com/cnoe-io/ai-platform-engineering/tree/main/build where other dockerfiles live?

A-makarim and others added 30 commits April 10, 2026 17:51

feat: starting new feature for cron autonomous agents

006d9bd

Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

chore: scaffold project structure

6552131

Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

Initial plan

d3c34c4

fix(autonomous-agents): address critical bugs and review feedback

586313e

fix(autonomous-agents): address critical bugs and review feedback

Merge pull request cnoe-io#3 from A-makarim/prebuild/feat/autonomous-…

28d2c00

…agents-mongo-store feat(autonomous-agents): persist run history to MongoDB (IMP-01)

tneverlandz7 and others added 23 commits May 12, 2026 00:38

tidy codes for all test files, making them same structures as test fi…

bfc31a5

…les in dynamics agents. Tidy up comments and class structure. Remove redundant codes

remove useless and combined files.

13b91c3

enhance test environment isolation for Settings

7cb7e8e

refactor(config): update Settings initialization to handle legacy cor…

9a65751

…s_origins

refactor(config): modify Settings initialization to prevent dotenv in…

4f1683f

…terference

refractor routes

0f923fb

Merge branch 'prebuild/feat/autonomous-agents-merge' into refractorRo…

1dae966

…utes

Cleaned up comments and deleted old imports from refractoring

694db13

fixed jira MCP dict error, changed UI webex followup message

760ba5e

deleted unused functions in a2s, routed circuit_breaker into streamin…

8ec5e52

…g, extended breaker

edited comments

fc401ff

added llm.py and modified middleware.py to align with upstream

be36821

fix(lint): remove unused import, variable, and bare f-string

ae33099

removed test only in-memory webex thread fallback, refractor private …

565e8e8

…functions namesand comments

Merge remote-tracking branch 'origin/prebuild/feat/autonomous-agents-…

15f5a80

…merge' into prebuild/feat/autonomous-agents-merge Signed-off-by: Thun78 <kitichartbcc@gmail.com>

fixed chat UI autonomous filter bug, manual chat with autonomous agen…

50013d0

…t, chat duplicate bug

fixed UI bugs

2be9574

fixed further errors

5869f31

Fixed shared autonomous agent chat

1827b1a

github-project-automation Bot added this to CAIPE (AI Platform Engineering) Project Backlog May 27, 2026

Replaced admin-view, with per-user ownership model

19a681d

Signed-off-by: A-makarim <syedmakarim0.2@gmail.com>

A-makarim force-pushed the prebuild/feat/autonomous-agents-merge branch from 58c4446 to 19a681d Compare May 27, 2026 15:49

A-makarim marked this pull request as ready for review May 27, 2026 15:54

A-makarim requested a review from sriaradhyula as a code owner May 27, 2026 15:54

chatgpt-codex-connector Bot reviewed May 27, 2026

View reviewed changes

suwhang-cisco reviewed May 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prebuild/feat/autonomous agents merge#1588

Prebuild/feat/autonomous agents merge#1588
A-makarim wants to merge 178 commits into
cnoe-io:mainfrom
A-makarim:prebuild/feat/autonomous-agents-merge

A-makarim commented May 27, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 27, 2026

Uh oh!

chatgpt-codex-connector Bot May 27, 2026

Uh oh!

suwhang-cisco left a comment

Uh oh!

suwhang-cisco May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Conversation

A-makarim commented May 27, 2026

Description

Type of Change

Pre-release Helm Charts (Optional)

Checklist

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

suwhang-cisco left a comment

Choose a reason for hiding this comment

Uh oh!

suwhang-cisco May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants