#NNN — Rubric Gap Analysis — pattern-event-after-commit new rule: Event Grid emission must be guarded by state change and succeed the Cosmos DB commit
| Field | Value |
| --- | --- |
| Type | New Rule |
| Target Rule | (new) pattern-event-after-commit |
| Category | Design Patterns (pattern-*) |
| Severity | HIGH |
| Source | SCOPE Py Serverless EG (serverless-api-azfunc-cosmos-event-grid) V-A: Phase 3 Finding H4 + Phase 4 Category #9 |
| Labels | enhancement, SCOPE, agent-kit |
Prior-art check
No matching enhancement from PR #38 through commit 53791f0 (2026-04-20). Rule inventory re-synced 2026-04-20 → 94 rules, including the brand-new pattern-ai-grounding-access, which addresses a different concern (LLM tool-use point reads). Closest existing Design Patterns rules:
- pattern-change-feed-materialized-views: updates via Change Feed / idempotency on the consumer side; no guidance on emission from the producer.
- pattern-efficient-ranking, pattern-service-layer-relationships: unrelated.
- pattern-ai-grounding-access (NEW 2026-04-20): LLM-grounding point-read pattern; no overlap.
No rule covers the producer-side obligation to (a) gate event emission on actual state change and (b) order the emit strictly after the Cosmos DB write commits.
Summary
Scenarios that pair a Cosmos DB write with an Event Grid (or any other messaging system) publish suffer two canonical bugs: emitting an event when nothing actually changed, and emitting an event before the Cosmos DB write has committed. Both produce dual-write anomalies observable downstream: duplicate-event storms in the first case, and orphan events for documents that never persisted in the second. Existing skill text does not name either failure mode. This rule codifies the producer-side discipline: guard every emit with an explicit old-vs-new comparison on the field(s) that define the event, and ensure the emit is ordered after the Cosmos DB commit.
Rubric Gap Analysis
Discovered during Phase 3 (Finding H4) and confirmed in Phase 4 Category #9 Design Patterns rollup over 80 L5 snapshots:
| Anti-pattern | AK hit rate | Control hit rate | Combined |
| --- | --- | --- | --- |
| UpdateOrder emits Event Grid event without comparing old vs new status | 12/35 (34%) | 18/45 (40%) | 30/80 (38%) |
| Event emission ordered before Cosmos write commit (dual-write reordering) | 2/35 (6%) | 3/45 (7%) | 5/80 (6%) |
AK profiles do slightly better on the guard (66% vs 60% Control) but 34% of AK runs still miss it — the skill's existing rules do not teach this discipline, so even the agent kit leaves it to model priors. This drives the Phase 4 #9 Design Patterns delta of only +6.8 pp AK over Control, a category where a named rule should produce a much larger gap.
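The guard-present percentages quoted here follow from the miss counts in the table. A quick check, using only those counts (the variable and function names are illustrative, not part of any tooling):

```python
# Miss counts and run totals copied from the anti-pattern table (guard row).
misses = {"AK": (12, 35), "Control": (18, 45)}


def guard_present_rate(missed: int, total: int) -> float:
    """Share of runs that DID include the state-change guard, in percent."""
    return 100.0 * (total - missed) / total


# Per-arm guard-present rates, rounded to whole percent.
rates = {arm: round(guard_present_rate(m, t)) for arm, (m, t) in misses.items()}

# Pooled miss rate across both arms.
pooled_missed = sum(m for m, _ in misses.values())
pooled_total = sum(t for _, t in misses.values())
pooled_miss_rate = round(100.0 * pooled_missed / pooled_total)
```

With these inputs, `rates` holds the 66% and 60% figures and `pooled_miss_rate` the combined 38%.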
Cross-profile, cross-arm, cross-language: misses appear in Python, JavaScript, and C# runs; no single profile is responsible.
Evidence
Existing Rule Coverage (quoted verbatim)
From pattern-change-feed-materialized-views.md (enhanced via PR #39):
> Consumers processing Change Feed entries must be idempotent: a given (partitionKey, id, _ts) may be delivered more than once under at-least-once semantics.

Guidance is entirely consumer-side. Producer-side emission discipline is not addressed.
From pattern-efficient-ranking.md:
> Use count-based or cached rank approaches instead of full partition scans for ranking.

Unrelated.
From pattern-service-layer-relationships.md:
> Use a service layer to hydrate document references before rendering.

Unrelated.
From pattern-ai-grounding-access.md (NEW 2026-04-20):
> In AI-grounded workloads an LLM tool-use loop typically resolves a concrete entity id…

Unrelated; no scenario overlap.
Missing Anti-Pattern — No state-change guard
# ❌ Agent-emitted UpdateOrder/__init__.py: emits on every PUT, including no-op updates
def main(req: func.HttpRequest,
         orderDoc: func.DocumentList,
         orderOut: func.Out[func.Document],
         eventOut: func.Out[str]) -> func.HttpResponse:
    order = dict(orderDoc[0])
    body = req.get_json()
    new_status = body["status"]
    order["status"] = new_status
    order["updatedAt"] = datetime.utcnow().isoformat()
    orderOut.set(func.Document.from_dict(order))
    # Fires even if body["status"] == order["status"] before the assignment
    event = {"orderId": order["orderId"], "status": new_status, "updatedAt": order["updatedAt"]}
    eventOut.set(json.dumps(event))
    return func.HttpResponse(json.dumps(order), mimetype="application/json")
Downstream consumers receive duplicate OrderStatusChanged events for unchanged status. Idempotency on the consumer (per pattern-change-feed-materialized-views) absorbs some of this, but at the cost of storage, log noise, and paid egress on the Event Grid topic.
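The guard this anti-pattern lacks is a plain old-vs-new comparison on the field(s) that define the event. A minimal sketch, assuming documents are plain dicts; the helper name `changed_fields` and the sample documents are hypothetical and do not come from any run:

```python
from typing import Any, Iterable, Mapping


def changed_fields(old: Mapping[str, Any], new: Mapping[str, Any],
                   watched: Iterable[str]) -> dict[str, tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for watched fields whose value differs."""
    return {
        name: (old.get(name), new.get(name))
        for name in watched
        if old.get(name) != new.get(name)
    }


stored = {"orderId": "o-1", "status": "shipped", "note": "gift"}
# A retried or full-document PUT that repeats the stored status.
no_op_put = {"orderId": "o-1", "status": "shipped", "note": "gift wrap"}
# A PUT that really moves the order to a new status.
real_change = {"orderId": "o-1", "status": "delivered", "note": "gift"}

emit_for_no_op = bool(changed_fields(stored, no_op_put, ["status"]))
emit_for_change = bool(changed_fields(stored, real_change, ["status"]))
```

Only the watched field decides emission: `no_op_put` edits `note` but leaves `status` alone, so no OrderStatusChanged event would be produced for it.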
Missing Anti-Pattern — Event emission before Cosmos commit
# ❌ Agent-emitted UpdateOrder/__init__.py: event fires before Cosmos write observed
def main(req, orderDoc, orderOut, eventOut):
    order = dict(orderDoc[0])
    new_status = req.get_json()["status"]
    now = datetime.now(timezone.utc).isoformat()
    # Event emitted first: if the function errors before orderOut.set, the event is already out
    eventOut.set(json.dumps({"orderId": order["orderId"], "status": new_status, "updatedAt": now}))
    order["status"] = new_status
    orderOut.set(func.Document.from_dict(order))
    return ...
In the Azure Functions v1 programming model, Out[T].set is buffered until the function returns successfully, which masks the mistake. Agents writing this pattern commonly carry the same ordering into SDK-direct calls, where the semantics flip: each call takes effect immediately, so an event published before the write is already out if the write then fails. The rule should teach the general principle: never emit an event for a state the durable store has not yet confirmed.
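The orphan-event failure is easy to reproduce once calls take effect immediately. The sketch below uses hypothetical in-memory stand-ins (`FailingStore`, `RecordingPublisher`) in place of real Cosmos DB and Event Grid clients; it only illustrates how call order decides whether an event escapes when the write fails:

```python
class FailingStore:
    """Hypothetical stand-in for a durable store whose write always raises."""

    def replace(self, doc: dict) -> dict:
        raise RuntimeError("write rejected")


class RecordingPublisher:
    """Hypothetical stand-in for an event client; send() takes effect immediately."""

    def __init__(self) -> None:
        self.sent: list[dict] = []

    def send(self, event: dict) -> None:
        self.sent.append(event)


def emit_then_write(store, publisher, order: dict, new_status: str) -> None:
    # Anti-pattern: the event leaves before the store has confirmed anything.
    publisher.send({"orderId": order["orderId"], "status": new_status})
    store.replace({**order, "status": new_status})


def write_then_emit(store, publisher, order: dict, new_status: str) -> None:
    # If replace() raises, the send() below is never reached.
    store.replace({**order, "status": new_status})
    publisher.send({"orderId": order["orderId"], "status": new_status})


def orphan_events(update) -> int:
    """Run one update against a failing store and count events that escaped."""
    publisher = RecordingPublisher()
    try:
        update(FailingStore(), publisher, {"orderId": "o-1", "status": "pending"}, "shipped")
    except RuntimeError:
        pass  # the write failed; whatever was sent before it is now an orphan
    return len(publisher.sent)
```

Calling `orphan_events(emit_then_write)` leaves one event behind for a document that was never stored, while `orphan_events(write_then_emit)` leaves none.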
Correct Pattern
# ✅ Reference-quality (P02/R05, AK Python): guard + post-commit ordering
def main(req: func.HttpRequest,
         orderDocIn: func.DocumentList,
         orderDocOut: func.Out[func.Document],
         outputEvent: func.Out[str]) -> func.HttpResponse:
    order = dict(orderDocIn[0])
    previous_status = order["status"]
    body = req.get_json()
    new_status = body.get("status", previous_status)
    if new_status != previous_status:
        allowed = VALID_TRANSITIONS.get(previous_status, set())
        if new_status not in allowed:
            return func.HttpResponse(..., status_code=422)
    order["status"] = new_status
    order["updatedAt"] = datetime.now(timezone.utc).isoformat()
    orderDocOut.set(func.Document.from_dict(order))  # Cosmos commit first
    # Only publish when status actually changed
    if new_status != previous_status:
        event = {
            "id": str(uuid.uuid4()),
            "subject": f"orders/{order['orderId']}",
            "eventType": "OrderStatusChanged",
            "eventTime": order["updatedAt"],
            "dataVersion": "1.0",
            "data": {
                "orderId": order["orderId"],
                "status": new_status,
                "updatedAt": order["updatedAt"],
            },
        }
        outputEvent.set(json.dumps(event))
    return func.HttpResponse(json.dumps(order), mimetype="application/json")
For SDK-direct (non-binding) code paths, the discipline is: await container.replace_item(...) first, capture the returned _etag / _ts, and only then publish. If the replace raises, no event is emitted. For stronger guarantees, emit from the Cosmos DB Change Feed with an idempotency key; that is a separate rule (pattern-change-feed-materialized-views) and not always needed.
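A minimal sketch of the SDK-direct ordering follows. It runs against hypothetical in-memory fakes rather than real clients: `InMemoryContainer.replace_item(item, body)` only loosely mirrors the shape of the azure-cosmos call named above, `RecordingPublisher.send` is a placeholder for whatever publish call the messaging client exposes, and the `id` / `etag` field names are assumptions made for illustration. It follows the early-return variant, skipping both write and emit when the status is unchanged.

```python
import asyncio
from typing import Any


async def update_status(container: Any, publisher: Any,
                        order: dict[str, Any], new_status: str) -> dict[str, Any]:
    """Replace the order, then publish; skip both when the status is unchanged."""
    if new_status == order["status"]:
        return order  # no-op request: nothing to commit, nothing to emit
    # Commit first. If replace_item raises, the publish below is never reached.
    saved = await container.replace_item(order["id"], {**order, "status": new_status})
    await publisher.send({
        "eventType": "OrderStatusChanged",
        "subject": f"orders/{saved['id']}",
        "data": {"orderId": saved["id"], "status": saved["status"],
                 "etag": saved.get("_etag")},  # captured from the committed write
    })
    return saved


class InMemoryContainer:
    """Hypothetical stand-in for a Cosmos DB container client."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.writes = 0

    async def replace_item(self, item: str, body: dict[str, Any]) -> dict[str, Any]:
        if self.fail:
            raise RuntimeError("replace failed")
        self.writes += 1
        return {**body, "_etag": f"etag-{self.writes}"}


class RecordingPublisher:
    """Hypothetical stand-in for an event publisher client."""

    def __init__(self) -> None:
        self.sent: list[dict[str, Any]] = []

    async def send(self, event: dict[str, Any]) -> None:
        self.sent.append(event)


def published_events(new_status: str, fail: bool = False) -> list[dict[str, Any]]:
    """Run one update from status 'pending' and return the events that were published."""
    order = {"id": "o-1", "status": "pending"}
    publisher = RecordingPublisher()
    try:
        asyncio.run(update_status(InMemoryContainer(fail), publisher, order, new_status))
    except RuntimeError:
        pass  # a real handler would surface the error; no event was published
    return publisher.sent
```

`published_events("shipped")` yields one event carrying the etag of the committed write; `published_events("pending")` and `published_events("shipped", fail=True)` both yield none.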
Runtime Validation
Phase 4 Category #9 Design Patterns (AK 55.7%, Control 48.9%, Δ +6.8 pp) is dominated by this single sub-check. Per docs/phase4-runtime-validation.md:
| Run | Arm | Guard present? | Emit ordering | Notes |
| --- | --- | --- | --- | --- |
| P02/R05 | AK | Yes | post-commit | Reference pattern |
| P06/R01 | Ctrl | Yes | post-commit | Non-AK profile also gets it right |
| P02/R04 | AK | Yes (early-return) | post-commit | Returns 200 without emit when unchanged |
| 30/80 runs (pooled) | mixed | No | n/a | Emits on every PUT |
| 5/80 runs (pooled) | mixed | n/a | pre-commit | Event-then-write ordering |
Emulator CRUD tests cannot flag this: a test that issues a single PUT with a changed status passes regardless of whether the guard exists. The bug only manifests under idempotent-client retries or clients that PUT unchanged state.
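A test that replays a repeated PUT is enough to tell the two behaviours apart. The sketch below is illustrative only: both handlers and the `events_for` helper are hypothetical stand-ins written for this example, not code from any run or from the emulator suite.

```python
from typing import Callable


def handle_put_guarded(stored: dict, body: dict, emit: Callable[[dict], None]) -> dict:
    """Hypothetical handler: applies the PUT body and emits only on a status change."""
    updated = {**stored, **body}
    if updated["status"] != stored["status"]:
        emit({"orderId": updated["orderId"], "status": updated["status"]})
    return updated


def handle_put_unguarded(stored: dict, body: dict, emit: Callable[[dict], None]) -> dict:
    """Hypothetical handler with the anti-pattern: emits on every PUT."""
    updated = {**stored, **body}
    emit({"orderId": updated["orderId"], "status": updated["status"]})
    return updated


def events_for(handler, puts: list[dict]) -> int:
    """Replay a sequence of PUT bodies against one order and count emitted events."""
    emitted: list[dict] = []
    doc = {"orderId": "o-1", "status": "pending"}
    for body in puts:
        doc = handler(doc, body, emitted.append)
    return len(emitted)


single_change = [{"status": "shipped"}]                        # typical CRUD test
with_retry = [{"status": "shipped"}, {"status": "shipped"}]    # client retries the PUT
```

On `single_change` both handlers emit exactly one event, so a CRUD-style test cannot distinguish them. On `with_retry` the guarded handler still emits one event and the unguarded one emits two.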
Rule Contradiction Scan
| Rule | Interaction |
| --- | --- |
| pattern-change-feed-materialized-views | Complementary: consumer-side idempotency handles what the producer-side guard doesn't. New rule cites it as the "for stronger guarantees" fallback. |
| pattern-ai-grounding-access (new 2026-04-20) | No overlap. |
| sdk-etag-concurrency | Complementary: etag guards concurrent writers; this rule guards emission. Pair them for a full read-modify-write-emit discipline. |
| Sibling draft sdk-azure-functions-cosmos-bindings | No overlap: that rule covers binding config; this one covers code flow. |
| monitoring-ru-consumption (updated 2026-04-20) | No contradiction: encourages capturing requestCharge per write; complements the "commit then emit" ordering. |
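Documentation Cross-reference
- Out[T].set semantics and failure behaviour: https://learn.microsoft.com/azure/azure-functions/functions-bindings-cosmosdb-v2-output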
Recommended cosmosdb-agent-kit Fix
Create a new rule file skills/cosmosdb-best-practices/rules/pattern-event-after-commit.md and add a section to AGENTS.md under §9 Design Patterns (e.g. §9.5, following the newly-added §9.4 pattern-ai-grounding-access). Suggested skeleton:
---
title: Emit state-change events only after the Cosmos DB commit and only when state actually changed
impact: HIGH
impactDescription: prevents duplicate events on no-op updates, prevents orphan events when the Cosmos write fails
tags: pattern, events, messaging, event-grid, service-bus, dual-write, outbox, idempotency
---
## Emit state-change events only after the Cosmos DB commit and only when state actually changed
When a write path both persists a document to Cosmos DB and publishes a state-change
event (Event Grid, Service Bus, Kafka, etc.), two rules apply:
1. **Guard the emission on an explicit state-change comparison.** Read the prior value
of the field that defines the event (e.g., `order.status`) before mutating, and only
emit when the new value differs. Idempotent clients, retried requests, or PUT-with-
full-document endpoints routinely produce no-op updates; emitting on every update
produces duplicate-event storms.
2. **Order the emission after the Cosmos DB commit.** In SDK-direct code, `await` the
write and only then publish. In Azure Functions with output bindings, this is the
default if you set the Cosmos binding before the event binding in function code,
but be explicit about the dependency.
For stronger guarantees where a partial failure cannot lose events, emit from the
Cosmos DB Change Feed instead of the synchronous write path — see
`pattern-change-feed-materialized-views` for the consumer-idempotency companion rule.
[Incorrect / Correct examples — Python (Functions v1 + SDK-direct), JS (Functions v4),
C# (isolated worker) — as drafted above.]
See also: `pattern-change-feed-materialized-views` (CF-based outbox), `sdk-etag-concurrency`
(concurrent-write guard), `monitoring-ru-consumption` (RU on the write that precedes the
event).
Reference:
- https://learn.microsoft.com/azure/azure-functions/functions-bindings-cosmosdb-v2-output
- https://learn.microsoft.com/azure/azure-functions/functions-bindings-event-grid-output
- https://learn.microsoft.com/azure/cosmos-db/nosql/change-feed-design-patterns