Skip to content

fix(adt): skip redundant mutation gate after lock to keep stateful session alive#125

Open
dme007 wants to merge 1 commit into
oisee:mainfrom
dme007:fix/skip-redundant-mutation-gate-after-lock
Open

fix(adt): skip redundant mutation gate after lock to keep stateful session alive#125
dme007 wants to merge 1 commit into
oisee:mainfrom
dme007:fix/skip-redundant-mutation-gate-after-lock

Conversation

@dme007
Copy link
Copy Markdown

@dme007 dme007 commented Apr 25, 2026

Summary

Sister bug to commit 8cb45a5 (SyntaxCheck before Lock). When
SAP_ALLOWED_PACKAGES is configured, the inner mutation gate inside
UpdateSource / UpdateClassInclude / DeleteObject /
CreateTestInclude resolves the package via getObjectPackage
which issues a STATELESS informationsystem/search hop. When that
hop is interleaved between a stateful LockObject and the stateful
write, SAP's ICM retires the stateful session and the lock handle is
invalidated on the write with HTTP 423
ExceptionResourceInvalidLockHandle.

Captured live against an on-prem S/4HANA via SAP Cloud Connector +
Principal Propagation. The trace below is from the working session
that finally produced the diagnosis. Full go test ./... is green.

What & why

pkg/adt/workflows_edit.go:153 correctly runs checkMutation once
before LockObject (line 363). The bug is on the inside: after the
lock is acquired, EditSourceWithOptions calls UpdateSource
(pkg/adt/crud.go:138) or UpdateClassInclude (crud.go:1027),
and both of those call checkMutation AGAIN. Under
AllowedPackages, the second invocation has to resolve the package
from ObjectURL, which fans out to SearchObject — and that hop
rides the transport's stateless default.

Reproduced on EditSourceWithOptions via live HTTP trace:

line 1167  >>> POST /sap/bc/adt/oo/classes/zcl_my_class?_action=LOCK
               X-Sap-Adt-Sessiontype: stateful
line 1172  <<< 200 OK  LOCK_HANDLE 7C738FAC8D510A3F028F654D03EB9FC6B87EF4A3

line 1192  >>> GET  /sap/bc/adt/repository/informationsystem/search
                   ?operation=quickSearch&query=ZCL_MY_CLASS
               X-Sap-Adt-Sessiontype: stateless    ← kills stateful session
line 1196  <<< 200 OK   (returns packageName=Z_MY_PACKAGE)

line 1216  >>> PUT  /sap/bc/adt/oo/classes/zcl_my_class/source/main
                   ?lockHandle=7C738FAC8D510A3F028F654D03EB9FC6B87EF4A3
               X-Sap-Adt-Sessiontype: stateful
               X-Csrf-Token: dL_42DiJqxXMs6xF3dq_0A==
           <<< 400 Session Timed Out
               Sap-Err-Id: ICMENOSESSION
               Set-Cookie: sap-contextid=…  (×8 zombie cookies)

line 1402  >>> PUT  …/source/main           (retry after CSRF refetch)
               X-Sap-Adt-Sessiontype: stateful
               X-Csrf-Token: 3hnk0iwUrYnH4QF-wHluTw==   ← fresh token
           <<< 423 Locked
               ExceptionResourceInvalidLockHandle
               "Ressource CLASS ZCL_MY_CLASS ist nicht gesperrt
                (ungültiges Sperr-Handle: 7C738FAC8D510A3F028F654D03EB9FC6B87EF4A3)"

The lock handle is bound to the stateful session that the search hop
killed. Refetching CSRF gets a new token but the lock handle is gone,
and the retry hits 423.

Fix

Unexported context-key mutationGateSkipKey plus two helpers in
pkg/adt/mutation_gate.go:

func withMutationGateAlreadyRan(ctx context.Context) context.Context
func mutationGateAlreadyRan(ctx context.Context) bool

checkMutation short-circuits immediately when the context is marked.
Outer workflows that have already run the gate before LockObject
mark the context and the inner mutators skip their redundant gate.

This keeps the precise package check intact (it still runs once, at
the outer entry point) while removing the second resolution from
between Lock and Write. The Stateful flags from 22517d4 remain
necessary; ordering matters too, and any stateless hop in the middle
breaks the session.

Activate is intentionally not touched — it runs only checkSafety
(no getObjectPackage), so it never injects a SearchObject hop.
WriteMessageClassTexts / WriteDataElementLabels / UI5 mutators
are leaf APIs reachable only from external lock-acquiring callers
(MCP tool handlers); deciding when to skip is a caller concern.

Workflows updated to mark the context

  • EditSourceWithOptions — the smoking gun from the trace
  • WriteProgram, WriteClass — same Lock→UpdateSource pattern
  • CreateAndActivateProgram — Create→Lock→UpdateSource pattern
  • CreateClassWithTests — Create→Lock→UpdateSource→CreateTestInclude→UpdateClassInclude
  • CreateFromFile — added an upfront unified gate (was only checkSafety)
  • UpdateFromFile — added an upfront unified gate (was only checkSafety)
  • RenameObject — already gates twice; mark after both
  • ExecuteABAP — added an upfront unified gate ($TMP); avoids the bug for the temp-program Lock→PUT path under any AllowedPackages whitelist that excludes $TMP it now fails up front instead of mid-write
  • writeSourceUpdate (INTF, DDLS/BDEF/SRVD branches) — direct Lock→UpdateSource
  • writeSourceCreate (INTF, DDLS/BDEF/SRVD branches) — direct Create→Lock→UpdateSource
  • writeClassMethodUpdate — direct Lock→UpdateSource

WriteSource itself stays as-is: it intentionally delegates package
resolution to its inner workflow methods, which now do the gate-and-mark.

Files changed

pkg/adt/mutation_gate.go            — mutationGateSkipKey + helpers,
                                      checkMutation short-circuit
pkg/adt/mutation_gate_skip_test.go  — three new tests (see below)
pkg/adt/workflows_edit.go           — mark ctx after EditSourceWithOptions gate
pkg/adt/workflows.go                — mark ctx after WriteProgram / WriteClass /
                                      CreateAndActivateProgram / CreateClassWithTests
pkg/adt/workflows_deploy.go         — gate-and-mark in CreateFromFile / UpdateFromFile
                                      (replaces the prior bare checkSafety)
pkg/adt/workflows_execute.go        — gate-and-mark in ExecuteABAP
pkg/adt/workflows_fileio.go         — mark ctx after RenameObject's two gates
pkg/adt/workflows_source.go         — gate-and-mark in writeSourceUpdate (INTF,
                                      DDLS/BDEF/SRVD), writeSourceCreate (INTF,
                                      DDLS/BDEF/SRVD), writeClassMethodUpdate

Tests

TestMutationGateSkip_EditSourceNoSearchBetweenLockAndPut — full
EditSourceWithOptions happy path under WithAllowedPackages("$TMP"),
asserts no informationsystem/search request appears between the
Lock POST and the source PUT. Verified to FAIL when the
withMutationGateAlreadyRan(ctx) line in EditSourceWithOptions is
removed (reproduces the exact bug pattern from the trace).

TestMutationGateSkip_FlagSkipsInnerCheck — unit-tests the skip
plumbing: a deliberately invalid MutationContext (no ObjectURL,
no Package) that would normally fail closed under AllowedPackages
must short-circuit when the context is marked.

TestMutationGateSkip_FlagDoesNotLeakAcrossContexts — flag is scoped
to its derivation chain only; sibling contexts derived from the same
parent stay clean.

go test ./pkg/adt/... and go test ./... both green.

Related

Non-goals

  • Public-API leaf mutators (UpdateSource, UpdateClassInclude,
    DeleteObject, CreateTestInclude, WriteMessageClassTexts,
    WriteDataElementLabels) keep their inner gate. Callers that
    acquire their own lock and bypass the workflows still get
    policy enforcement; if they hit the same session-affinity bug,
    the fix is the same — gate-and-mark before the lock.

…ssion alive

EditSourceWithOptions and the other write workflows ran the unified
mutation gate up front, then called LockObject, then called UpdateSource
(or UpdateClassInclude / DeleteObject / CreateTestInclude) — and those
inner mutators ran the gate AGAIN. Under SAP_ALLOWED_PACKAGES, the
inner gate's getObjectPackage path issues a STATELESS hop:

    GET /sap/bc/adt/repository/informationsystem/search?…&query=…

When that stateless hop is interleaved between a stateful Lock and the
stateful PUT/DELETE/POST, SAP's ICM retires the stateful session
server-side (Sap-Err-Id: ICMENOSESSION). The lock handle bound to the
dead session is then rejected on the write with HTTP 423
ExceptionResourceInvalidLockHandle. Reproduced live in
/tmp/vsp-mcp-trace-dsd201.log lines 1167 (LOCK) → 1192
(quickSearch) → 1216 (failing PUT).

This is the exact same bug class as commit 8cb45a5 (SyntaxCheck before
Lock), just at a different call site that the prior fix did not cover.
The Stateful flags on the per-request RequestOptions are necessary but
not sufficient — call ordering matters too, and any stateless hop
between Lock and Write breaks the session.

Fix introduces an unexported context-key (mutationGateSkipKey) plus
two helpers — withMutationGateAlreadyRan and mutationGateAlreadyRan —
in pkg/adt/mutation_gate.go. checkMutation short-circuits when the
flag is set. Outer workflows that already ran the gate before
LockObject mark the context so the inner mutators skip their
redundant gate.

Workflows updated to mark the context after their own gate call:

  EditSourceWithOptions    workflows_edit.go
  WriteProgram             workflows.go
  WriteClass               workflows.go
  CreateAndActivateProgram workflows.go
  CreateClassWithTests     workflows.go
  CreateFromFile           workflows_deploy.go    (also adds full gate)
  UpdateFromFile           workflows_deploy.go    (also adds full gate)
  RenameObject             workflows_fileio.go
  ExecuteABAP              workflows_execute.go   (also adds full gate)
  writeSourceUpdate (INTF, DDLS/BDEF/SRVD)        workflows_source.go
  writeSourceCreate (INTF, DDLS/BDEF/SRVD)        workflows_source.go
  writeClassMethodUpdate                          workflows_source.go

Activate is not touched: it does not resolve the package and never
issues a SearchObject hop, so the inner gate already wouldn't break
the session there. WriteMessageClassTexts / WriteDataElementLabels
and the UI5 mutators are left alone — their inner gate is reachable
only from external lock-acquiring callers (MCP tool handlers), so
deciding when to skip is a caller concern.

Tests:

  TestMutationGateSkip_EditSourceNoSearchBetweenLockAndPut
    — full happy-path EditSourceWithOptions under AllowedPackages,
      asserts no informationsystem/search request appears between
      the Lock POST and the source PUT.
  TestMutationGateSkip_FlagSkipsInnerCheck
    — unit-tests the skip plumbing: a deliberately-invalid
      MutationContext (no ObjectURL, no Package) that would normally
      fail closed under AllowedPackages must short-circuit when the
      context is marked.
  TestMutationGateSkip_FlagDoesNotLeakAcrossContexts
    — the flag is scoped to its derived context only; sibling
      contexts derived from the same parent stay clean.

go test ./... is green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant