Skip to content

fix(adt): deploy session ordering + correct MODIFICATION_SUPPORT handling#108

Open
dme007 wants to merge 4 commits into
oisee:mainfrom
dme007:fix/deploy-session-affinity
Open

fix(adt): deploy session ordering + correct MODIFICATION_SUPPORT handling#108
dme007 wants to merge 4 commits into
oisee:mainfrom
dme007:fix/deploy-session-affinity

Conversation

@dme007
Copy link
Copy Markdown

@dme007 dme007 commented Apr 18, 2026

Summary

Four independent but related fixes to the ADT layer. All surfaced while
verifying vsp deploy on SAP Business Application Studio destinations
(Cloud Connector + Principal Propagation). Live-tested against an on-prem
S/4HANA (DEV 2021 FPS05) for CLAS and INTF, both $TMP and transportable
customer packages. Full pkg/adt suite stays green.

  1. SyntaxCheck before Lock in CreateFromFile / UpdateFromFile.
  2. Drop the MODIFICATION_SUPPORT="NoModification" hard-fail at LOCK.
  3. Preserve X-CSRF-Token and X-sap-adt-sessiontype on redirects.
  4. Replace the cookie jar on ICMENOSESSION recovery.

Plus one opt-in diagnostic: VSP_HTTP_TRACE=1 dumps raw HTTP requests
and responses to stderr with Authorization / Cookie values redacted. It
made the root-cause analysis of (2) and (4) possible.

1. SyntaxCheck before Lock

SyntaxCheck carried the default stateless session-type; running it
after LockObject interleaves a stateless hop between the stateful
lock and write. SAP's ICM retires the stateful context on that hop
(Sap-Err-Id: ICMENOSESSION on the PUT), and the ensuing CSRF refresh
fails because the session that held the token is already gone.

Moving SyntaxCheck before LockObject keeps Lock → UpdateSource → Unlock → Activate as a single uninterrupted stateful block. The
per-request Stateful: true flags from 22517d4 are necessary but not
sufficient on their own — call ordering matters too.

2. MODIFICATION_SUPPORT="NoModification" is not "read-only"

Commit 22517d4 hard-failed LockObject when the response carried
MODIFICATION_SUPPORT=NoModification, reading it as "object is
read-only via ADT". That is a misreading of the SAP-standard interface
IF_ADT_LOCK_RESULT, which defines four values:

XML value Constant Documented meaning
ModifcationAssistant (sic) CO_MOD_SUPPORT_MODASS SAP/partner object, Modification Assistant on
ModificationsLoggedOnly CO_MOD_SUPPORT_LOGGED_ONLY SAP/partner object, changes only logged
NoModification CO_MOD_SUPPORT_NOT_NEEDED "Modification support is not needed because object is not in SAP/partner namespace in a customer system."
`` (empty) CO_MOD_SUPPORT_NOT_SPECIFIED not specified

The constant is NOT_NEEDED, not NOT_PERMITTED. Customer Z*/Y*
objects routinely return this value and the subsequent PUT succeeds —
confirmed live with VSP_HTTP_TRACE=1:

<LOCK_HANDLE>…</LOCK_HANDLE>
<CORRNR>WSDK920768</CORRNR> <CORRUSER>X60000469</CORRUSER>
<MODIFICATION_SUPPORT>NoModification</MODIFICATION_SUPPORT>
→ PUT …/source/main: HTTP 200, Etag returned; Activate: activationExecuted="true"

CORRNR / CORRUSER populated on the LOCK response means SAP accepted
the lock. The field is tracking metadata, not an authorisation verdict.

The "confusing HTTP 423 a few seconds later" the guard was meant to
pre-empt is fully explained by the session-affinity work in 27f4d7c +
the class-include completion in 22517d4 itself. The guard only ever
fires as a false positive on destinations whose SAP system populates
the flag — blocking legitimate customer edits.

LockObject now returns the parsed result verbatim for any value.
Tests move from the asymmetric _RejectsNoModification pair to a
table-driven _PassesThroughModificationSupport, plus the retained
read-lock variant.

3. CheckRedirect preserves ADT headers

Go's http.Client strips Authorization / WWW-Authenticate / Cookie on
cross-origin redirects (RFC 7235). The existing CheckRedirect
restores Authorization for #90. X-CSRF-Token and
X-sap-adt-sessiontype are not on Go's sensitive-header list and are
therefore forwarded by default — but if either silently vanishes
across a redirect (proxy, future library change, middleware), the
second hop lands stateless with a missing CSRF token and the in-flight
lock is invalidated. Re-set both explicitly from via[0], mirroring
the existing Authorization logic. Documented as defensive in the
source comment.

4. Replace the cookie jar on ICMENOSESSION recovery

Long-running processes (MCP server) hit HTTP 400 ICMENOSESSION on
every request after the first successful
Lock → Write → Unlock → Activate cycle. Captured trace:

POST …/activation?method=activate              → 200  (stateless; ends context)
POST …/repository/nodestructure                → 400 ICMENOSESSION
    + 6× Set-Cookie: sap-contextid=…
HEAD …/core/discovery                          → 400 ICMENOSESSION
    + 6× Set-Cookie: sap-contextid=…
…                                               (stuck)

Root cause: the stateless Activate closes SAP's stateful context
server-side, but sap-contextid cookies from the stateful phase stay
in Go's cookie jar — sometimes on multiple paths (/, /sap/,
/sap/bc/, /sap/bc/adt/). Every retry re-sends a matching dead
identifier, including the HEAD /core/discovery refetch that
IsSessionExpired recovery relies on. Short-lived CLI subcommands
escape because each spawns a fresh jar.

A first iteration of this fix expired cookies with
SetCookies(baseURL, {MaxAge: -1, Path: "/"}). It did not work in
practice: http.CookieJar keys entries by (name, domain, path) and
does not expose the stored path — an expire for Path="/" leaves
entries on deeper paths untouched, and that is exactly where SAP
stores the contextid.

Final fix replaces the entire cookie jar with a fresh one before the
CSRF refetch. User-provided cookies (config.Cookies, populated by
browser-auth / SAML / cookie-file flows) are attached per request via
addCookies() and survive untouched; only dynamically
server-deposited entries from the dead session are dropped —
the desired outcome. Transport gains a cached jar http.CookieJar
reference from the underlying *http.Client; a type-assertion on
HTTPDoer leaves the field nil for test mocks and the helper no-ops
there.

Files changed

pkg/adt/workflows_deploy.go     — SyntaxCheck before Lock
pkg/adt/crud.go                 — drop NoModification guard, add interface doc comment
pkg/adt/crud_reconcile_test.go  — _PassesThroughModificationSupport (table-driven)
pkg/adt/config.go               — CheckRedirect preserves X-CSRF-Token +
                                  X-sap-adt-sessiontype
pkg/adt/config_test.go          — _CheckRedirectPreservesADTHeaders
                                  _CheckRedirectHonoursLimit
pkg/adt/http.go                 — VSP_HTTP_TRACE env gate + trace helpers hooked
                                  into Request / retryRequest / fetchCSRFToken;
                                  Transport.jar field; clearSAPSessionCookies
                                  (jar-replace) helper; ICMENOSESSION recovery
                                  call site
pkg/adt/http_test.go            — _ClearSAPSessionCookies_ReplacesJar
                                  _ClearSAPSessionCookies_NonHTTPClientIsSafe

Related issues

Dominik Miescher and others added 4 commits April 17, 2026 09:32
CreateFromFile and UpdateFromFile ran SyntaxCheck *after* LockObject.
SyntaxCheck has no Stateful flag on its RequestOptions, so it rode the
transport's stateless default. The outbound sequence therefore alternated
session-type mid-stream:

    POST …/oo/classes/<name>?_action=LOCK           X-sap-adt-sessiontype: stateful
    POST …/checkruns?reporters=abapCheckRun         X-sap-adt-sessiontype: stateless  ← kills stateful session
    PUT  …/oo/classes/<name>/source/main            X-sap-adt-sessiontype: stateful

SAP's ICM retired the stateful session after the stateless hop
(Sap-Err-Id: ICMENOSESSION on the PUT, HTTP 400 "Session Timed Out"),
after which the CSRF-refresh path failed because the session that held
the token was already gone — producing the "no CSRF token in response
(HTTP 400)" error path users hit on live deploys.

Running SyntaxCheck before LockObject keeps the Lock → UpdateSource →
Unlock → Activate block as a single uninterrupted stateful sequence.
The Stateful flags 22517d4 added on individual calls are necessary but
not sufficient; call ordering matters too.

This completes the oisee#88 / oisee#92 / oisee#98 lock-handle class on Cloud-Connector
destinations with a single-pod Envoy proxy, where the stateless hop
would otherwise migrate the connection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…only

22517d4 hard-failed LockObject when a successful LOCK response came back
with MODIFICATION_SUPPORT="NoModification", interpreting that value as
"the object is read-only via ADT for this user/system".

That reading is wrong. SAP's standard interface IF_ADT_LOCK_RESULT
defines four values for this field:

  "ModifcationAssistant"    CO_MOD_SUPPORT_MODASS        — SAP/partner object,
                                                          Modification Assistant on
  "ModificationsLoggedOnly" CO_MOD_SUPPORT_LOGGED_ONLY   — SAP/partner object,
                                                          changes only logged
  "NoModification"          CO_MOD_SUPPORT_NOT_NEEDED    — "Modification support is
                                                          not needed because object
                                                          is not in SAP/partner
                                                          namespace in a customer
                                                          system" (customer Z*/Y*
                                                          objects — tracking not
                                                          needed, edit is fine)
  ""                        CO_MOD_SUPPORT_NOT_SPECIFIED — not specified

The constant is NOT_NEEDED, not NOT_PERMITTED. Customer-namespace edits
normally return this value, and the PUT that follows succeeds — confirmed
live with VSP_HTTP_TRACE on on-prem S/4HANA deployments of ZCL_* classes
to both $TMP and transportable customer packages:

    <LOCK_HANDLE>…</LOCK_HANDLE>
    <CORRNR>WSDK920768</CORRNR>
    <CORRUSER>X60000469</CORRUSER>
    <CORRTEXT>… CR description …</CORRTEXT>
    <IS_LOCAL/>
    <MODIFICATION_SUPPORT>NoModification</MODIFICATION_SUPPORT>   ← metadata, not reject
    <SCOPE_MESSAGES/>
    → subsequent PUT …/source/main returns HTTP 200, class activates

CORRNR / CORRUSER populated on the LOCK response means SAP evaluated the
user's edit rights and accepted the lock; the field is advisory metadata
about change tracking, not an authorisation verdict.

The "confusing HTTP 423 InvalidLockHandle a few seconds later" that the
guard was introduced to pre-empt is fully explained by the session-
affinity fixes in 27f4d7c plus 22517d4's class-include completion and the
call-ordering fix in the preceding commit of this series. The
NoModification guard is orthogonal to that bug class and only ever fires
as a false positive on normal customer code.

LockObject now returns the parsed result verbatim. The regression-test
pair TestLockObject_RejectsNoModification /
TestLockObject_AllowsNoModificationOnReadLock is replaced with
TestLockObject_PassesThroughModificationSupport — a table-driven test
that pins pass-through across all four documented SAP values — plus a
retained read-lock variant to keep the READ access-mode path explicitly
covered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related defensive improvements uncovered while root-causing the
MODIFICATION_SUPPORT issue on Cloud-Connector destinations.

1. CheckRedirect now explicitly re-sets X-CSRF-Token and
   X-sap-adt-sessiontype from the initial request (via[0]) alongside
   the existing Authorization restoration.

   Go's http.Client strips Authorization / WWW-Authenticate / Cookie /
   Cookie2 on cross-origin redirects per RFC 7235 §4.2; custom headers
   are forwarded by default. We re-set the two ADT-critical custom
   headers explicitly anyway — partly as defence against any Go version
   or middleware that decides to strip them on its own, partly as
   documentation-in-code that they are load-bearing for the
   lock → write → unlock sequence. Without X-sap-adt-sessiontype the
   second hop lands stateless and SAP rejects the lock handle with
   HTTP 423; without X-CSRF-Token the mutation is rejected as CSRF-
   violating. Two regression tests pin the behaviour.

2. VSP_HTTP_TRACE env var gates raw HTTP request/response dumps to
   stderr from Transport.Request, Transport.retryRequest, and
   Transport.fetchCSRFToken. Method, URL, selected headers, and body
   (truncated at 4 KiB) are written; Authorization and individual
   Cookie values are redacted so the dump is safe to paste into a bug
   report. Default-off, opt-in per invocation. This was the tool that
   made the preceding NoModification root-cause analysis tractable and
   that captured the "Session Timed Out" signature behind the deploy-
   workflow ordering fix in the first commit of this series.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Long-running vsp MCP-server processes hit HTTP 400 "Session Timed Out"
(Sap-Err-Id: ICMENOSESSION) on every request after the first successful
Lock → Write → Unlock → Activate sequence on SAP BAS destinations with
Cloud Connector + Principal Propagation.

Root cause: the stateless POST /sap/bc/adt/activation at the end of the
deploy closes SAP's stateful context server-side, but the sap-contextid
cookies the server had set during the stateful phase remain in Go's
cookie jar — sometimes on several paths (/, /sap/, /sap/bc/,
/sap/bc/adt/). Every subsequent outgoing request re-sends a matching
dead identifier, and SAP's ICM replies ICMENOSESSION — including on
the HEAD /core/discovery refetch that the IsSessionExpired recovery
path issues to open a fresh session. The recovery therefore loops:
same stale cookie mix, same ICMENOSESSION, no forward progress.

Trace captured via VSP_HTTP_TRACE=1 during an MCP session that
deployed an interface followed by a second tool call:

    POST  …/oo/interfaces/ZIF_…?_action=LOCK            → 200
    PUT   …/oo/interfaces/ZIF_…/source/main             → 200
    POST  …/oo/interfaces/ZIF_…?_action=UNLOCK          → 200
    POST  /sap/bc/adt/activation?method=activate        → 200  (stateless; ends context)
    POST  /sap/bc/adt/repository/nodestructure          → 400 ICMENOSESSION
        + 6× Set-Cookie: sap-contextid= …
    HEAD  /sap/bc/adt/core/discovery                    → 400 ICMENOSESSION
        + 6× Set-Cookie: sap-contextid= …
    …                                                   (stuck)

Short-lived CLI subcommands don't see this — each spawns a fresh
process with a brand-new cookie jar.

Fix: in the IsSessionExpired branch of Transport.Request, replace the
entire cookie jar with a fresh cookiejar.New before the CSRF refetch.

Targeted expire with SetCookies(url, {MaxAge: -1}) was the first
approach attempted but proved insufficient: Go's http.CookieJar keys
entries by (name, domain, path) and does not expose the stored path
through its public interface. A delete cookie for Path="/" leaves
entries on /sap/, /sap/bc/, or /sap/bc/adt/ untouched, and that is
exactly where SAP's ICM stores sap-contextid. Live re-capture after
the targeted approach landed still showed the same ICMENOSESSION
loop. Jar replacement drops every variant in one step regardless of
the path scope SAP used.

User-provided cookies (config.Cookies — populated by basic auth with
explicit cookie injection, browser-auth SSO, cookie-file, or
cookie-string flows) are attached per request via addCookies() on
every outbound Request, so swapping the jar does not drop the
caller's SSO / cookie-file state. Only cookies that the server had
dynamically deposited during the dead session are removed, which is
exactly the desired behaviour.

Transport now carries a jar http.CookieJar cached reference extracted
from the underlying *http.Client; the helper type-asserts
t.httpClient.(*http.Client) to swap both the client's Jar field and
the cached reference in lock-step. NewTransportWithClient fed a mock
HTTPDoer leaves both nil, and the helper no-ops there.

Regression coverage:

  TestClearSAPSessionCookies_ReplacesJar        — seeds the jar with
      sap-contextid / SAP_SESSIONID_* on /, /sap/, /sap/bc/adt/;
      asserts every SAP variant is gone post-swap and that
      *http.Client.Jar was updated in lock-step with Transport.jar.
  TestClearSAPSessionCookies_NonHTTPClientIsSafe — mock HTTPDoer path
      stays a no-op (cannot swap what we don't own).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
wusxo24 added a commit to wusxo24/vibing-steampunk that referenced this pull request Apr 22, 2026
- fix(adt): drop stale sap-contextid on ICMENOSESSION recovery
- feat(adt): preserve ADT headers on redirect + opt-in HTTP trace
- fix(adt): drop MODIFICATION_SUPPORT="NoModification" misread as read-only
- fix(adt): move SyntaxCheck before Lock in deploy workflows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@frd1201
Copy link
Copy Markdown

frd1201 commented Apr 27, 2026

Independent confirmation of the analysis from a different on-premise system.

Environment: S/4HANA on-prem (release D18, stack 001), customer namespace Z*, transportable package ZLO_PROMOTION_STAGING, transport task D18K..... (Aufgabe within order D18K923708).

Symptom (current main):
EditSource against ZCL_LO_PROMO_STAGING_BUILDERFailed to lock object: object … is not modifiable via ADT on this system (SAP returned modificationSupport="NoModification" during LOCK). Identical class is fully editable in Eclipse ADT in the same client with the same user — confirming the flag is informational, not authoritative, exactly as you describe.

Side-effect we hit before reading your PR: because the existing guard returns the error before the LOCK_HANDLE is surfaced to the caller, the SAP-side ENQUEUE entry that the LOCK request created stays put (SM12 shows M...... editing the class until cleared manually). Removing the guard makes this moot since LockObject returns the handle and the normal unlock path runs — but it's also a useful hint that the guard was misclassified: SAP did acquire a real lock as part of the request.

What we did locally before finding #108: built a configurable opt-out (SAP_IGNORE_NO_MODIFICATION_GUARD) plus a defensive UnlockObject on the guard's error path. Both work, but your PR is the better fix — the guard was based on a misreading of IF_ADT_LOCK_RESULT.CO_MOD_SUPPORT_NOT_NEEDED, and removing it fixes the bug at the root rather than papering over it. We're going to drop our local branch in favour of merging this PR.

Verified after applying #108 logic locally: Lock → SyntaxCheck → UpdateSource → Unlock → Activate succeeds end-to-end on the customer class. activationExecuted="true", no orphan SM12 entry. Same evidence shape as your trace excerpt.

A merge of this PR (along with #125's mutation-gate ordering work) would unblock all on-prem EditSource/WriteSource flows on systems that populate MODIFICATION_SUPPORT for customer-namespace objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants