@@ -142,6 +142,28 @@ Testing against the live discussions app (amplify-discussions-main-c39a5,
142142 Code must poll each nested changeset to terminal status before
143143 describing it.
144144
145+ 7 . ** Template format errors are also recoverable.** Amplify AppSync API
146+ stacks contain model sub-stacks that export names via ` Fn::Join ` with
147+ parameter references. With ` IncludeNestedStacks: true ` , CFN cannot
148+ resolve intrinsic functions at validation time — all exports appear as
149+ ` {{IntrinsicFunction://Fn::Join}} ` , triggering `"Template format
150+ error: duplicate Export names"`. Despite the failure, CFN populates
151+ Changes before the export validation step. The function was broadened
152+ from ` isEarlyValidationFailure ` to ` isRecoverableFailure ` to handle
153+ both EarlyValidation and Template format error failures. When a
154+ recoverable failure produces zero Changes, the stack is treated as
155+ incomplete (not clean).
156+
157+ 8 . ** Cascading IAM Policy changes from DeletionPolicy modifications.**
158+ When ` DeletionPolicy: Retain ` is added to a DynamoDB table, CFN
159+ flags every IAM Policy whose ` PolicyDocument ` references that table's
160+ attributes (e.g., ` TodoTable.Arn ` ) as a Dynamic ResourceAttribute
161+ re-evaluation. These appear as ` Modify ` changes on ` AWS::IAM::Policy `
162+ with ` Scope: ['Properties'] ` , ` ChangeSource: ResourceAttribute ` ,
163+ ` Evaluation: Dynamic ` , ` RequiresRecreation: Never ` , and
164+ ` CausingEntity: <TableLogicalId>.Arn ` . These are harmless and must
165+ be filtered alongside direct DeletionPolicy drift.
166+
145167### Comparison
146168
147169| Dimension | IncludeNestedStacks: true | Per-nested (Method B) |
@@ -209,8 +231,11 @@ patterns:
209231Any other StatusReason is a genuine error — skip that stack.
210232
211233``` typescript
212- function isEarlyValidationFailure(reason ? : string ): boolean {
213- return !! reason ?.includes (' EarlyValidation' );
234+ function isRecoverableFailure(reason ? : string ): boolean {
235+ if (! reason ) return false ;
236+ if (reason .includes (' EarlyValidation' )) return true ;
237+ if (reason .includes (' Template format error' )) return true ;
238+ return false ;
214239}
215240
216241// In analyzeChangeSet:
@@ -219,9 +244,14 @@ if (changeSet.Status === 'FAILED') {
219244 || changeSet .StatusReason ?.includes (' No updates' )) {
220245 return result ; // genuinely no changes
221246 }
222- if (isEarlyValidationFailure (changeSet .StatusReason )) {
247+ if (isRecoverableFailure (changeSet .StatusReason )) {
248+ if (! changeSet .Changes ?.length ) {
249+ // Recoverable failure but 0 Changes — treat as incomplete
250+ return { changes: [], skipped: true , skipReason: ... };
251+ }
223252 // Changes are populated despite FAILED status — fall through
224- print .warn (` Nested changeset FAILED (EarlyValidation): ${changeSet .StatusReason } ` );
253+ } else if (isRoot ) {
254+ // Root always falls through — classification happens per-nested-stack
225255 } else {
226256 // Unknown failure — treat as error, skip this stack
227257 return { changes: [], skipped: true , skipReason: ... };
@@ -256,25 +286,59 @@ changes must be filtered out.
256286The filter applies after changeset analysis and before the rollback
257287safety decision:
258288
289+ Two types of expected changes:
290+
291+ 1 . ** Direct DeletionPolicy changes** — ` Action: Modify ` ,
292+ ` Scope: ['DeletionPolicy'] ` . CFN reports DeletionPolicy as a
293+ first-class Scope value.
294+
295+ 2 . ** Cascading IAM Policy changes** — When DeletionPolicy changes on
296+ a DynamoDB table, CFN flags IAM policies referencing ` TableName.Arn `
297+ as Dynamic ResourceAttribute re-evaluations. These have
298+ ` Action: Modify ` , ` Scope: ['Properties'] ` ,
299+ ` ChangeSource: ResourceAttribute ` , ` Evaluation: Dynamic ` ,
300+ ` RequiresRecreation: Never ` , and ` CausingEntity ` matching
301+ ` *Table.Arn ` or ` *Table.StreamArn ` .
302+
259303``` typescript
260304function isExpectedLockDrift(change : ResourceChangeWithNested ): boolean {
261- // A change is expected lock drift if:
262- // 1. Action is 'Modify'
263- // 2. Scope is exactly ['DeletionPolicy'] (confirmed empirically —
264- // CFN uses 'DeletionPolicy' as a first-class Scope value)
265- // 3. Details show DirectModification of DeletionPolicy attribute
266- // with RequiresRecreation: 'Never'
267- return change .Action === ' Modify'
268- && change .Scope ?.length === 1
269- && change .Scope [0 ] === ' DeletionPolicy' ;
305+ if (change .Action !== ' Modify' ) return false ;
306+
307+ // Direct DeletionPolicy change
308+ if (change .Scope ?.length === 1 && change .Scope [0 ] === ' DeletionPolicy' ) return true ;
309+
310+ // Cascading IAM Policy change from DeletionPolicy on a DDB table
311+ if (change .ResourceType === ' AWS::IAM::Policy'
312+ && change .Scope ?.length === 1 && change .Scope [0 ] === ' Properties'
313+ && change .Details ?.length ) {
314+ return change .Details .every (d =>
315+ d .ChangeSource === ' ResourceAttribute'
316+ && d .Evaluation === ' Dynamic'
317+ && d .Target ?.RequiresRecreation === ' Never'
318+ && / Table\. (Arn| StreamArn)$ / .test (d .CausingEntity ?? ' ' )
319+ );
320+ }
321+
322+ return false ;
270323}
271324
272- const realDrift = changes .filter (c => ! isExpectedLockDrift (c ));
325+ // Recursive tree walk — CloudFormation::Stack entries are structural
326+ // wrappers; real drift is at the leaf level.
327+ function hasRealDrift(changes : ResourceChangeWithNested []): boolean {
328+ for (const change of changes ) {
329+ if (change .nestedChanges ?.length ) {
330+ if (hasRealDrift (change .nestedChanges )) return true ;
331+ } else if (change .ResourceType !== ' AWS::CloudFormation::Stack' ) {
332+ if (! isExpectedLockDrift (change )) return true ;
333+ }
334+ }
335+ return false ;
336+ }
273337```
274338
275- If ` realDrift ` is empty after filtering, lock rollback can proceed
276- safely. If there is any real drift, lock rollback must abort — the
277- environment is in an inconsistent state.
339+ If ` hasRealDrift ` returns false after filtering, lock rollback can
340+ proceed safely. If there is any real drift, lock rollback must abort
341+ — the environment is in an inconsistent state.
278342
279343### What is NOT changed
280344
@@ -310,16 +374,19 @@ update the Gen1 templates to reflect the post-migration state
310374step. Once templates are updated, EarlyValidation passes and this risk
311375disappears.
312376
313- ### R3 — Non-EarlyValidation failures conflated with EarlyValidation
377+ ### R3 — Recoverable failure classification is string-based
314378
315- If CFN introduces new failure modes whose StatusReason strings don't
316- match our ` isEarlyValidationFailure ` check, we'd incorrectly treat
317- them as hard errors (skip the stack). This is the safe direction —
318- false negatives (missed drift) are worse than false positives
319- (unnecessary skip) for the lock rollback use case.
379+ The ` isRecoverableFailure ` function matches ` "EarlyValidation" ` and
380+ ` "Template format error" ` in StatusReason strings. If CFN introduces
381+ new failure modes or changes wording, we may either miss recoverable
382+ failures (treating them as hard errors — safe direction) or
383+ incorrectly treat genuine failures as recoverable (reading incomplete
384+ Changes — unsafe direction). The latter risk is mitigated: recoverable
385+ failures with 0 Changes are treated as incomplete, not clean.
320386
321387* Mitigation* : Log all failure reasons. Expand the pattern match as
322- new failure types are observed.
388+ new failure types are observed. Empty-Changes guard ensures
389+ false-recoverable classification fails closed.
323390
324391### R4 — DeletionPolicy filter accuracy
325392
@@ -329,10 +396,21 @@ classified as expected) would allow lock rollback to proceed when the
329396environment is inconsistent. A false negative (expected drift
330397classified as real) would block lock rollback unnecessarily.
331398
332- * Mitigation* : The lock step knows exactly which resources it modified
333- and what it changed. The filter can be informed by the lock step's
334- output (list of resources + properties changed) rather than relying on
335- heuristic pattern matching on changeset details.
399+ The filter has two paths: direct DeletionPolicy scope matching (tight)
400+ and cascading IAM Policy matching (broader — checks ` ChangeSource ` ,
401+ ` Evaluation ` , ` RequiresRecreation ` , and ` CausingEntity ` pattern
402+ ` *Table.Arn|StreamArn ` ). The IAM path could theoretically false-positive
403+ if a non-DeletionPolicy change triggers an identical Dynamic
404+ ResourceAttribute re-evaluation pattern on an IAM policy referencing
405+ a resource whose logical ID ends in "Table". In practice this is
406+ extremely unlikely — the pattern only matches during lock rollback when
407+ DeletionPolicy is the only template modification.
408+
409+ * Mitigation* : The ` CausingEntity ` regex anchors to ` Table.Arn ` or
410+ ` Table.StreamArn ` . Scope is required to be exactly ` ['Properties'] ` .
411+ All Details must be Dynamic/ResourceAttribute/Never — any static or
412+ direct modification fails the filter. 28 unit tests cover positive
413+ and negative cases.
336414
337415## Consequences
338416
@@ -356,7 +434,7 @@ heuristic pattern matching on changeset details.
356434- The ` CreateChangeSetCommand ` call and its parameters
357435- The recursive traversal of nested changesets via ` ChangeSetId `
358436- The "no changes" detection path
359- - The drift formatter and its console URL generation
437+ - The drift formatter ( console URL format was fixed separately)
360438- The Phase 1 (CFN drift detection) and Phase 3 (local drift) paths
361439
362440### What gets removed
0 commit comments