Retry TestCustomAuthorizerApp deployment on transient IAM role propagation failure #2451
Draft
GarrettBeatty wants to merge 1 commit into
…ation failure

The TestCustomAuthorizerApp integration test stack deploys many Lambda functions that reference IAM roles created in the same stack. CloudFormation occasionally calls Lambda CreateFunction before the role's trust policy has propagated through IAM, producing "The role defined for the function cannot be assumed by Lambda" and rolling the whole stack back, which fails all 20 tests in the project.

Wrap the deploy in a retry loop (3 attempts). Between attempts, delete the rolled-back stack (a ROLLBACK_COMPLETE stack cannot be re-created) and pause briefly to let IAM settle. Surface CloudFormation failed-resource events on each failure for easier debugging.
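As a rough illustration of what "surfacing failed-resource events" could involve, the sketch below filters a list of stack events down to the failed ones and checks for the role error quoted above. It is written in Python for brevity and is not the code in this PR. The dictionary keys follow the CloudFormation stack-event shape (`LogicalResourceId`, `ResourceStatus`, `ResourceStatusReason`); the function names and the sample events are hypothetical.

```python
from typing import Iterable, List, Mapping, Tuple

# Error text quoted in the commit message above.
ROLE_NOT_ASSUMABLE = "The role defined for the function cannot be assumed by Lambda"


def failed_resource_events(events: Iterable[Mapping[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (logical id, status, reason) for every event whose status ends in _FAILED."""
    return [
        (e["LogicalResourceId"], e["ResourceStatus"], e.get("ResourceStatusReason", ""))
        for e in events
        if e["ResourceStatus"].endswith("_FAILED")
    ]


def is_role_propagation_failure(events: Iterable[Mapping[str, str]]) -> bool:
    """True if any failed event carries the 'role cannot be assumed' reason."""
    return any(ROLE_NOT_ASSUMABLE in reason for _, _, reason in failed_resource_events(events))


# Hypothetical events, made up for illustration only (not taken from the CI run).
SAMPLE_EVENTS = [
    {"LogicalResourceId": "AuthorizerFunctionRole", "ResourceStatus": "CREATE_COMPLETE"},
    {
        "LogicalResourceId": "AuthorizerFunction",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": ROLE_NOT_ASSUMABLE + ".",
    },
    {
        "LogicalResourceId": "OtherFunction",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "Resource creation cancelled",
    },
]

if __name__ == "__main__":
    for logical_id, status, reason in failed_resource_events(SAMPLE_EVENTS):
        print(f"{logical_id}: {status}: {reason}")
```

Listing every failed event, rather than only the last one, helps separate the function that actually hit the role error from siblings that were merely cancelled afterwards.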
Problem
The `durabletesting3` CI job (run 28258443522) failed in the `TestCustomAuthorizerApp.IntegrationTests` project. All 20 tests failed because the fixture's CloudFormation deployment rolled back with the error "The role defined for the function cannot be assumed by Lambda".

This is a transient IAM eventual-consistency race: the stack creates a per-function IAM role and immediately calls Lambda `CreateFunction`, but the role's trust policy hasn't propagated through IAM yet. Once one function fails, CloudFormation cancels the other in-flight resources and rolls the whole stack back (`ROLLBACK_COMPLETE`), failing every test in the project. It is unrelated to the PR's code changes and typically passes on re-run.

Fix

Wrap `dotnet lambda deploy-serverless` in `DeploymentScript.ps1` with a retry loop (3 attempts). Between attempts, delete the rolled-back stack, since a `ROLLBACK_COMPLETE` stack cannot be updated or re-created, and `wait stack-delete-complete` before retrying.

This makes the integration test resilient to the transient IAM propagation failure instead of failing the whole CI job.
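The control flow described above can be sketched as follows. This is a language-neutral illustration in Python, not the actual `DeploymentScript.ps1` change: the function name, the injected `deploy` and `delete_stack` callables, and the 30-second pause are all assumptions (the PR only says the script pauses briefly).

```python
import time
from typing import Callable


def deploy_with_retry(
    deploy: Callable[[], bool],
    delete_stack: Callable[[], None],
    max_attempts: int = 3,
    pause_seconds: float = 30.0,  # assumed value; the PR only says "pause briefly"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run deploy() up to max_attempts times, cleaning up between failures.

    deploy() returns True on success. After a failed attempt that is not the
    last one, delete_stack() must remove the rolled-back stack and block until
    the deletion has finished, because a ROLLBACK_COMPLETE stack cannot be
    updated or re-created. Returns the 1-based number of the attempt that
    succeeded, or raises RuntimeError when every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if deploy():
            return attempt
        if attempt == max_attempts:
            break  # out of attempts: leave the failed stack in place for inspection
        delete_stack()
        sleep(pause_seconds)  # give IAM time to settle before the next attempt
    raise RuntimeError(f"deployment failed after {max_attempts} attempts")
```

In a real script, `deploy` would shell out to `dotnet lambda deploy-serverless` and check its exit code, and `delete_stack` would run `aws cloudformation delete-stack` followed by `aws cloudformation wait stack-delete-complete`. Skipping the cleanup after the final attempt is a design choice in this sketch, made so the failed stack's events stay available for debugging; the PR may handle that case differently.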
Testing
… ([Parser]::ParseFile).
… (break).