-
Notifications
You must be signed in to change notification settings - Fork 575
fix: always return PhysicalResourceID for CFn CustomResources #613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,7 +10,6 @@ import ( | |
| "net/http" | ||
|
|
||
| "github.com/aws/aws-lambda-go/events" | ||
| "github.com/aws/aws-lambda-go/lambdacontext" | ||
| ) | ||
|
|
||
| // CustomResourceLambdaFunction is a standard form Lambda for a Custom Resource. | ||
|
|
@@ -29,11 +28,19 @@ func lambdaWrapWithClient(lambdaFunction CustomResourceFunction, client httpClie | |
| fn = func(ctx context.Context, event Event) (reason string, err error) { | ||
| r := NewResponse(&event) | ||
|
|
||
| // A previous physical resource id exists unless this is a create request. | ||
| fallbackPhysicalResourceID := event.PhysicalResourceID | ||
| if event.RequestType == RequestCreate { | ||
| // If this is a create request, the fallback should be the request ID | ||
| fallbackPhysicalResourceID = event.RequestID | ||
| } | ||
|
|
||
| funcDidPanic := true | ||
| defer func() { | ||
| if funcDidPanic { | ||
| r.Status = StatusFailed | ||
| r.Reason = "Function panicked, see log stream for details" | ||
| r.PhysicalResourceID = fallbackPhysicalResourceID | ||
| // FIXME: something should be done if an error is returned here | ||
| _ = r.sendWith(client) | ||
| } | ||
|
|
@@ -42,17 +49,17 @@ func lambdaWrapWithClient(lambdaFunction CustomResourceFunction, client httpClie | |
| r.PhysicalResourceID, r.Data, err = lambdaFunction(ctx, event) | ||
| funcDidPanic = false | ||
|
|
||
| if r.PhysicalResourceID == "" { | ||
| r.PhysicalResourceID = fallbackPhysicalResourceID | ||
| log.Printf("PhysicalResourceID not set. Using fallback PhysicalResourceID: %s\n", r.PhysicalResourceID) | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I remembered #237, which resulted in a revert of a similar change. As I understand from re-reading that, if a create failure status response contains a physical resource id, CloudFormation will issue a followup delete request. Which makes sense, however, some existing setups might not handle a delete with an unrecognized physical resource id. And such a request today would not be issued, as CloudFormation instead swallows the creation failure with "invalid physical resource id". Which sucks, because it means resolving the related #107 means re-introducing these (unexpected for some) deletes on create errors.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
As far as I could see, CFn always issues a delete request if create failed regardless of whether the The problem with #237 was that, for DELETE request, a new Actually, this may tie back to the discussion on whether
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm surprised #237 saw delete fail because of different Physical Resource ID. I wonder how/why they saw it. I didn't. Maybe CFn behavior changed in all this time, maybe it had to do with their logic (which I agree we should still support). Actually, what they described is similar to the one time I saw my stack get stuck and then failed to replicate it. It may be that I just need to think more about the methodology for testing delete.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My assumption about about a delete not being issued is wrong. Or at least, it's wrong today. Perhaps there was a behavior change since 2019 🤷♂️ When I try to validate this That error infact, does occur, a delete is issued, and we can see that in the stack events, CloudFormation came up with it's own PhysicalResourceId to pass in a delete when we failed to provide a valid one. Notice below, after the create fails for our invalid physical resource id, CloudFormation starts plumbing something else.
I suspect whatever issue occurred for this customer, (assuming the revert resolved this issue), could have began re-occuring at some point, but with that funny logical id looking thing rather than the funny log stream id that got reverted.
This feels familiar to me too, for an app I had where the API call was timing out. I'll comment on #107 with this context too. #108 could probably be re-applied (either directly, or as a part of this PR's changes) given this observation.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
After reading more, I suspect #237 was a user-error. They were facing the issue all the way back in 2016 (they referenced this: remind101/empire/918). This was before #108 was introduced. CFn behavior was the same back then. As you noted, according to remind101/empire/918, it seems that even if we don't return a |
||
| } | ||
|
|
||
| if err != nil { | ||
| r.Status = StatusFailed | ||
| r.Reason = err.Error() | ||
| log.Printf("sending status failed: %s", r.Reason) | ||
| log.Printf("sending status failed: %s\n", r.Reason) | ||
| } else { | ||
| r.Status = StatusSuccess | ||
|
|
||
| if r.PhysicalResourceID == "" { | ||
| log.Println("PhysicalResourceID must exist on creation, copying Log Stream name") | ||
| r.PhysicalResourceID = lambdacontext.LogStreamName | ||
| } | ||
| } | ||
|
|
||
| err = r.sendWith(client) | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking falling back to the LogStreamName for
RequestCreateis more backwards compatible.However, switching to RequestId may be better for correctness with multiple custom resources in the same stack. There may well be another bug lurking here that needs fixing. But that could also land as a separate PR.