Issues in a new region #97

Open
NickDarvey opened this issue Nov 17, 2021 · 6 comments

Comments

@NickDarvey

I'm trying to set up our organization to work in af-south-1, a relatively new region, but org-formation times out and fails while registering NoDefaultVpcRp:

ERROR: Workload NoDefaultVpcRp in 1234/af-south-1 updated failed. reason: Account seems stuck initializing. (1234 = MyOldAccount)
ERROR: Workload NoDefaultVpcRp in 5678/af-south-1 updated failed. reason: Account seems stuck initializing. (5678 = MyOtherAccount)
...

I see that af-south-1 is not a 'known' region, but do any of these community resource providers depend on knowing the regions? It didn't seem like NoDefaultVpcRp does. Do you have any tips for investigating this further?

@OlafConijn
Member

The cli should use the list of known regions to log a warning (in case someone misspelled a region) - no more. I'll add the region regardless, thanks for pointing this out.

When a new account is created, AWS services get bootstrapped, and during that bootstrapping process a number of errors get thrown. There is a retry and wait period that should handle this; apparently it did not.

https://github.com/org-formation/org-formation-cli/blob/d55936a472a9f11d3c126e965ed1e4a3204a4e61/src/util/aws-util.ts#L387-L390
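For readers who do not want to open the link, the retry-and-wait idea can be sketched roughly as below. This is a minimal illustration in Python, not the actual org-formation code: `AwsError`, `call_with_retry`, the attempt count, and the back-off are all hypothetical, and the only error code taken from this thread is `InvalidClientTokenId`.

```python
import time

# Error codes treated as "account is still initializing".
# InvalidClientTokenId is the code mentioned in this thread; treating it
# as retryable here is an assumption made for illustration.
RETRYABLE_CODES = {"InvalidClientTokenId"}


class AwsError(Exception):
    """Hypothetical stand-in for an SDK error that carries an error code."""

    def __init__(self, code, message=""):
        super().__init__(message or code)
        self.code = code


def call_with_retry(operation, max_attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call operation(); retry while it raises a retryable error code."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except AwsError as err:
            if err.code not in RETRYABLE_CODES:
                raise  # unrelated failure: surface it immediately
            if attempt == max_attempts:
                # Chain the original error so the underlying code stays visible.
                raise RuntimeError("Account seems stuck initializing.") from err
            sleep(delay_seconds * attempt)  # linear back-off between attempts
```

One consequence of this pattern is that a permanent error in the retryable set ends up reported as "stuck initializing", so keeping the original error attached (as the `from err` does here) helps with diagnosis.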

What I would be interested in is to see whether there are CloudFormation stacks that have failed or are stuck updating.

It might also be that retrying this once more later on solves the issue. Maybe there was a glitch on the AWS side of things?

thanks

@NickDarvey
Author

NickDarvey commented Nov 17, 2021

I waited ~18 hours and tried again with the same result, but that aws-util.ts code gave me a hint: InvalidClientTokenId.

STS tokens from the global endpoint don't work in af-south-1 by default, which I can demonstrate with something simple:

~\source\repos\example ≢* +4 ~16  3  796ms
❯ aws sts get-caller-identity --profile Operator-Workspace --region us-east-1
{
    "UserId": "XYZ:nick@example.com",
    "Account": "1234",
    "Arn": "arn:aws:sts::1234:assumed-role/AWSReservedSSO_Operator_1234/nick@example.com"
}


~\source\repos\example ≢* +4 ~16  3  1.618s
❯ aws sts get-caller-identity --profile Operator-Workspace --region af-south-1

An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid

Following the guidance from that AWS doc and enabling all regions for STS tokens meant I could now:

~\source\repos\example ≢* +4 ~16  3  796ms
❯ aws sts get-caller-identity --profile Operator-Workspace --region af-south-1
{
    "UserId": "XYZ:nick@example.com",
    "Account": "1234",
    "Arn": "arn:aws:sts::1234:assumed-role/AWSReservedSSO_Operator_1234/nick@example.com"
}

and deploy my org-formation:

INFO: Workload NoDefaultVpcRp in 1234/af-south-1 updated successful. (1234 = ManagementAccount)
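For anyone who prefers to script the setting change described above, a minimal sketch follows. It assumes a boto3 IAM client is passed in; `enable_v2_global_tokens` is a hypothetical helper name, and the setting is per account, so it would presumably need to be applied with credentials for each account where the tokens are issued.

```python
def enable_v2_global_tokens(iam_client):
    """Ask IAM to issue version 2 tokens from the global STS endpoint.

    Version 2 tokens are valid in all regions, including opt-in ones.
    iam_client is assumed to behave like boto3.client("iam").
    """
    iam_client.set_security_token_service_preferences(
        GlobalEndpointTokenVersion="v2Token"
    )


# Usage (requires boto3 and credentials allowed to change this setting):
#   import boto3
#   enable_v2_global_tokens(boto3.client("iam"))
```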

Thanks for the hint @OlafConijn!

@NickDarvey
Author

NickDarvey commented Nov 18, 2021

After applying the workaround described in org-formation/org-formation-cli#292, I am now running into:

ERROR: Workload NoDefaultVpcRp in 1234/af-south-1 updated failed. reason: User: arn:aws:sts::1234:assumed-role/OrganizationFormationBuildAccessRole/OrganizationFormationBuild is not authorized to perform: cloudformation:UpdateStack on resource: arn:aws:cloudformation:af-south-1:1234:stack/community-organizations-nodefaultvpc-resource-role/* with an explicit deny in a service control policy (1234 = WorkspaceAccount)
User: arn:aws:sts::1234:assumed-role/OrganizationFormationBuildAccessRole/OrganizationFormationBuild is not authorized to perform: cloudformation:UpdateStack on resource: arn:aws:cloudformation:af-south-1:1234:stack/community-organizations-nodefaultvpc-resource-role/* with an explicit deny in a service control policy
AccessDenied: User: arn:aws:sts::1234:assumed-role/OrganizationFormationBuildAccessRole/OrganizationFormationBuild is not authorized to perform: cloudformation:UpdateStack on resource: arn:aws:cloudformation:af-south-1:1234:stack/community-organizations-nodefaultvpc-resource-role/* with an explicit deny in a service control policy
    at Request.extractError (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\protocol\query.js:50:29)
    at Request.callListeners (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\sequential_executor.js:106:20)
    at Request.emit (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\sequential_executor.js:78:10)
    at Request.emit (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\request.js:688:14)
    at Request.transition (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\request.js:22:10)
    at AcceptorStateMachine.runTo (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\state_machine.js:14:12)
    at node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\state_machine.js:26:10
    at Request.<anonymous> (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\request.js:38:9)
    at Request.<anonymous> (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\request.js:690:12)
    at Request.callListeners (node_modules\aws-sdk@2.949.0\node_modules\aws-sdk\lib\sequential_executor.js:116:18)

This occurs for every account except for my root/management account.

Looking at the SCPs in the AWS Organization I can see two: DenyLargeEC2Instances and DenyUnsupportedRegions, the latter with these contents:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "ap-southeast-1",
            "ap-southeast-2",
            "us-east-1"
          ]
        }
      },
      "Resource": "*",
      "Effect": "Deny",
      "NotAction": [
        "acm:*",
        "budgets:*",
        "chatbot:*",
        "cloudfront:*",
        "iam:*",
        "sts:*",
        "kms:*",
        "route53:*",
        "route53domains:*",
        "route53resolver:*",
        "organizations:*",
        "support:*",
        "waf:*",
        "wafv2:*"
      ],
      "Sid": "DenyUnsupportedRegions"
    }
  ]
}

Notably it does not deny cloudformation:UpdateStack. (It doesn't contain af-south-1 yet because the SCPs are deployed after the 'types' in org-formation-reference.)
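A note on how a statement like this is evaluated, as a diagnostic aid: a Deny statement with NotAction applies to every action that is not in the list, so it can cover cloudformation:UpdateStack even though that action is never named. The sketch below is a simplified model of that one statement, not the real IAM evaluation engine; `is_denied` is a hypothetical helper and the NotAction list is shortened.

```python
from fnmatch import fnmatchcase

# The DenyUnsupportedRegions statement quoted above, with NotAction shortened.
STATEMENT = {
    "Effect": "Deny",
    "NotAction": ["iam:*", "sts:*", "organizations:*"],
    "Resource": "*",
    "Condition": {
        "StringNotEquals": {
            "aws:RequestedRegion": ["ap-southeast-1", "ap-southeast-2", "us-east-1"]
        }
    },
}


def is_denied(statement, action, region):
    """Return True if this Deny statement applies to the action in the region."""
    # NotAction: the statement covers every action matching none of the patterns.
    exempt = any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in statement["NotAction"]
    )
    # StringNotEquals with several values holds only if the region equals none of them.
    allowed_regions = statement["Condition"]["StringNotEquals"]["aws:RequestedRegion"]
    outside_allowed_regions = region not in allowed_regions
    return statement["Effect"] == "Deny" and not exempt and outside_allowed_regions


print(is_denied(STATEMENT, "cloudformation:UpdateStack", "af-south-1"))  # True
print(is_denied(STATEMENT, "cloudformation:UpdateStack", "us-east-1"))   # False
print(is_denied(STATEMENT, "sts:AssumeRole", "af-south-1"))              # False
```

Under this reading, any CloudFormation call in af-south-1 is denied for member accounts until af-south-1 appears in the region list. SCPs do not apply to the management account, which would be consistent with the error showing up everywhere except there.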

Do you have any tips for diagnosing this?

@NickDarvey NickDarvey reopened this Nov 18, 2021
@OlafConijn
Member

Right, I think the order of these tasks needs to be changed in the reference, indeed.
What you can do to work around this is:

  • log into the management account, navigate to organizations, then service control policies
  • find the service control policy called "DenyUnsupportedRegions" and manually add af-south-1 to the list of regions (under aws:RequestedRegion), save the SCP
  • rerun org formation.
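The manual edit in the steps above amounts to a small change to the policy document. A minimal sketch of that change, assuming the policy JSON has already been loaded into a dict and that `aws:RequestedRegion` holds a list as in the policy quoted earlier (`add_allowed_region` is a hypothetical helper, not part of org-formation):

```python
import copy


def add_allowed_region(policy, region, sid="DenyUnsupportedRegions"):
    """Return a copy of the SCP document with region added to the allowed list."""
    updated = copy.deepcopy(policy)  # leave the caller's document untouched
    for statement in updated["Statement"]:
        if statement.get("Sid") != sid:
            continue
        regions = statement["Condition"]["StringNotEquals"]["aws:RequestedRegion"]
        if region not in regions:
            regions.append(region)
            regions.sort()  # keep the list in a stable, readable order
    return updated
```

The result would still have to be saved back to the SCP, either in the console as described or by whatever deployment mechanism manages the policy.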

Looking forward to hearing whether that got you unstuck. I think this is a great gotcha; I will make sure it gets fixed in the reference project.

@NickDarvey
Author

NickDarvey commented Nov 21, 2021

Success!

INFO: Executing: register-type NoDefaultVpcRp.
INFO: Workload NoDefaultVpcRp in 1234/af-south-1 updated successful. (1234 = Account1)
INFO: Workload NoDefaultVpcRp in 5678/af-south-1 updated successful. (5678 = Account2)
INFO: Workload NoDefaultVpcRp in 1337/af-south-1 updated successful. (1337 = Account3)
...
INFO: Workload NoDefaultVpcRp in 0000/af-south-1 updated successful. (0000 = AccountN)

So I guess this NoDefaultVpcRp provider is relying on one of the resources described in the DenyUnsupportedRegions SCP?

@sshvetsov
Contributor

sshvetsov commented Dec 26, 2022

I've also encountered the error when trying to register the Community::Organizations::NoDefaultVPC resource provider in the non-default ap-southeast-3 region with the register-type task:

ERROR: Workload NoDefaultVpcRpInOptedInRegions in 1234/ap-southeast-3 updated failed. reason: Account seems stuck initializing. (1234 = Account1)

As part of my testing, I've managed to install the resource provider using the AWS CLI, so the problem appears to be in OFN.

If I understand the cause correctly, it's because OFN is trying to use the global STS endpoint (sts.amazonaws.com) when assuming a role in non-default regions instead of the regional one (sts.ap-southeast-3.amazonaws.com). CMIIW.

Is there a plan to make OFN use regional STS endpoints or should we rely on the workaround of manually setting the version of the global endpoint token to v2Token?
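To make the difference between the two endpoints concrete, here is a minimal sketch of how a regional STS endpoint URL is formed. It assumes the standard `aws` partition (other partitions use different domain suffixes), and `regional_sts_endpoint` is a hypothetical helper, not something OFN exposes.

```python
GLOBAL_STS_ENDPOINT = "https://sts.amazonaws.com"


def regional_sts_endpoint(region):
    """Build the regional STS endpoint URL for the standard aws partition."""
    return f"https://sts.{region}.amazonaws.com"


print(regional_sts_endpoint("ap-southeast-3"))  # https://sts.ap-southeast-3.amazonaws.com

# With boto3, a client could be pointed at the regional endpoint explicitly:
#   import boto3
#   region = "ap-southeast-3"
#   sts = boto3.client("sts", region_name=region,
#                      endpoint_url=regional_sts_endpoint(region))
```

Tokens issued by a regional endpoint are valid in all regions, which is why this route avoids the v2Token account setting.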


3 participants