This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Multiple onefuzz deployments under single subscription #1297

Closed
uday-infosec opened this issue Sep 28, 2021 · 14 comments

Comments

@uday-infosec

I have been trying to deploy different versions of onefuzz (3.0.0 and 2.16.0) under a single subscription. The deployment succeeds (I do change the client app registration name, i.e. instead of onefuzz-cli, I give it a version-specific name, e.g. onefuzz-cli-3-0-0). When I install the CLI and configure it, I am not able to use it to create pools, create VMs, or schedule jobs. I get the error below:

ERROR:cli:command failed: error: invalid_client
'AADSTS7000215: Invalid client secret is provided.
Trace ID:
Correlation ID:
Timestamp: 2021-09-28 20:43:14Z'

Can I have multiple instances of onefuzz running under a single subscription? If yes, could you let me know why my CLI is failing to connect to the instance?

@uday-infosec uday-infosec added the bug Something isn't working label Sep 28, 2021
@ghost ghost added the Needs: triage label Sep 28, 2021
@chkeita
Contributor

chkeita commented Sep 28, 2021

Yes, it is possible to deploy multiple instances under the same subscription.

Do you have a client_secret in your config? You can run onefuzz config and check whether client_secret is set.
If it is, you can rerun the config command with the reset option:
onefuzz config --endpoint <endpoint> --authority <authority> --client_id <client_id> --reset true
This will remove the client_secret value.

@uday-infosec
Author

Yes, I have tried that. But with the 3.0.0 deployment, I have been getting the error below when I execute onefuzz versions check --exact

WARNING:nsv-backend:failed to get access token with scope https://<>/.default
ERROR:cli:command failed: error: invalid_resource
The resource principal named <> was not found in the tenant named Default Directory. This can happen if the application has not been installed by the administrator of the tenant or consented to by any user in the tenant. You might have sent your authentication request to the wrong tenant.

This is a normal deployment with no permission issues during deployment. I am the owner of the subscription and the account.

@chkeita
Contributor

chkeita commented Sep 28, 2021

Are you using a client_secret in your config?

@ranweiler
Member

ranweiler commented Sep 29, 2021

@stmh-infosec, you shouldn't have to rename the onefuzz-cli app registration. The original should have worked fine for all instances in the tenant, and its existence should not have caused a deployment error. If it caused errors on your first deployment attempts, and that's why you renamed it, that is noteworthy.

Please share the exact order of operations you did, e.g.

  1. Deployed 2.16 instance
  2. Changed name of onefuzz-cli app registration
  3. Deployed 3.0 instance

(or re-arranged as appropriate, depending on what you did)

Also, could you please share the redacted output of onefuzz config? We only need to see which keys are present, not their values.

Suppose it looks something like this:

{
    "authority": "https://login.microsoftonline.com/<AUTHORITY>",
    "client_id": "<CLI_CLIENT_ID>",
    "client_secret": "<REDACTED>",
    "endpoint": "https://<INSTANCE>.azurewebsites.net"
}

(I will refer to the actual values below using the placeholder strings above)
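If it helps with sharing only the keys, here is a minimal Python sketch that redacts the values. It assumes you paste in just the JSON object printed by onefuzz config (without any log lines), and the sample values shown are hypothetical placeholders, not real identifiers.

```python
import json


def redact_config(config_text: str) -> str:
    """Keep every key, but replace non-empty string values with a placeholder."""
    config = json.loads(config_text)
    redacted = {
        key: "<REDACTED>" if isinstance(value, str) and value else value
        for key, value in config.items()
    }
    return json.dumps(redacted, indent=4, sort_keys=True)


# Hypothetical sample input; paste the real output of `onefuzz config` instead.
sample = """
{
    "authority": "https://login.microsoftonline.com/example-tenant",
    "client_id": "11111111-1111-1111-1111-111111111111",
    "endpoint": "https://example-instance.azurewebsites.net",
    "features": []
}
"""
print(redact_config(sample))
```

Non-string values such as the features list are left untouched, so it stays visible whether a key like client_secret is present at all.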

Big picture: for the CLI client to work, we expect that there is an App Registration named <INSTANCE>, and that it has a pre-authorized app registration in the same tenant named onefuzz-cli. The onefuzz-cli app registration should have an Application ID that matches CLI_CLIENT_ID in your config. It is fine (and expected) if many distinct instance app registrations in the same tenant use the same onefuzz-cli app registration. I will describe how to check this below.

Each app registration has a "manifest", accessible via the "Manifest" menu item in the "Manage" section of the Resource Menu (left pane) of the "App registrations" blade in the Azure Portal. The manifest is a JSON document, and it has a key named preAuthorizedApplications (usually near the end). The value of that key should be a list of JSON objects, and each object should have an appId. The appId is a UUID, and it should match the Application ID of the onefuzz-cli app registration, which should be configured as the client_id in your local OneFuzz CLI config. You should verify that the pre-authorized application exists in your AAD tenant. You can do this in the portal, or with the Azure CLI by running az ad app show --id <CLI_CLIENT_ID>.

Can you please check which app registrations are present in preAuthorizedApplications for each of your instances, and compare those values to client_id in your config?
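As a rough aid for that comparison, the following Python sketch reads manifest JSON that you have copied out of the portal and reports, per instance, whether the CLI's client_id is pre-authorized. The function names, instance names, and UUIDs are hypothetical and only for illustration; the only manifest fields it relies on are preAuthorizedApplications and appId, as described above.

```python
import json
from typing import Dict, List


def preauthorized_app_ids(manifest_text: str) -> List[str]:
    """Return the appId of every entry under preAuthorizedApplications."""
    manifest = json.loads(manifest_text)
    entries = manifest.get("preAuthorizedApplications") or []
    return [entry["appId"] for entry in entries if "appId" in entry]


def instances_trusting_client(manifests: Dict[str, str], client_id: str) -> Dict[str, bool]:
    """Map each instance name to True if its manifest pre-authorizes client_id."""
    wanted = client_id.strip().lower()
    return {
        name: wanted in {app_id.lower() for app_id in preauthorized_app_ids(text)}
        for name, text in manifests.items()
    }


# Hypothetical manifests for two instances; replace with the real manifest JSON.
cli_client_id = "11111111-1111-1111-1111-111111111111"
manifests = {
    "instance-2-16": json.dumps(
        {"preAuthorizedApplications": [{"appId": cli_client_id, "permissionIds": []}]}
    ),
    "instance-3-0": json.dumps({"preAuthorizedApplications": []}),
}
print(instances_trusting_client(manifests, cli_client_id))
```

UUIDs are compared case-insensitively, since the portal and a local config may differ in letter case.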

Also, in your last comment, I take it that you redacted a UUID here:

The resource principal named <> was not found in the tenant named Default Directory. This can happen if the application has not been installed by the administrator of the tenant or consented to by any user in the tenant. You might have sent your authentication request to the wrong tenant.

Could you try running both az ad app show --id $ID and az ad sp show --id $ID for that redacted UUID, and tell us whether you get results or not? (You don't need to share them.) I'm interested in whether the resource is really present in your tenant.
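If you prefer to script the two lookups, here is a minimal sketch in Python. It assumes az is installed, on your PATH as an executable, and already logged in to the right tenant; the helper names are hypothetical. Note that a non-zero exit code can also come from other failures (for example, not being logged in), so treat False as "not confirmed" rather than proof of absence.

```python
import subprocess
from typing import Callable, Dict


def az_lookup(kind: str, identifier: str, run: Callable = subprocess.run) -> bool:
    """Return True if `az ad <kind> show --id <identifier>` exits with code 0."""
    result = run(
        ["az", "ad", kind, "show", "--id", identifier],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def presence_report(identifier: str, run: Callable = subprocess.run) -> Dict[str, bool]:
    """Check both the app registration ("app") and the service principal ("sp")."""
    return {kind: az_lookup(kind, identifier, run) for kind in ("app", "sp")}


# Usage (runs the real Azure CLI, so it is left as a comment here):
# print(presence_report("<the redacted identifier>"))
```

The run parameter exists only so the lookup can be exercised without calling the real Azure CLI.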

@uday-infosec
Author

There is also this weird error that arises every time I deploy version 3.0.0.

Issue 1:
Version: 3.0.0

Steps to reproduce:

  1. Enable virtualenv, install requirements and run deploy script.
  2. The deployment succeeds, but with the below error:

ERROR:azure.cosmosdb.table.common.storageclient:Client-Request-ID=bfa3e534-20f6-11ec-b4c3-a16fd243789e Retry policy did not allow for a retry: Server-Timestamp=Wed, 29 Sep 2021 07:27:48 GMT, Server-Request-ID=91a4b6e8-b002-0055-4d03-b5bf84000000, HTTP status code=404, Exception=Not Found{"odata.error":{"code":"ResourceNotFound","message":{"lang":"en-US","value":"The specified resource does not exist.\nRequestId:91a4b6e8-b002-0055-4d03-b5bf84000000\nTime:2021-09-29T07:27:49.3457071Z"}}}.

Issue 2:
Version: 3.0.0

Steps to reproduce:

  1. Enable virtualenv, install requirements and run deploy script.
  2. The deployment succeeds, but with the above error. Proceed to use CLI version 3.0.0.
  3. Create Client Secret under onefuzz-cli app registration.
  4. Use the config command along with the client_id, client_secret, endpoint and authority. The config succeeds with the below message:
{
    "authority": "https://login.microsoftonline.com/<<Redacted>>",
    "client_id": "<<Redacted>>",
    "client_secret": "***",
    "endpoint": "https://<<Redacted>>.azurewebsites.net",
    "features": []
}
  5. Now that the config command has succeeded, we try the onefuzz pools list command, but end up getting the error below:
    WARNING:nsv-backend:failed to get access token with scope https://<<redacted>>.azurewebsites.net/.default
    ERROR:cli:command failed: error: invalid_resource
    'AADSTS500011: The resource principal named https://<<redacted>>.azurewebsites.net was not found in the tenant named Default Directory. This can happen if the application has not been installed by the administrator of the tenant or consented to by any user in the tenant. You might have sent your authentication request to the wrong tenant.
    Trace ID: 1db....b5f3-fca5ea2
    Correlation ID: 22....44f
    Timestamp: 2021-09-29 07:50:03Z'

This has nothing to do with multiple deployments; it happens with a 3.0.0 deployment on its own. I see a pull request with the fix here (#1300), but I am not sure whether it is available as part of the release binary (the latest release was downloaded and deployed).

@uday-infosec
Author

uday-infosec commented Sep 29, 2021

@ranweiler, regarding the az ad app show --id $ID and az ad sp show --id $ID commands above: when I query the identifier that starts with https://, I get Application 'https://onefuzz<<redacted>>.azurewebsites.net' doesn't exist.

However, when I replace https:// with api://, which I noticed changed in 3.0.0, I get the JSON blob containing the details. I suspect https:// vs api:// might be the issue.
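To make the difference concrete, here is a small Python sketch that builds both forms from the endpoint. It is only an illustration: the https:// form is the scope shown in the warning log, the api:// form is the identifier the lookup succeeded with, and whether the CLI should request the api:// form is an assumption, not something confirmed here. The function names and the example hostname are hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def identifier_with_scheme(endpoint: str, scheme: str) -> str:
    """Rebuild the endpoint's host under a different URI scheme."""
    host = urlsplit(endpoint).netloc
    return f"{scheme}://{host}"


def candidate_scopes(endpoint: str) -> List[str]:
    """Return the https:// and api:// variants of the '/.default' scope."""
    return [identifier_with_scheme(endpoint, scheme) + "/.default" for scheme in ("https", "api")]


# Hypothetical endpoint, standing in for the redacted instance URL.
for scope in candidate_scopes("https://onefuzz-example.azurewebsites.net"):
    print(scope)
```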

@ranweiler
Member

Thanks for the details, @stmh-infosec!

In the first situation, where you deploy, but do not create and configure a client_secret of your own, are you able to successfully invoke CLI commands like onefuzz jobs list?

@uday-infosec
Author

You mean Issue 1 trying to use the config command without client_secret after the deployment succeeds with the error? Whenever I get this (Issue 1) error, I re-run the deployment script with the --upgrade option, and it goes through without the error; then using the cli with the client_secret gets me to Issue 2. I haven't tried using the CLI commands with the Issue 1 error yet, but I can give it a try if needed.

@ranweiler
Member

You mean Issue 1 trying to use the config command without client_secret after the deployment succeeds with the error ?

Exactly. What I'm trying to validate is, in your setup, are you able to get any fully-working deployment, using the standard device code auth flow (not client secret). In other words, even if your initial attempt at deploying stops with an error, and you finish it by re-running with --upgrade, are you able to successfully run onefuzz info get or onefuzz jobs list? So it'd help if you could try those CLI commands after a successful deployment, but before adding a client_secret.

then using the cli with the client_secret gets me to Issue 2

Understood. I do think that, separately, you're probably running into #1299, which @chkeita is addressing in #1300.

@uday-infosec
Author

Here they are:

  1. Resetting using the onefuzz config --reset=true
WARNING:onefuzz:endpoint not configured yet
{
    "authority": "",
    "client_id": "",
    "features": []
}
  2. Configuring using just the client id:
onefuzz config --endpoint https://onefuzz...azurewebsites.net --authority https://login.microsoftonline.com/6da...9ef8 --client_id 39....a8
{
    "authority": "https://login.microsoftonline.com/6d...f8",
    "client_id": "39...a8",
    "endpoint": "https://onefuzz....azurewebsites.net",
    "features": []
}
  3. Using onefuzz pools list command: (This happens after I enter the given code to authenticate)
onefuzz pools list
Please login
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code ... to authenticate.
ERROR:cli:command failed: error: invalid_client
AADSTS7000218: The request body must contain the following parameter: 'client_assertion' or 'client_secret'.
Trace ID: ea...00
Correlation ID: 0b...df
Timestamp: 2021-09-29 17:52:34Z

@uday-infosec
Author

Also curious: the builds should have failed for version 3.0.0 if this kind of issue arises, right? The GitHub Actions pipeline had both the deployment and a verification step where the CLI calls the endpoint. Does it actually check that the response is exactly what is expected?

@ranweiler
Member

builds should have failed for version 3.0.0 if this kind of issues arise right?

Yes, our integration tests should catch this in some form, but I think this is a specific combination of issues that might not be covered.

IIUC, based on what you're reporting, you cannot authenticate using the device code flow. I don't know why that is (it has worked in our integration tests). But on top of that, confidential client auth (client ID + secret) is legitimately broken, and so you can't fall back on that, either.

For your 3.0.0 instance, for device code auth to work, I'd expect at least the following 3 things to be true:

  1. Your CLI config has NO client_secret value, but it does have a client_id
  2. The UUID value of your CLI config's client_id matches the UUID value of the onefuzz-cli app registration object in the same tenant
  3. In the manifest of the app registration for the instance, the preAuthorizedApplications property includes a well-formed reference to the same onefuzz-cli app that is identified via client_id in your CLI config.

So, based on the config snippet you shared above, I'd expect to see the following in the manifest of the app registration for your 3.0.0 instance:

	"preAuthorizedApplications": [
		{
			"appId": "39...a8",
			"permissionIds": [
				"<uuid-that-isnt-relevant>"
			]
		}
	],

Can you please confirm that?
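For reference, the three expectations above can be sketched as a small Python check. This is only an illustration under stated assumptions: the function name is hypothetical, the config and manifest are passed in as already-parsed JSON, and the UUIDs in the example are made up.

```python
from typing import Any, Dict, List


def device_code_preconditions(
    config: Dict[str, Any], cli_app_id: str, instance_manifest: Dict[str, Any]
) -> List[str]:
    """Return a list of unmet expectations; an empty list means all three hold."""
    problems: List[str] = []
    client_id = str(config.get("client_id") or "").lower()

    # 1. No client_secret, but a client_id is present.
    if config.get("client_secret"):
        problems.append("config has a client_secret; device code auth expects none")
    if not client_id:
        problems.append("config has no client_id")

    # 2. client_id matches the onefuzz-cli app registration's Application ID.
    if client_id and client_id != cli_app_id.lower():
        problems.append("client_id does not match the onefuzz-cli Application ID")

    # 3. The instance manifest pre-authorizes that same client_id.
    entries = instance_manifest.get("preAuthorizedApplications") or []
    app_ids = {str(entry.get("appId", "")).lower() for entry in entries}
    if not client_id or client_id not in app_ids:
        problems.append("client_id is not listed in preAuthorizedApplications")

    return problems


# Hypothetical values for illustration only.
cli_id = "11111111-1111-1111-1111-111111111111"
good_config = {"authority": "...", "client_id": cli_id, "endpoint": "...", "features": []}
manifest = {"preAuthorizedApplications": [{"appId": cli_id, "permissionIds": ["..."]}]}
print(device_code_preconditions(good_config, cli_id, manifest))
```

An empty list only means these three expectations hold; it does not by itself show that authentication will succeed.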

@chkeita
Contributor

chkeita commented Sep 29, 2021

@stmh-infosec, regarding Issue 1: the error message comes from one of our dependencies. The exception behind this message is handled internally but logged by default, so your deployment is successful even in the presence of that error message.
I created #1304 to disable this logging.

@ghost

ghost commented Oct 9, 2021

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

@ghost ghost closed this as completed Oct 12, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Nov 11, 2021