Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redshift: add properties, and ability to set secret name, for DatabaseSecret and User to be valid Redshift airflow/connections/conn_id #26245

Closed
1 of 2 tasks
ann8ty opened this issue Jul 5, 2023 · 6 comments
Labels
@aws-cdk/aws-redshift Related to Amazon Redshift closing-soon This issue will automatically close in 4 days unless further comments are made. effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p2 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

Comments

@ann8ty
Copy link

ann8ty commented Jul 5, 2023

Describe the feature

The DatabaseSecret creates an incomplete secret for use with Apache Airflow because it is missing properties to connect, neither does it expose properties that could be used in CDK to correct the secret. If DatabaseSecret acted more like User (both in redshift_alpha) I could make do.

If both added more properties, I wouldn't have to correct either.

If I could set the secret name, then I wouldn't need to make a copy.

Use Case

Fix interop between AWS Services: Redshift, MWAA, Secrets Manager.

Create airflow/connections/conn_id by allowing secret name to be set, and add all required properties to connect.

In order to automate privilege granting to service accounts, the super user is required. I do this as a DAG from within airflow via the Redshift SQL Hook or Operator.

Today, I can copy the User secret to an properly formatted secret with Typescript CDK, but not the DatabaseSecret.

Without this feature I will need to either manually grant some initial permissions on each deploy, or write some Terraform to run after CDK.

Proposed Solution

A: Grant ability on DatabaseSecret and User to set Secret Name as airflow/connections/conn_id

B: Add all the properties for DatabaseSecret to be a valid MWAA airflow/connections/connection_id object

  • has: username, password
  • missing: dbname, engine, port, host
  • not required, but dbClusterIdentifier is nice too

C: Expose .secret on DatabaseSecret like redshift_alpha.User does so I can add the properties myself, for example, I do this for local airflow:

  let updatedSecretValue = new SecretValue(JSON.stringify({
      dbClusterIdentifier: teamMemberRedshiftUser.cluster.clusterName,
      password: teamMemberRedshiftUser.password.toString(),
      dbname: teamMemberRedshiftUser.databaseName,
      engine: "redshift",
      port: "5439",
      host: "host.docker.internal", // hostname = docker for use with redshift tunnel + local mwaa
      username: teamMemberRedshiftUser.username
  }))
 
  // make a copy of the secret as a redshift connection
  new Secret(this, `${appEnv}-${teamMember}-team-member-account-connection`, {
      secretName: `airflow/connections/redshift_${replaceDashesToUnderscores(appEnv)}_${teamMember}`,
      secretStringValue: updatedSecretValue
  })

Other Information

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

"@aws-cdk/aws-redshift-alpha": "^2.85.0-alpha.0"

Environment details (OS name and version, etc.)

Airflow 2.5.1, Mac Ventura 13.4.1 (22F82)

@ann8ty ann8ty added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jul 5, 2023
@github-actions github-actions bot added the @aws-cdk/aws-redshift Related to Amazon Redshift label Jul 5, 2023
@pahud pahud self-assigned this Jul 5, 2023
@pahud pahud changed the title redshift_alpha: add properties, and ability to set secret name, for DatabaseSecret and User to be valid Redshift airflow/connections/conn_id redshift: add properties, and ability to set secret name, for DatabaseSecret and User to be valid Redshift airflow/connections/conn_id Jul 5, 2023
@pahud
Copy link
Contributor

pahud commented Jul 5, 2023

I am not sure if it is necessary to store dbname, engine, port, host and dbClusterIdentifier all in the secret. If not, I think you can extend DatabaseSecret into a custom class like

export interface AirflowSecretProps extends DatabaseSecretProps {
     readonly dbname: string;
     readonly engine: string;
     readonly port: string;
     readonly host: string;
     readonly identifier: string;
}
export class AirflowSecret extends DatabaseSecret {
     readonly dbname: string;
     readonly engine: string;
     readonly port: string;
     readonly host: string;
     readonly identifier: string;
   constructor(scope: Construct, id: string, props: AirflowSecretProps) {
     super(scope, id, props);
     
     this.dbname = props.dbname;
     this.engine = props.engine;
     this.host = props.host;
     this.identifier = props.identifier;
     
     new DatabaseSecret();
   }
}

@pahud pahud added p2 and removed feature-request A feature should be added or improved. labels Jul 5, 2023
@pahud pahud removed their assignment Jul 5, 2023
@pahud pahud added effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 5, 2023
@ann8ty
Copy link
Author

ann8ty commented Jul 5, 2023

@pahud I like this idea and will experiment with it.

You are correct in that you don't need all of them, but you can have one set or the other, in order to uniquely identify the Redshift cluster as per https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/redshift.html#howto-connection-redshift

Option: create AirflowSecret before Redshift. DatabaseSecret doesn't have any cluster info yet, because cluster does not exist. Doesn't seem to work. The module would need to update the secret with connection info after redshift is created. Another issue here is that the password is not an available property and doesn't resolve.

I pass the credential to Redshift on create like this:

            masterUser: {
                masterUsername: props.warehouseOptions.username,
                masterPassword: props.redshiftBaseStack.warehouseSecret.secretValueFromJson('password'),
            },

but am not able to retrieve the password using the same code later for the secret. Just get unresolved-token.

@pahud
Copy link
Contributor

pahud commented Jul 10, 2023

Hi @ann8ty

I am not an expert of Airflow but wondering is it necessary to store the connection info in the Secret?

At this moment, the DatabaseSecrect used by the Redshift cluster is managed by the Cluster construct unless you pass props.masterUser.masterPassword. It is not possible for DatabaseSecrect object to store the connection info on instantiation because it is created in prior to the CfnCluster here and has no idea about the cluster. If the connection info really has to be in a secret, I feel it probably has to be a separate Secret generated by a custom resource or alternatively let the custom resource update the managed secret to add connection info in the payload. Another option is to create a CfnParameter to store the connection info after you provision the cluster with the Cluster construct.

Just wondering is storing connection info in the Secret really necessary because looks like it would be very challenging to work it around with the existing Cluster construct.

@pahud pahud added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jul 10, 2023
@github-actions
Copy link

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jul 12, 2023
@ann8ty
Copy link
Author

ann8ty commented Jul 12, 2023

Yes, it is necessary to store the secret in a connection, when using secrets manager as the secrets backend, so that a DAG that grants permissions can assume it. Yes, there are workarounds.

@ann8ty ann8ty closed this as completed Jul 12, 2023
@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-redshift Related to Amazon Redshift closing-soon This issue will automatically close in 4 days unless further comments are made. effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p2 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.
Projects
None yet
Development

No branches or pull requests

2 participants