Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pipelines): revised version of the API #12326

Merged
merged 89 commits into from
Jul 15, 2021

Conversation

rix0rrr
Copy link
Contributor

@rix0rrr rix0rrr commented Jan 4, 2021

Add a new, modernized API to the pipelines library.

Advantages of the new API are:

  • Removes the need to interact with the underlying AWS CodePipeline library for Artifacts and Sources
  • A streamlined API for sources (more sensible defaults allowing you to specify less)
  • Synth classes hide less from you, allowing you more control and remove the need to decide whether or not to "eject" from the convenience classes of the original API
  • Supports parallel deployments (speeding up large pipelines)
  • Supports stages of >25 stacks
  • Supports multiple sources powering the build
  • Gives more control over the CodeBuild projects that get generated

In addition, by clearly separating out generic parts of the library from CodePipeline/CodeBuild-specific parts, allows easier development of construct libraries that target alternative deployment systems while reusing large parts of the logic of this library.

This does not remove or deprecate the old API, though starting today its use is discouraged in favor of the new API, which will see more development in the future.

Closes #10872.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@gitpod-io
Copy link

gitpod-io bot commented Jan 4, 2021

@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Jan 4, 2021
@mergify
Copy link
Contributor

mergify bot commented Jan 4, 2021

Title does not follow the guidelines of Conventional Commits. Please adjust title before merge.

@skinny85
Copy link
Contributor

skinny85 commented Jan 6, 2021

@rix0rrr I'm looking at this, and one thing that would definitely help me is seeing some tests - maybe the snapshot-style integ tests that we use? But anything you can get working quickly would be helpful.

I'm not interested in the assertions, I want to see how the library is meant to be used, and play with it myself 🙂.

@rix0rrr
Copy link
Contributor Author

rix0rrr commented Jan 11, 2021

@rix0rrr I'm looking at this, and one thing that would definitely help me is seeing some tests - maybe the snapshot-style integ tests that we use? But anything you can get working quickly would be helpful.

@skinny85 I added one.

Copy link
Contributor

@skinny85 skinny85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is quite huge, and I haven't processed all of it yet. But here's an early brain-dump of my thoughts as I learn this code 🙂.


I think the key here is to layer all of the pieces correctly. I also think deciding what dependencies does each layer have will be crucial.

Layer 0: the graph part

This is the foundation (BTW, I wonder whether "Flow network" wouldn't be a better abstraction for this than just graphs, with changes "flowing" through the network, starting with the sources. We don't need weights for the edges, but I think that's fine to not use that part).

I think it should have 0 dependencies, maybe some utility libraries. It could possibly depend on constructs, if there's something useful there, but it shouldn't depend on aws-cdk.

Layer 1: the frontend/backend part

This is where things like Source, Synth, etc. should live. And also the Backend interface.

It should depend on aws-cdk (probably only on @aws-cdk/core), on the graph part, and have a class like CdkDeploymentFlow (name TBD of course - I basically want to avoid using the term Pipeline here, to make it absolutely clear this is not tied to CodePipeline).

Layer 2: backend-specific APIs

This is where our CdkPipeline lives. I even wonder, although it might be crazy, whether we'll be able to keep the same API we have today for it, just implemented in terms of the lower layers. I understand this is a very, very ambitious ask though.

This will depend on the @aws-cdk/aws-codepipeline and @aws-cdk/aws-codepipeline-actions modules from CDK.


Given that, here's my comments to the current version:

  1. I would kick out the cloudformation actions from the graph part. I don't think they belong there. Perhaps we need a more generic "CDK deployment" concept in that layer 0? I think cloudformation actions split between "prepare" and "execute" make sense in layer
  2. I would make Backend an interface that classes implement. I wouldn't make it a union-like class - I think that sends the wrong message (it makes it seems like want to control how many backends there are, which I think we shouldn't do).
  3. I would make backend required in CdkDeploymentFlow (the one in Layer 1).
  4. Then, in Layer 2, I would have a different class that wraps CdkDeploymentFlow. Something like this:
export class CdkPipeline extends Construct {
  constructor(scope: Construct, id: string, props: CdkPipelineProps) {
    super(scope, id);

    this.cdkDeploymentFlow = new CdkDeploymentFlow({
      ...,
      backend: new CodePipelineBackend(),
    });
  }

  public addAppicationStage(stage: core.Stage, options: core.AddApplicationOptions): void {
    this.cdkDeploymentFlow.addDeployment(...);
  }

  protected onPrepare() {
    this.cdkDeploymentFlow.renderToBackend();
  }
}

And a similar one for CdkGitHubActions, etc. Basically, I don't think customers should be even aware of backends, and they shouldn't need to provide one themselves - the "wrapping" class should take care of it.

  1. I feel like having a default Token for authentication in GitHubSource is probably not what you want. How about we make it optional, and the backend will scream at you if you need to provide it? Then, we can have separate Layer 2 Source classes that just wrap the Layer 1 ones correspond to the appropriate CodePipeline sources, and which require this property? Probably using the cdk.SecretValue type?

That's all I got so far. I still feel like I've only scratched the surface of this change 🙂.

I also have no idea what happened with the build - the package actually builds fine on my machine (all of the tests pass)...


- How to specify multiple sources, and where they go in the build? Do we just add the concept
of Artifacts back into the user model, but optional?
- Similar for the sources for a shell script action.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub actions can also have multiple sources (kind of), through the checkout action.

Comment on lines 4 to 5
- How to specify multiple sources, and where they go in the build? Do we just add the concept
of Artifacts back into the user model, but optional?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably won't escape this. For example, GitHub Actions have artifacts too.

Comment on lines 7 to 10
- Specifying secrets for getting Git sources. Do we just say "secret name" and it's up to the backend
what that means for where secrets will live? (SecretsManager secret in AWS, Env var in GitHub Actions,
env var in Jenkins, etc?) Or potentially, do we lift the concept of "secrets manager" secret to the main
API and in GHA/Jenkins we have to do an API call using "current" creds to get it?
Copy link
Contributor

@skinny85 skinny85 Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we need our own notion of a Secret in the API on Layer 1, with multiple implementations, possibly depending on the backend (using cdk.SecretValue as a wrapper class, GitHub secrets, etc.).

Comment on lines 13 to 14
- Specifying images (ECR, DockerHub, auth...). Implies dependencies on AWS CDK libraries (`aws-ecr`, `aws-ecr-assets`).
How does an asset image even deploy for GitHub Actions?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we come up with our own interface for a Docker Image abstraction for this on Level 1? We'll support only the basics (public registries, plus optional credentials for docker login). Then, on Layer 2, we'll add a CDK-specific interface that depends on @aws-cdk/aws-ecr, etc.

Comment on lines 11 to 12
- CodeBuild Project tweaks like execution role permissions -- where do they go? It doesn't make sense for
anything other than AWS, but that's backend-specific and you'd like to specify this at the front end.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for these cases, we will need backend-specific types on Layer 2.

Comment on lines 16 to 18
- Rollout problem! How do we introduce this without breaking existing users? The "prime real estate name"
of `@aws-cdk/pipelines.CdkPipeline` is already taken. I guess we're going to have to either rename the
class (while deprecating the old classes), or rename the module.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a way to do this in a backwards-compatible way, although it will be hard. See my main comment 🙂.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to see how as we're making the classes much more implicit, and not dependent on built-in types anymore.

@rix0rrr
Copy link
Contributor Author

rix0rrr commented Jan 18, 2021

Regarding your comments on API organization:

I agree with most of what you're saying, on a technical basis, and I've been pondering the same things. At the moment I've decided not to do these because I've been optimizing for other things:

Layer 0/Layer 1/Layer 2

I see the purity and you're not wrong that these things are technically separate. Haven't done it at this point because I was optimizing for a small dependency footprint. Since (IMO) CDK Pipelines should be its own L3, I'd prefer it to be one dependency. Maybe that's optimizing for the wrong things, maybe it's a non-issue because of NPM7, maybe it's a non-issue because we can just build monopipelines using a build trick. But that's the reason they're together right now.

I wonder whether "Flow network" wouldn't be a better abstraction for this than just graphs [...] We don't need weights

This term is new to me, but from the linked Wikipedia article it seems that FlowNetwork ≡ Graph + Weights. If you want to use the abstraction FlowNetwork - Weights ≡ Graph + Weights - Weights, then I don't see how that's different from Graph.

If this is just about calling the graph "network" instead, then sure. I don't mind that.

I basically want to avoid using the term Pipeline here

To my mind, "CDK Pipelines" was the pluggable framework product. In your opinion, "CDK Pipelines" should mean using the pluggable framework with a CodePipeline backend.

From a marketing/comprehension point of view I think it's easier to explain:

  • CDK Pipelines is a deployment framework with pluggable backends such as CodePipeline, GitHub Actions and others.

Than it is to explain:

  • CDK Deployment Flow is a deployment framework with pluggable backends. CDK Pipelines is the CodePipeline backend for CDK Deployment Flow.
  • (or, alternatively) CDK Pipelines is an CI/CD mechanism using CodePipeline, built on top of CDK Deployment Flow. CDK Deployment Flow can also be used with other backends, such as GitHub Actions.

Maybe I'm being overly obtuse, and we just don't explain it at all and that might be fine too.

I would kick out the cloudformation actions from the graph part.

There are no specific CloudFormation actions in the graph part, or at least there shouldn't be.

What there does exist is currently a superset of all possible actions, and a GitHub source might translate into a cpactions.GitHubSourceAction for CodePipeline or a checkout step in GitHub Actions.

There might be something to be said for tying the sources to the backend, though, I do see the value in the simplicity that brings.

I feel like having a default Token for authentication in GitHubSource is probably not what you want

I actually feel like it is, from my experience of building these pipelines I have one token with a default name, regardless of the number of pipelines or repositories. What's the point of having to write it out every time? We can set a well-advertised standard here, and if you disagree you can always still specify the name. Notice how bleeding nice the interface becomes if we just set a number of (pretty reasonable) default expectations.

[Artifacts] I think we probably won't escape this

I guess so. We can still make them as optional as possible, though.

@gitpod-io
Copy link

gitpod-io bot commented May 27, 2021

Copy link
Contributor

@eladb eladb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks amazing. Love the new API!

Adding do-not-merge for last minute changes per your discretion.

packages/@aws-cdk/pipelines/README.md Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Outdated Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Outdated Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Outdated Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Outdated Show resolved Hide resolved
packages/@aws-cdk/pipelines/README.md Outdated Show resolved Hide resolved
@eladb eladb added pr/do-not-merge This PR should not be merged at this time. and removed pr-linter/exempt-test The PR linter will not require test changes labels Jul 14, 2021

const pipeline = new CodePipeline(this, 'Pipeline', {
synth: new ScriptStep('Synth', {
// Will use a SecretsManager secret named 'github-token' by default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason this does not use GitHubv2 access? Token-based auth is deprecated afaik.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gimme a link please? This is the first I'm hearing of this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@rix0rrr rix0rrr added the pr-linter/exempt-test The PR linter will not require test changes label Jul 14, 2021
@rix0rrr rix0rrr removed the pr/do-not-merge This PR should not be merged at this time. label Jul 15, 2021
@mergify
Copy link
Contributor

mergify bot commented Jul 15, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: 3b3e1c0
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify mergify bot merged commit 165ee3a into master Jul 15, 2021
@mergify mergify bot deleted the huijbers/tryout-pipelines-refactor branch July 15, 2021 10:07
@mergify
Copy link
Contributor

mergify bot commented Jul 15, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@twitu
Copy link

twitu commented Jul 15, 2021

🎉 Awesome feature release. I've been following the pipelines module development from sometime and I'm glad to have this much needed overhaul. Thanks for accommodating all the various feature requests.

TikiTDO pushed a commit to TikiTDO/aws-cdk that referenced this pull request Aug 3, 2021
Add a new, modernized API to the `pipelines` library.

Advantages of the new API are:

- Removes the need to interact with the underlying AWS CodePipeline library for `Artifacts` and `Sources`
- A streamlined API for sources (more sensible defaults allowing you to specify less)
- `Synth` classes hide less from you, allowing you more control and remove the need to decide whether or not to "eject" from the convenience classes of the original API
- Supports parallel deployments (speeding up large pipelines)
- Supports stages of >25 stacks
- Supports multiple sources powering the build
- Gives more control over the CodeBuild projects that get generated

In addition, by clearly separating out generic parts of the library from CodePipeline/CodeBuild-specific parts, allows easier development of construct libraries that target alternative deployment systems while reusing large parts of the logic of this library.

This does not remove or deprecate the old API, though starting today its use is discouraged in favor of the new API, which will see more development in the future.

Closes aws#10872.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
hollanddd pushed a commit to hollanddd/aws-cdk that referenced this pull request Aug 26, 2021
Add a new, modernized API to the `pipelines` library.

Advantages of the new API are:

- Removes the need to interact with the underlying AWS CodePipeline library for `Artifacts` and `Sources`
- A streamlined API for sources (more sensible defaults allowing you to specify less)
- `Synth` classes hide less from you, allowing you more control and remove the need to decide whether or not to "eject" from the convenience classes of the original API
- Supports parallel deployments (speeding up large pipelines)
- Supports stages of >25 stacks
- Supports multiple sources powering the build
- Gives more control over the CodeBuild projects that get generated

In addition, by clearly separating out generic parts of the library from CodePipeline/CodeBuild-specific parts, allows easier development of construct libraries that target alternative deployment systems while reusing large parts of the logic of this library.

This does not remove or deprecate the old API, though starting today its use is discouraged in favor of the new API, which will see more development in the future.

Closes aws#10872.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution/core This is a PR that came from AWS. pr-linter/exempt-test The PR linter will not require test changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[pipelines] Change to a single-source, single-build pipeline, deploy+approve model
7 participants