Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pubsys: make Bottlerocket repos using cargo make repo #964

Merged
merged 1 commit into from
Jul 23, 2020

Conversation

zmrow
Copy link
Contributor

@zmrow zmrow commented Jul 2, 2020

Simplify repo creation and tie it into the existing cargo make build system
using a new repo target. This is intended to replace the common pattern of
calls to tuftool and updata - either creating a repo or extending an existing
one, adding the latest built artifacts, then updating the manifest to match.

Usage:
cargo make repo depends on the build target, so if your goal is building an
update for a repo you don't have to separately build it first. It uses the
same BUILDSYS_VARIANT and BUILDSYS_ARCH variables to determine the metadata
for the update added to the repo.

Requirements:

  • An Infra.toml file, based on Infra.toml.example, listing paths to keys,
    existing repos, etc.
  • Release.toml updated with a new version

Optional further configuration:

  • Repo expiration policy - by default, uses a policy file with 2 week target
    and snapshot expiration and 1 week timestamp expiration.
  • Wave policy - same policy files you give to updata today; defaults to
    "default" wave policy.
  • Release start time - when waves start and when expiration starts counting
    down; defaults to now.
  • Can select which named repo and signing key to use from Infra.toml.

Design decisions:

  • Built repos are written to a directory like
    /repo/bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d that includes variant,
    arch, and full version in the name, so that you can prepare repos for
    multiple releases in parallel if desired. Inside are target/ and metadata/
    directories that can be synced to your final repo location.
  • buildsys uses environment variables set by cargo-make; we opted instead for
    more standard arg parsing. It seems more likely that someone would use
    pubsys separately from cargo-make, and pubsys has more input information, so
    arg parsing was clearer.
  • cargo-make environment variable expansion is done in phases, and you can't
    refer to a variable defined in the same section if you intend to let the user
    override the earlier variable on the command line. If you do, the variable
    won't expand, as seen in
    make BUILDSYS_OUTPUT_DIR work with overrides #963. Because of this,
    until we figure out a better strategy, a couple of variables can't be
    overridden - the path to Release.toml (which we made a variable in this
    change) and the repo output directory.

Testing done:

  • Tested arguments to pubsys, including the optional release start time, and
    the --link-target argument not used by Makefile
  • Tested overriding all environment variables in Makefile.toml
  • Tested with file- and SSM-based keys
  • Tested building a new repo and then extending it, and tested starting from
    live Bottlerocket repo
  • Tested using the repo we built based on the Bottlerocket repo, adding a
    v0.4.1 update, and using it with an instance to successfully update
repo/
└── bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty
    ├── metadata
    │   ├── 1593646164.snapshot.json
    │   ├── 1593646164.targets.json
    │   ├── 1.root.json
    │   └── timestamp.json
    └── targets
        ├── 1c4024695070dcbc4af6660422f08f2b70e04fc309d40da1fe14d8af5935c2e7.bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty-root.verity.lz4 -> /home/user/bottlerocket/bottlerocket/build/images/x86_64-aws-k8s-1.15/bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty-root.verity.lz4                                      
        ├── 281141246fa284e932c16a015a0420c0d933a517f1ea3ca5cf4cca20a516f592.bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty-root.ext4.lz4 -> /home/user/bottlerocket/bottlerocket/build/images/x86_64-aws-k8s-1.15/bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty-root.ext4.lz4                                          
        ├── 70d6664cc5d33be9465377404a81ac91123f1aa08e0ea4d282d030482ff1af57.migrate_v0.3.2_admin-container-v0-5-0.lz4
        ├── 7a28a586ba12273372dfbd9d644d6b573c5ba517f10f1a69cae4356b3867872d.manifest.json
        └── 96cf685a8b8292a1a589e9124e0bddbbdad9b12634c252ce257491f078eff257.bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty-boot.ext4.lz4 -> /home/user/bottlerocket/bottlerocket/build/images/x86_64-aws-k8s-1.15/bottlerocket-aws-k8s-1.15-x86_64-0.4.1-5880e5d-dirty-boot.ext4.lz4                                          

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

Copy link
Contributor

@webern webern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 🍔

tools/pubsys/Cargo.toml Show resolved Hide resolved
tools/pubsys/src/repo.rs Show resolved Hide resolved
tools/pubsys/src/repo.rs Outdated Show resolved Hide resolved
tools/pubsys/src/repo.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@etungsten etungsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work!

@zmrow
Copy link
Contributor Author

zmrow commented Jul 7, 2020

Rebased on develop and fixed the Makefile.toml conflict

Makefile.toml Outdated Show resolved Hide resolved
tools/pubsys/Infra.toml.example Show resolved Hide resolved
tools/pubsys/Infra.toml.example Outdated Show resolved Hide resolved
tools/pubsys/Infra.toml.example Outdated Show resolved Hide resolved
Makefile.toml Show resolved Hide resolved
Makefile.toml Show resolved Hide resolved
tools/pubsys/src/config.rs Outdated Show resolved Hide resolved
tools/pubsys/src/repo.rs Show resolved Hide resolved
tools/pubsys/src/repo.rs Outdated Show resolved Hide resolved
tools/pubsys/src/repo.rs Outdated Show resolved Hide resolved
@tjkirch
Copy link
Contributor

tjkirch commented Jul 9, 2020

@zmrow and I addressed Ben's comments locally, but while testing them we realized something about the current structure that gave us pause - pubsys can't use one of its output repos as an input for a following run, because of the directory structure. We need to either decide that it's expected and move on as-is, or come up with a different directory structure that allows for it.

As to why it happens - pubsys writes out to a flat directory structure of /repo/bottlerocket-VARIANT-ARCH-VERSION/{metadata,targets}. This was done so that separate runs have separate results, while keeping them in an obvious location. However, the expected structure for an input repository (what you point to when you want to add to an existing repo) is more nested, METADATA_BASE_URL/VARIANT/ARCH and TARGETS_BASE_URL. This matches the Bottlerocket live repo structure, where you want to share targets rather than duplicate.

One option for a change would be to use the nested structure for pubsys output, too. The main difference from the current revision is that different pubsys runs would copy/link targets into the same directory, /repo/bottlerocket-VERSION/targets. That's different than our current process, which builds repos in entirely separate directories, but I think it's safe because it'd still be under a bottlerocket-VERSION directory (artifacts within a version should be consistent) and variants/arches don't share target names (consistent snapshots cause unique filenames for things like the manifest).

I think the tool can be a little opinionated about structure, rather than needing to have more required variables or a super flexible path system at the moment, so that seems OK to me...

@zmrow
Copy link
Contributor Author

zmrow commented Jul 9, 2020

I think I'd rather not have a unified /targets directory, even under a repo/bottlerocket-VERSION/targets folder. It feels a bit confusing to me and I think it could cause some problems and confusion when syncing the repo from your local machine elsewhere.

I'm wondering how often a user would want to use an outdir as the pubsys indir rather than just creating a new repository. Until we have a good answer to that question or an example of a solid use case, my vote is to leave pubsys as we currently have it. I'd rather get some more experience using it and an additional set of use cases before we make changes.

@bcressey
Copy link
Contributor

I think I'd rather not have a unified /targets directory, even under a repo/bottlerocket-VERSION/targets folder. It feels a bit confusing to me and I think it could cause some problems and confusion when syncing the repo from your local machine elsewhere.

Having the property where input, output, and remote structures match seems better to me. I'd like to be able to run aws s3 sync from a single directory to update metadata + targets for all variants of interest. It's easier to dry run that way; it's easier to step away for a moment and come back and not remember to do the other half of the sync.

We can split it out into two syncs for production use cases where it's important that targets go up before metadata, but otherwise it just seems more convenient.

We could also make it more convenient to use the repositories by modeling the shared prefix (https://updates.bottlerocket.aws/2020-02-02/) as a setting so that it's easy to override just one field in userdata to change both URLs.

@tjkirch
Copy link
Contributor

tjkirch commented Jul 10, 2020

We discussed the output repo structure offline, and decided for now to go with the plan to make the output structure match the input structure. It's clearer to have a single structure, and it's what Bottlerocket is using in production and what we'd recommend to others as well.

@zmrow
Copy link
Contributor Author

zmrow commented Jul 17, 2020

  • Address bcressey's comments above.
  • Per above discussion, make output structure under /repo match the input
    structure, with /variant/arch directories for metadata and a shared targets
    directory. This allows easily building on top of existing local repos.
  • Before, we would fail if Infra.toml contained URLs for the requested repo
    but tough couldn't load a repo from those URLs. Now, with nested repo
    paths, the metadata_base_url can represent multiple repos, for different
    variants and arches. It's reasonable to configure the URLs in Infra.toml
    and add more variants/arches later, so we can't fail if the URLs are already
    set. (We still fail if there's something at the URLs that we can't load.)
  • Add a RepoTransport to accomplish the above.
  • Simplified editor/manifest construction code; we didn't need the nested
    function or other complication with generics because of RepoTransport.
  • Now we fail if we load an existing repo and it doesn't have a manifest.json.
  • Copy/link targets before writing metadata so we don't have invalid metadata
    if targets fail
  • Restructure the new variables we added to more easily accomodate the repo paths.
  • Add /repo/latest symlink pointing to the latest built repo, like
    /repo/bottlerocket-0.4.0-35e0b5a.
  • Add .gitignore entries for the new paths.

Note: before merging, this requires a new tough release to get the changes to
copy/link_targets. We'll need to rebuild Cargo.lock in pubsys to match.

@zmrow
Copy link
Contributor Author

zmrow commented Jul 17, 2020

Rebased and fixed trivial conflicts

.gitignore Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
Makefile.toml Show resolved Hide resolved
Makefile.toml Outdated Show resolved Hide resolved
Makefile.toml Outdated Show resolved Hide resolved
Makefile.toml Outdated Show resolved Hide resolved
@tjkirch
Copy link
Contributor

tjkirch commented Jul 20, 2020

^ This force-push was just a rebase with no conflicts; fixes for comments are coming soon.

@tjkirch
Copy link
Contributor

tjkirch commented Jul 20, 2020

This push addresses most of @bcressey's comments:

  • Move gitignore of tools/bin/pubsys to tools/.gitignore
  • Move /repo to /build/repos
  • Remove gitignore of /repo since it's covered by existing gitignore of /build now
  • Only fetch once for tools workspace
  • Use relative link for latest to make clear it's in the same directory
  • Add clean step for repos
  • Fix variant names for PathExists to match update in Allow control of link/copy behavior for existing paths awslabs/tough#187

@tjkirch
Copy link
Contributor

tjkirch commented Jul 22, 2020

^ This push updates variable names in Makefile.toml with a PUBLISH_ prefix, per above discussion with @bcressey.

@tjkirch tjkirch requested a review from bcressey July 22, 2020 17:34
Simplify repo creation and tie it into the existing `cargo make` build system
using a new `repo` target.  This is intended to replace the common pattern of
calls to tuftool and updata - either creating a repo or extending an existing
one, adding the latest built artifacts, then updating the manifest to match.

**Usage:**
`cargo make repo` depends on the `build` target, so if your goal is building an
update for a repo you don't have to separately build it first.  It uses the
same `BUILDSYS_VARIANT` and `BUILDSYS_ARCH` variables to determine the metadata
for the update added to the repo.

**Requirements:**
* An Infra.toml file, based on Infra.toml.example, listing paths to keys,
  existing repos, etc.
* Release.toml updated with a new version

**Optional further configuration:**
* Repo expiration policy - by default, uses a policy file with 2 week target
  and snapshot expiration and 1 week timestamp expiration.
* Wave policy - same policy files you give to updata today; defaults to
  "default" wave policy.
* Release start time - when waves start and when expiration starts counting
  down; defaults to now.
* Can select which named repo and signing key to use from Infra.toml.

**Design decisions:**
* Built repo metadata is written to a directory like
  /build/repos/bottlerocket-0.4.1-5880e5d/aws-k8s-1.15/x86_64 so that you can
  prepare repos for multiple releases in parallel.  Targets are written to
  a shared directory like /build/repos/bottlerocket-0.4.1-5880e5d/targets -
  they're unique across variants and arches so there's no conflict.  The
  directory structure as a whole can be synced to your final repo
  location; it's the structure expected by Bottlerocket and updog.
* buildsys uses environment variables set by cargo-make; we opted instead for
  more standard arg parsing.  It seems more likely that someone would use
  pubsys separately from cargo-make, and pubsys has more input information, so
  arg parsing was clearer.
* cargo-make environment variable expansion is done in phases, and you can't
  refer to a variable defined in the same section if you intend to let the user
  override the earlier variable on the command line.  If you do, the variable
  won't expand, as seen in
  bottlerocket-os#963.  Because of this,
  until we figure out a better strategy, a couple of variables can't be
  overridden - the path to Release.toml (which we made a variable in this
  change) and the repo output directory.

Co-authored-by: Zac Mrowicki <mrowicki@amazon.com>
Co-authored-by: Tom Kirchner <tjk@amazon.com>
@zmrow
Copy link
Contributor Author

zmrow commented Jul 23, 2020

This force-push updates the Cargo.lock to reflect the new tough version.

@tjkirch tjkirch merged commit 44654b1 into bottlerocket-os:develop Jul 23, 2020
@zmrow zmrow deleted the pubsys branch July 23, 2020 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants