Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The recommendation to "move" tags #214

Closed
friederbluemle opened this issue Nov 12, 2019 · 14 comments
Closed

The recommendation to "move" tags #214

friederbluemle opened this issue Nov 12, 2019 · 14 comments
Labels
question Further information is requested

Comments

@friederbluemle
Copy link

This is regarding the official Versioning documentation for authoring custom GitHub Actions.

I really had to read it several times to make sure that an "official" GitHub controlled repository is recommending to "move" tags between commits. 😨

Please, if you haven't already, read the section On Re-tagging in the official Git man page for the tag command. Specifically, "The insane thing" (point 2), and why "Git does not change tags behind users back".

These mechanisms are in place for a reason, and together with many other amazing things about Git, have contributed to the huge success and the way Git/GitHub have revolutionized version control and collaboration over the past decade. Please, don't try to circumvent them and recommend anti-patterns as best practices. This totally sends the wrong signal, especially to new users who are less familiar with Git. Do not force re-tag and force override already published tags.

All that said, most what is written in the Versioning documentation document makes sense, and is following best practices 👍 It seems like a solution that uses Git tags in the intended way with minimal change to what is already there could be:

  • Continue to use the master branch for ongoing development
  • Tag actual releases using standard semver tags (v1.0.0, v1.1.0)
  • Do not use "moving" tags
  • Use a branch v1 instead of releases/v1

A branch is the correct construct to point to different commits over time.

Example

Say v1 points to v1.0.0, and a new version v1.1.0 has been tagged. After successfully testing v1.1.0, v1 could simply be fast-forwarded to v1.1.0. There is no need to force anything, and from a user's perspective, everything remains the same, e.g. using an action would still look like this:

uses: actions/custom-action@v1
@bryanmacfarlane bryanmacfarlane added the question Further information is requested label Nov 13, 2019
@bryanmacfarlane
Copy link
Member

bryanmacfarlane commented Nov 13, 2019

Yes, this has had many discussions and even the actions/javascript-action repo has the walk through creating a v1 branch instead of moving a tag.

When we did that, there were other objections that it led to less opportunity to validate and "release" the action with an explicit action. Merging into the release branch led to it immediately being used by customers. As opposed to typically merging (potentially multiple) changes into a release branch, validating the integration and then having an explicit reaction action. The tag move is that. If the solution is a v1 branch, and you're integrating multiple changes from feature branches, it's still possible with an integration branch off of the release branch so it's still possible.

Completely open to change here - however I will note that the tag moving anti pattern covers coding multi user scenarios. From an action distribution branch perspective for actions, it's not an issue from the runner which simply gets a tar gz of where that ref currently points. So I think the reasons for the anti pattern don't apply ("others will have already seen the old tag"). In this case, that's exactly what we want, the last run "saw" the 1.x action and the next run sees what's currently the 1.x version. The runner throws away the action between runs. However, we do acknowledge that it feels icky (force) and perhaps v1 branch is a better approach (and thus why I tried it out on javascript-action).

Hope that helps on background.

@ericsciple and @thboop for interesting conversation.

Let's use this issue for a community discussion. I would love to get other feedback from the community.

@eine
Copy link

eine commented Nov 14, 2019

IMHO, the point is that refs (tags or branches) in a SCM system are not the natural way to distribute the artifacts of a project:

  • Even though many (most?) GitHub Actions are expected to be written in JavaScript, Actions are not JS libraries.
    • Users of the Actions do not need access to readable or modular JS sources.
    • Ideally, Action artifacts can be reduced to three files only (README.md, action.yml and main.js). This is true no matter the complexity of the JS (or TypeScript) project. See, for example, https://github.com/1138-4EB/actions/tree/release/v1.
  • GitHub Releases and GitHub Registry exist precisely because there are better approaches for distributing artifacts.

Artifacts in GitHub Registry cannot be overwritten, and recreating a GitHub release is not as easy as force-pushing a ref. Hence, I believe that all the documentation suggesting the usage of git refs (instead of properly releasing the artifacts) is because of a deficient implementation of semantic versioning parsing in GitHub Actions workflows. See actions/create-release#16.

This is kind of expected, since apparently someone at GitHub decided a deadline for the beta, without being really aware of how disruptive it was to change from v1 to v2 just a few months ago. Engineers are doing their best, but we have to deal with this annoying lack of better planification.

I think that priorities are:

Overall, I think that it is important for new users/developers to understand the difference between sources and artifacts. If that is clear, it is easy to understand that releases/* refs are not regular git tags/branches. Instead, those are just used to create URLs to tarballs.

For example, in https://github.com/actions/javascript-action:

  • Packaging (using a bundler) is recommended (https://github.com/actions/javascript-action#package-for-distribution), which is good.
  • The suggested approach is to add both the sources and the artifacts to a release branch (v1): https://github.com/actions/javascript-action#create-a-release-branch. This defeats the purpose of the previous step: "assembles the code into one file that can be checked in to Git, enabling fast and reliable execution and preventing the need to check in node_modules". I.e., users will download all the sources, even though subdir dist will be used only. As commented in Use a bundler to simplify the deployment numworks/setup-msys2#4, the difference for even the most simple Actions can be significant.
    • Even though dist exists in master, it is not included in any of the tags or the release branch.
  • There is no example workflow to automatically publish the artifacts when a tagged commit is pushed. A natural process would be users to push a tag, and have an action build it, test it and publish it.
    • This is possible, no matter if the target is GitHub Releases, GitHub Registry or a release branch.

For example, in https://github.com/1138-4EB/actions/blob/master/.github/workflows/push.yml a branch named release/vX is created and pushed when tagged commits are pushed. It allows developers to push semver compilant tags and forget about manually force-pushing. Of course, this is still contrary to git best practices. Instead of force-pushing tags, branches are being force-pushed. However, I think it is better for release/vX branches to have a single commit in the history, rather than keeping a broken history.

However, we do acknowledge that it feels icky (force) and perhaps v1 branch is a better approach (and thus why I tried it out on javascript-action).

I think that a safe procedure would be to strongly suggest users to only push full semver tags. Then, it is possible for a GitHub workflow, such as the example above, to create/overwrite either a branch or a tag. With this approach, there is no need to prepend release/, since no vX ref will be created manually.

Moreover, providing https://github.com/1138-4EB/actions/blob/master/.github/workflows/push.yml#L36-L52 as an official Action, would allow the developers at GitHub to easily support other distribution repositories in the future. Developers of actions provide the dir with the artifacts (dist), the tag, the token and the user/repo names. Then, it is up to GitHub engineers to push it to a branch, a tag, a Release or a Registry. I point to GitHub engineers, because such features depend on the parser/runner of YML workflows being able to fetch resources from not git repos only.

Ref actions/checkout#13

Say v1 points to v1.0.0, and a new version v1.1.0 has been tagged. After successfully testing v1.1.0, v1 could simply be fast-forwarded to v1.1.0. There is no need to force anything, and from a user's perspective, everything remains the same

Note that it is not guaranteed that all the tags are located in a linear history. If developers follow a workflow with two main branches (master and develop) as in https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow, there is no need to force anything. However, if a single main branch is used, hotfixes to v1.0 are not guaranteed to be merged back to master. In this context, a forced push of branch v1 would be required to change from v1.0.1 to v1.1.0.

@bryanmacfarlane
Copy link
Member

bryanmacfarlane commented Nov 14, 2019

👋 @1138-4eb Thanks for your feedback

I think we should focus this issue on the tag. However, I can at least shed some light on the motivations for some of the decisions.

GitHub Releases and GitHub Registry exist precisely because there are better approaches for distributing artifacts

Yes, there are plans and discussions going on to support GPR as a distribution mechanism. We discussed releases as an option but we're leaning toward GPR.

When we have GPR, it will support full semver. However, we should note that right now, the solution is tagging. A closer analogy would be tagging in dockerhub. e.g. node offers tags like 8.16.2, 8.16, and 8. In that model, 8 means the latest version of 8.

a safe procedure would be to strongly suggest users to only push full semver tags

It's safer for a source, package, build scenarios.

However, the hosted images (and all the dev tools installed on them), the tools distribution mechanisms (dotnet, node installs), the service itself and the runner all move forward. Holding one component of a runtime system back as everything else moves forward provides an illusion of safety but in practice it breaks down. In addition there's critical security fixes and breaks in tools that need reactions.

Azure also defaults to binding to version channels (not exact) and tool breaks happened where changes were needed and millions of builds continued to safely build when the images were updated. Exact binding would have led to build breaks en-masse. Another example would be a subtle change in the way a tool is distributed or its layout on disk (setup-node prepends the path to the bins). Users expect to add 14 to their list of 10 and 12 when 14 is released and they don't need to know when they add 14, they need to bump the action version. However, major version binding allows for action authors to make breaking changes in the inputs and behaviors.

SCM system are not the natural way to distribute the artifacts

The motivation here from product was to actually keep actions as simple dockerfiles and scripts and to use the GitHub graph. It has benefits such as just fork and use. Users can try out an action fix that's in a PR branch for validation etc. Push is the distribution. Yes, it has the consequence of the ts source files next to the js files (if you use ts) but in practice the extra few scripts from the tar ball the runner pulls (it does not clone the repo), is not an issue. The key issue is to avoid pulling all the dev dependencies (which are very large) and ncc (javascript-action example) solves that nicely. Yes, all that's actually needed to run is the action yml and a javascript entry point or dockerfile.

I hope that sheds some light on at least the motivation and reasoning for the decisions that were made. As I pointed out, there's discussions for GPR and all your feedback is appreciated. Thank you!

Let's keep this issue focused on for the current distribution mechanism and it's v1 tag (1.*)

@eine
Copy link

eine commented Nov 14, 2019

@bryanmacfarlane, thanks a lot for the insight!

It's safer for a source, package, build scenarios...

I was not suggesting to stop using major version tags; I just suggested that developers don't push them manually, but an action is used instead. Shall I open a PR in https://github.com/actions/javascript-action based on eine/actions@dcb3159 to discuss it?

The motivation here from product was to actually keep actions as simple dockerfiles and scripts and to use the GitHub graph.

This makes lots of sense. Unfortunately, it seems that most actions are being written in JS/TS, and not as dockerfiles+scripts. I believe we need to adapt to the fact that new generations feel more comfortable with JS than with bash...

Users can try out an action fix that's in a PR branch for validation etc. Push is the distribution.

I think that this is desirable, indeed. I am not suggesting to stop supporting tags/branches. I think that GPR should be supported along with tags/branches. The regular distribution procedure should be GPR, but for development git refs are a must.

Yes, it has the consequence of the ts source files next to the js files (if you use ts) (...) The key issue is to avoid pulling all the dev dependencies (which are very large) and ncc (javascript-action example) solves that nicely. Yes, all that's actually needed to run is the action yml and a javascript entry point or dockerfile.

I think that it is not a requirement to have the JS/TS sources next to the distribution JS files. This happens to whoever that clones https://github.com/actions/javascript-action just because that template is incomplete. The PR I suggest above is precisely to solve this. Half of the work is already done (using ncc).

Let's keep this issue focused on for the current distribution mechanism and it's v1 tag

Please, check my comments about https://github.com/actions/javascript-action above. I'm proposing to use branch release/v1 automatically. It is trivial to adapt it to use tag v1. Even if I mentioned GPR and GitHub Releases, I tried to comment several enhancements (to the docs and to the template) that can be applied now, which assume that git refs are going to be used still.

Regarding the discussion about tags vs branches. I think it deserves some clarification in the docs. As you said, it has had many discussions, and it is likely that it will keep coming up. Since this is a non-standard usage of git, a brief explanation can avoid new users being mislead. If you are ok with the PR, this explanation can be added to https://github.com/actions/javascript-action#create-a-release-branch.

@bryanmacfarlane
Copy link
Member

bryanmacfarlane commented Nov 14, 2019

The nice part about the @friederbluemle suggestion and what's done in the javascript-action template is the manual step is removed (move tag). To release you simply merge into the v1 branch. The PR is the release approval.

I was trying it out on the javascript-action first before updating guidance so I'm considering just updating the guidance.

We can take packing, actions and other discussions in other issues.

@ericsciple I think this means the v2 checkout action we're starting should just be coded in a topic branch, then merged to master upon CR and then when we're ready for release, a v2 branch.

@bryanmacfarlane
Copy link
Member

I will work on this today.

@ethomson
Copy link
Contributor

Tags remain useful here, since tags align with GitHub releases, and - most importantly - marketplace publishing is controlled by tags and releases.

Indeed moving tags is not a good practice in a source repository, but that's not what this is, we're using a repository here as a release mechanism. And a poor practice for source trees is not necessarily a poor practice for a release mechanism.

Now you could argue that we shouldn't be using a repository as a release mechanism (since it means checking in build artifacts and using tags in this way). And that's a fair argument - as @bryanmacfarlane mentioned, we're working here to explore other options. In the meantime, I think that we should keep this guidance as-is.

@friederbluemle
Copy link
Author

This is a great discussion. Thanks a lot for your input everyone!

It's interesting to see that actions/javascript-action is already using the branch based approach (I did not know this :)). Also, thanks @bryanmacfarlane for providing more details on the thinking behind the recommendation. The reasons given (making releasing a new version more "explicit" step) are consistent with what I assumed to be the case.

Thanks @1138-4eb for an amazingly detailed, and completely valid assessment of the underlying issue that Git refs (branch/tags) are (mis)used as a way to distribute artifacts from a project. Another thing not mentioned in the original question that struck me as odd was the fact that a comment in .gitignore suggests to temporarily (?) disable ignoring of node_modules/ and committing them to source control!


Note that it is not guaranteed that all the tags are located in a linear history. If developers follow a workflow with two main branches (master and develop) as in https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow, there is no need to force anything. However, if a single main branch is used, hotfixes to v1.0 are not guaranteed to be merged back to master. In this context, a forced push of branch v1 would be required to change from v1.0.1 to v1.1.0.

Yes, that's true and I'm aware of it. My example was simplified. As soon as older major versions needs to be supported concurrently, additional branches (e.g. v1-next, v2-next, ..) would be required. That is to allow testing prior to "release", assuming they don't merge back into the main branch. Unless I'm missing something it's still not necessary to force push branch v1. Because by semver definition, only major versions can break backwards compatibility, it should be relatively safe to assume that within one major release line (v1, v2, ...), we're looking at a linear history. In other words, if v1 branch points to v1.1.0 it would never be forced back to v1.0.1. Instead, the patch would be applied to v1.1.0, and v1 fast forwarded to v1.1.1.


An idea: If GitHub Actions yml config could natively parse version ranges similar to NPM, for example:

  • uses: actions/custom-action@1.0.0
    Resolves to tag v1.0.0 (only)
  • uses: actions/custom-action@~1.0.0
    Resolves to the highest available tag v1.0.0,v1.0.1,v1.0.2,etc (but not v1.1.0 or higher)
  • uses: actions/custom-action@^1.0.0
    Resolves to the highest available tag v1.0.0,v1.1.0,v1.2.0,etc (but not v2.0.0 or higher)

Since the current behavior of @v1 is just like @^1.0.0 it could still be supported for backwards compatibility.

We get the following benefits:

  • No more need to force move tags and overwrite published tags
  • No more need for a releases/v1 branch
  • We still have an "explicit action" to release by creating and pushing a tag

It feels much more natural and in line with the intended usage of Git tags.

In addition each user of an action has full control over how much "risk" they're willing to accept of the CI breaking without changing the code. In an ideal world where everyone strictly follows semantic version this it should never happen.

Of course, this does not address what was said regarding releasing/publishing an artifact directly from a source repo. I wonder if a separate issue should be created specifically to discuss GitHub releases/GPR?

@eine
Copy link

eine commented Nov 15, 2019

To release you simply merge into the v1 branch. The PR is the release approval.

This is also possible without putting sources and artifacts in the same branch.


Another thing not mentioned in the original question that struck me as odd was the fact that a comment in .gitignore suggests to temporarily (?) disable ignoring of node_modules/ and committing them to source control!

The procedure which is recommended in the docs does not only require to temporaly disable it, but it also requires to remove it and npm install or npm ci. Otherwise all the sources in node_modules are added, even if some are not required at all. By the same token, when some developer clones a commit which has node_modules included, the content is likely to be incomplete, because devdependencies should not be installed.

This is my main motivation to suggest:

Edit help.github.com/en/actions/automating-your-workflow-with-github-actions/creating-a-javascript-action#commit-and-push-your-action-to-github so that using a bundler is the suggested approach, and not an alternative.

The most relevant effect of this is precisely to avoid removing node_modules from .gitignore.

As commented, half of the work (using a bundler) is already done in https://github.com/actions/javascript-action. Section https://github.com/actions/javascript-action#create-a-release-branch needs to be fixed only.


Because by semver definition, only major versions can break backwards compatibility, it should be relatively safe to assume that within one major release line (v1, v2, ...), we're looking at a linear history.

You are correct.

An idea: If GitHub Actions yml config could natively parse version ranges similar to NPM, for example:

@ethomson commented in actions/create-release#16 that "something we're looking at soon". I added the reference and comment above just for contextualisation.

However, the discussion whether release commits should contain sources or artifacts only remains. Any possible branching/releasing model is conditioned by that.

@friederbluemle
Copy link
Author

friederbluemle commented Nov 15, 2019

Thanks for the additional clarifications @1138-4eb 👍

Do you think it makes sense to create a separate issue regarding the release/packaging method for artifacts, or do you prefer to keep everything together in this thread (and perhaps reword the issue title)?

@eine
Copy link

eine commented Nov 15, 2019

I think it is worth a separate issue. But I don't know if it should fit in this repo or in https://github.com/actions/javascript-action. That's why I asked about creating a PR to discuss it.

@bryanmacfarlane
Copy link
Member

Full semver will come with the using GPR to release actions. As @ethomson pointed out, the tag based approach is what we have now and needed by the marketplace. I will update javascript-action and create a PR to not create the v1 branch as it won't work with the market place. We also need to update the versioning doc to note the the releases/v1 branch is not needed until we need to break compat and create a v2. ncc offers the ability to not check in any node_modules which alleviates the immediate needs for the releases/v1 branch up front.

I will update those soon and then we can follow up with specific issues or questions on that.

I'm closing this specific issue as the details were laid out and the questions were answered. We should avoid one total encompassing issue on action distribution. It's good to have an issue with a specific topic and agenda.

@eregon
Copy link

eregon commented Feb 7, 2020

I wholly agree with the first post and I think the main point got unfortunately lost.

The documentation could just recommend a v1 branch.
There is nothing preventing that, is it?

That would work fine with the marketplace, since anyway the v1 tag or branch should not be published to the marketplace, only specific versions (e.g., 1.0.1), as already documented.

Having a release branch is a separate matter, one can still have a release/v2 branch or so.

Packaging is another story, but there is clearly value in a beautifully simple way to distribute code with users of an action (i.e., by pushing to a branch, not overriding a tag).

@myitcv
Copy link

myitcv commented Jan 13, 2022

Linking community/community#9847. I don't think branches are the right solution either. The references vN and vN.M are entirely well defined but existence dependent on valid semver versions vN.M.x existing. An approach that sees owners manually create tags/branches of that form is brittle and unnecessary.

(I will also note that in the context of (recent) software supply chain issues, the default recommendation should be to reference a commit, or some equivalent immutable reference to a released action's assets).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

6 participants