Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 32-bit Windows releases so they are the correct architecture #1475

Merged
merged 16 commits into from
Jul 29, 2024

Conversation

EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented Jul 29, 2024

This fixes #1472 so that the 32-bit Windows releases contain 32-bit binaries rather than 64-bit binaries, makes it so pushing a version tag automatically triggers the release workflow, and somewhat streamlines and clarifies the workflow.

Further improvement is possible, including by incorporating some of the other recent changes from the ripgrep release workflow on which it was based. This is something I would be interested to do separately but that I think is not naturally within the scope of these changes.

At least in principle, the workflow can also at this point be extended easily for more build targets, so long as they are otherwise able to build and so long as either (a) they can be built with cross-compilation using cross on a Linux-based host or (b) they do not require cross. However, if adding more Linux-based targets, it would be a good idea to generalize the strip logic, or the resulting binaries would have debug symbols even though they would be optimized builds.

There are some more details in the commit messages. This includes some approaches I tried but did not keep, when they were entangled with the other changes. This information about the journey may be useful because the actual workflow runs I did for testing are not currently public. As discussed in #1472, I used a private reupload to avoid notifying people in their feeds of something the might misunderstand as an actual prerelease of gitoxide. However, I'd be pleased to republish from my public fork with distinctive tag names (e.g., with do-not-use in them), provide you access to the private repository, make the private repository at least temporarily public, or any combination of those things.

For verification, and to demonstrate that the bug does not occur when the release made by the revised workflow is used, I ran these commands on Windows, which correspond to those in "Demonstration checking binaries in all archives of a release" in #1472, except that they will not currently work for anyone but me because they refer to a private repository to which no one else currently has access:

mkdir tmp
cd tmp
gh release -R EliahKagan/private-gitoxide download v0.38.0-alpha.5
gci | %{ 7z x $_ }
gci *.tar | %{ 7z x $_ }
file */gix* */ein*
file */gix* */ein* | sls -raw i686

The last four commands are the same. The full output is in this second file in the related gist. The output just of the last command is:

gitoxide-lean-v0.38.0-alpha.5-i686-pc-windows-msvc/gix.exe:        PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-max-pure-v0.38.0-alpha.5-i686-pc-windows-msvc/gix.exe:    PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-max-v0.38.0-alpha.5-i686-pc-windows-msvc/gix.exe:         PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-small-v0.38.0-alpha.5-i686-pc-windows-msvc/gix.exe:       PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-lean-v0.38.0-alpha.5-i686-pc-windows-msvc/ein.exe:        PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-max-pure-v0.38.0-alpha.5-i686-pc-windows-msvc/ein.exe:    PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-max-v0.38.0-alpha.5-i686-pc-windows-msvc/ein.exe:         PE32 executable (console) Intel 80386, for MS Windows, 4 sections
gitoxide-small-v0.38.0-alpha.5-i686-pc-windows-msvc/ein.exe:       PE32 executable (console) Intel 80386, for MS Windows, 4 sections

Note that these are now 32-bit executables. The other Windows builds remain 64-bit.

Edit: I have also manually tested the binaries from the gitoxide-max-v0.38.0-alpha.5-i686-pc-windows-msvc.zip test release to ensure that they run on a 32-bit Windows system and that I can use the gix executable to clone the gitoxide repository via SSH.

The way environment variables are being set and accessed assumes
a Bourne-style (POSIX-like) shell, but PowerShell was being used on
Windows for some steps where this was being assumed.

It's possible to set bash only for some steps, but that is more
complex to understand than using it for everything.
This removes the `EXE_NAME` variable in the release workflow,
replacing all uses of it with an literal `ein`, which was always
its value.

The rationale is that this variable was confusing in two ways:

- The release workflow builds and publishes two executables, for
  both the ein and gix commands. The executable for the ein
  command was specified through this environment variable, while
  the executable for the gix command was already given literally.
  Besides improving consistency, removing the environment variable
  makes clear what is actually being specified.

- The name of the executable that provides the ein command is `ein`
  on most operating systems but `ein.exe` on Windows. But the
  `EXE_NAME` variable always held the value `ein` on all systems,
  with a `.exe` suffix concatenated to it on Windows. This was
  unintuitive because `EXE_NAME` didn't always name an executable.
Changing the shell to bash *may* have been sufficient to cause
these directories to be used by successfully allowing environment
variables to be set and accessed using POSIX shell syntax. If so,
then this change together with that one *may* be enough either to
make it so that the 32-bit release archives really get 32-bit
executables on Windows or to reveal further reasons they do not.
The release.yml workflow adapts from the corresponding workflow in
the ripgrep project, but drawing from multiple revisions of it and
also including substantial gitoxide-specific changes.

ripgrep's licensing permits this even with respect to any material
that is original to that project, since one of the licenses ripgrep
is offered under is the Unlicense, which allows arbitrary use and
adaptation without requiring even attribution.

This commit makes these changes related to that relationship:

- Note the relationship, linking to the ripgrep workflow.

- Bring in the new ripgrep comment explaining the workflow, which,
  as there, is shorter and just above the `create-release` job, not
  at the top. This makes a slight change, so as not to refer to a
  version tag scheme that differs and that we do not enforce.

- Renames both *_VERSION variables to just `VERSION`, as they now
  are in the ripgrep workflow. These variables have conceptually
  the same meaning, always take the same value, and do not clash
  because each job in the workflow has exactly one of them.

- Bring in the simplification of using `github.ref_name` to get the
  tag name to use as the version string.

There remain other significant changes to that workflow that can
and should be applied or adapted into this one, but which are not
included in this commit.

This commit also makes some other changes for simplification and
stylistic clarity:

- Change the trigger tag pattern to the glob-like `v*`, which seems
  to be what the GHA docs say will make it work (though this may
  have been tried unsuccessfully before).

- Don't specify the fetch depth, because `actions-checkout`
  defaults to a depth of 1 already.

- Remove ad-hoc echos that inadvertently performed empty parameter
  expansion because they used a variable that had just been written
  in `$GITHUB_ENV`. Writes to `$GITHUB_ENV` only affect the
  environment of subsequent steps, not of subsequent commands run
  as part of the same script step. These could have been shown in a
  different way, or by expanding them straightforwardly but in a
  subsequent script step (like the existing "Show command used for
  Cargo" step), but the removed echos did not seem very important
  and the information they were showing is mostly available by
  detailed log inspection.

- Simplify some commands whose complexity was just to support those
  ineffective echos.

- Use the bash `$(< path)` syntax where applicable. This is a more
  efficient alternative to `$(cat path)`. It's a bash-ism, but bash
  is now being used to run all steps on all platforms.

- Where both (a) quoting was clearly unnecessary because of the way
  YAML parses strings and (b) quoting was being used in some places
  and not analogous other places, so the style was inconsistent,
  this removes the quoting. (This does not remove quoting in other
  situations, not even when one but not the other condition holds.)

- When quoting inline, this always uses the strongest form, of
  those that are readily available and easily readable. This
  applies both to the formation of YAML strings, which now prefer
  single quotes to double quotes, and to quotation marks that
  appear inside YAML strings that are executed by the shell. Now
  double quotes are only used when both (a) quoting for a shell
  rather than the YAML parser and (b) deliberately causing the
  shell to perform its own `$...` expansions.

  This is because double-quoted `${{ }}` interpolation can falsely
  appear to be a shell expansion, when really all `${{ }}`
  interpolation is done after YAML parsing but before the shell
  receives the code that it runs in a script step.

- Double-quote `$GITHUB_ENV` (rather than not quoting it), so all
  shell parameter expansion is double quoted except where expansion
  that quoting would suppress is wanted (currently, nowhere).

- When shell quoting is optional because the item being `${{ }}`
  interpolated is guaranteed to be parsed by the shell as a single
  argument, and where we *rely* on that effect, this quotes the
  argument. This is done more for consistency, in view of how it
  was already mostly being done, than robustness. It is still not
  truly robust because, although we never set such values,  a `'`
  character, if present in the output of interpolation, would be
  (like all output of `${{ }}` interpolation) present literally in
  the script that the shell receives, and the shell would interpret
  it as a closing quotation mark.

- Adjust spacing for consistency with the existing style.
This adds the `if:` key from the ripgrep workflow to the
"Use Cross" step, so that we only use `cross` instead of `cargo`
when both of the following hold:

(a) We are on a Linux runner.
(b) It is possible that we are cross-compiling.

Of those conditions, (a) is more important than (b), since `cross`
can be used even when not cross-compiling. However, on Windows, it
fails to create the environment, because there is no Docker image
available for it, and also it is a Windows environment so Windows
containers would have to be used, which are not available. This
would also not currently be able to work on Windows hosted GHA
runners, because they don't support nested virtualization.

Although using `cross` on Windows in this workflow had never worked
properly, there were two different incorrect behaviors:

- It was previously not actually being used at all, because
  PowerShell rather than bash was being used as the shell, and the
  attempt to use syntax like `>> $GITHUB_ENV` in PowerShell did not
  succeed at setting the `CARGO` environment variable (nor others).

  Although this is *conceptually* related to the bug where the
  32-bit (i686) Windows release archives actually contained 64-bit
  binaries (GitoxideLabs#1472), it does not seem to have been the direct cause,
  since cross-compiling with `cargo` should have worked. But the
  related failure to set the `$TARGET_FLAGS` and `$TARGET_DIR`
  environment variables was an important factor.
  This was one factor responsible for the bug where the 32-bit

- With bash used for all script steps on all operating systems, it
  failed early with an error from `docker` about not finding an
  image on ghcr.io corresponding to the target.

For some further details, see:
GitoxideLabs#1472 (comment)
This adds that step from the revised ripgrep workflow, moving the
commands into it. The new step is the same as in ripgrep except:

- Without `shell: bash`, because we set that for the whole job
  and do so on all operating systems.

- With a different quoting convention.

The related change of taking the executables from the specific
target directory has already been made. This is part of what is
needed to make that actually work.
All our jobs have matrix.target set, so that check isn't needed.
The intent was to avoid using `cross` when not cross-compiling, but
treating an empty value to mean native compilation wouldn't have
had any effect.

This also does some preparation for related forthcoming changes,
moving the job-level `env` below `strategy` (so it remains easy to
understand even once it gains matrix-influenced values) and fixing
up some wording to reflect that we don't currently build for any
bigendian architectures such as s390x.
Since they are produced via `${{ }}` interpolation using values
from the matrix.

Matrix values cannot be formed from job-level `env`, but job-level
`env` can be formed form matrix values, as this does.
Because `cargo` can also use this environment variable.
This uses parameter expansion, and sometimes also brace expansion,
to make some commands that use environment variables shorter and
easier to read.

I did not apply brace expansion in the strip command run in docker,
since the shell running that may be sh rather than bash.
Unlike many utilities, the `strip` utility is not required by
POSIX to follow the XBD 12.2 Utility Syntax Guidelines, so need not
support `--`. On macOS, and possibly some other systems, it does
not, and `--` causes an error. Fortunately, it's okay to just not
pass it, because the value of `$TARGET_DIR` is known not to start
with `-`.
Because technically even the "non-cross" compilation is cross
compilation, as it is targeting `x86_64-unknown-linux-musl`.

When `cargo` rather than `cross` is used in that build, `ring`
fails to compile C code due to the `musl-gcc` command not being
present. Very likely other dependencies would also fail without
more tools installed on the system, and future features we wish to
support in binary releases would run into this in further ways.

An alternative approach here could be installing `musl-gcc` in a
prior step, much as we could also install an ARM gcc suite and
probably eliminate `cross` for linux-arm. For now, though, I'm
putting this back to using `cross` for all Linux builds.
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for digging in for a fix, it's much appreciated, particularly because I really don't like working on this 😅.

Further improvement is possible, including by incorporating some of the other recent changes from the ripgrep release workflow on which it was based. This is something I would be interested to do separately but that I think is not naturally within the scope of these changes.

I can only encourage to keep the flow going if you think it makes sense now, and if it interests you. Otherwise, it's probably just as good to keep it stable now and hope it doesn't break for a long time.

At least in principle, the workflow can also at this point be extended easily for more build targets, so long as they are otherwise able to build and so long as either (a) they can be built with cross-compilation using cross on a Linux-based host or (b) they do not require cross. However, if adding more Linux-based targets, it would be a good idea to generalize the strip logic, or the resulting binaries would have debug symbols even though they would be optimized builds.

By now, it's possible to specify strip = true in the Cargo.toml manifest directly, so cargo will perform the stripping of release binaries. Thus far I refrained from doing that because when profiling release builds, symbols are a necessity. However, maybe it's possible to define a specific profile configuration that performs no stripping but otherwise is equal to the release build which does perform stripping. If that's possible, I think one could just use cargo to strip symbols which should be cross-platform.
But when looking at the workflow, I could imagine that within cross it wouldn't work after all as it would have to be able to find strangely named platform-dependent strip-binaries.

And I a particularly looking forward to seeing the auto-release work again. Now that I see the regex change I wonder how I could ever think that v.* should work at all, oh, right, I must have thought it's full regex, even though it's only globs.

In any case, I will be looking forward to seeing this workflow run automatically the next time a release happens :).

.github/workflows/release.yml Show resolved Hide resolved
@Byron Byron merged commit ca5de4a into GitoxideLabs:main Jul 29, 2024
19 checks passed
@EliahKagan EliahKagan deleted the w32 branch July 29, 2024 08:08
@Skgland
Copy link

Skgland commented Jul 29, 2024

By now, it's possible to specify strip = true in the Cargo.toml manifest directly, so cargo will perform the stripping of release binaries. Thus far I refrained from doing that because when profiling release builds, symbols are a necessity. However, maybe it's possible to define a specific profile configuration that performs no stripping but otherwise is equal to the release build which does perform stripping. If that's possible, I think one could just use cargo to strip symbols which should be cross-platform. But when looking at the workflow, I could imagine that within cross it wouldn't work after all as it would have to be able to find strangely named platform-dependent strip-binaries.

That sounds like you are looking for a custom profile inheriting from release and overriding strip https://doc.rust-lang.org/cargo/reference/profiles.html#custom-profiles

@Byron
Copy link
Member

Byron commented Jul 29, 2024

Wow, thanks, that's pretty much spot-on! I will leave it to @EliahKagan to implement these (e.g. add a profile profile), as I am afraid that the strip option doesn't work if it has to invoke external programs. But maybe the compiler simply emits no names for functions then instead of stripping them out in an extra step. But definitely something to validate I suppose, despite learning towards trusting the rustc implementation always works.

@Skgland
Copy link

Skgland commented Jul 29, 2024

The strip setting is documented at https://doc.rust-lang.org/cargo/reference/profiles.html#strip and corresponds to the -C strip=val compiler flag documented at https://doc.rust-lang.org/rustc/codegen-options/index.html#strip.
The way that is worded sounds to me like rustc instructs the linker to strip the info while linking rather than invoking a separate strip tool. Though based on the tracking issue for -c strip=val mac OS is special as the linker doesn't support a flag to instruct it to stripping while linking.

@Byron
Copy link
Member

Byron commented Jul 29, 2024

Thanks for sharing your research - it truly looks like this would work on all platforms. MacOS being special shouldn't be a problem as no crosscompilation is going on there. Also, I'd think they will be able to handle differences between MacOS versions better than us in a bash script.

Great, maybe @EliahKagan can use the momentum and adjust the publishing so that stripping is left to cargo itself. I'd think that I can use a release-profile named profiling to get an unstripped release binary for when a profiling run is desired.

EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request Aug 10, 2024
- Trigger `msrv.yml` on the push event, for the same branches
  for which `ci.yml` is triggered on it. That is, make the
  corresponding change to the MSRV workflow that was made to the
  test workflow.

- Remove outdated comments on `msrv.yml`. These comments already
  disagreed with the code, since they named `master` when the
  branch was `main`, and suggested that all pull requests would
  trigger the workflow when really was (and continues to be) also
  limited by branch.

- Add `workflow_dispatch` to the `ci.yml` and `msrv.tml` workflows,
  so they can be manually run even for branches that are not named,
  and in the case of `ci.yml`, so it can be manually run even when
  changes are not made to any of the paths that are also needed
  to trigger it on the push or pull_request events.

- Put `workflow_dispatch` last in the `release.yml` workflow that
  already had it. Currently it is a secondary way to trigger all
  workflows that include it. It was in practice the primary way to
  trigger `release.yml` when the push trigger was broken, but that
  is no longer the case since 286e388 (GitoxideLabs#1475). Even in testing, it
  is now most often run by pushing a tag. This brings its style in
  line with the style in `cron.yml`, where `workflow_dispatch` was
  already a secondary way to trigger the workflow and listed second.
LuaKT pushed a commit to LMonitor/gitoxide that referenced this pull request Aug 20, 2024
- Trigger `msrv.yml` on the push event, for the same branches
  for which `ci.yml` is triggered on it. That is, make the
  corresponding change to the MSRV workflow that was made to the
  test workflow.

- Remove outdated comments on `msrv.yml`. These comments already
  disagreed with the code, since they named `master` when the
  branch was `main`, and suggested that all pull requests would
  trigger the workflow when really was (and continues to be) also
  limited by branch.

- Add `workflow_dispatch` to the `ci.yml` and `msrv.tml` workflows,
  so they can be manually run even for branches that are not named,
  and in the case of `ci.yml`, so it can be manually run even when
  changes are not made to any of the paths that are also needed
  to trigger it on the push or pull_request events.

- Put `workflow_dispatch` last in the `release.yml` workflow that
  already had it. Currently it is a secondary way to trigger all
  workflows that include it. It was in practice the primary way to
  trigger `release.yml` when the push trigger was broken, but that
  is no longer the case since 286e388 (GitoxideLabs#1475). Even in testing, it
  is now most often run by pushing a tag. This brings its style in
  line with the style in `cron.yml`, where `workflow_dispatch` was
  already a secondary way to trigger the workflow and listed second.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The 32-bit Windows releases are 64-bit
3 participants