Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: xz compression option #571

Merged
merged 1 commit into from
Jul 20, 2019

Conversation

DeeDeeG
Copy link
Contributor

@DeeDeeG DeeDeeG commented Jun 19, 2019

Pull Request Template:

Describe what you did

Add the option to download a .tar.xz tarball instead of a .tar.gz tarball

How you did it

  • in bin/n, check for an environment variable EXTENSION.

  • If EXTENSION is set, use this in tarball_url() to set the URL to end in (effectively) .tar."$EXTENSION" rather than .tar.gz...

  • AND also use this in install() to set the tar extraction flags to -Jx (for extracting an xz-compressed archive) if $EXTENSION evaluates to xz

Also changed an error message to now alert the user if they have given an invalid extension.

By default, if the environment variable EXTENSION is unset, the script will set it to "gz" to use gzip compression.

How to verify it doesn't effect the functionality of n

n 10 (or similar)...

(with and without EXTENSION environment variable set to xz)

If this solves an issue, please put issue in PR notes.

Nope, I went directly to a PR with this.

If this solves an issue, please include the output of issue that had problems and then the fixed output from the same command.

N/A

Squash any unnecessary commits to keep history clean as possible

  • [x]

Place description for the changelog in PR so we can tally all changes for any future release

  • Add option to download .tar.xz instead of .tar.gz, via environment variable "EXTENSION"

@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 19, 2019

I personally prefer downloading xz to save on network bandwidth, so that's why I made this PR. In case it's useful for other people as well.

I haven't squashed this yet, but if this will be merged, I wanted some feedback on whether the new error messages are needed, or if any of the changes to the script should be re-formatted/cleaned up somehow.

Happy to squash and force-push when it's ready to merge.

Also, I put together an alternate implementation of the error messages here: https://github.com/DeeDeeG/n/blob/5ca0a2a8ffa57b5bc06e1c82988dbd129d2feec5/bin/n#L564-L567

That branch has two distinct errors: One if the extension is totally invalid (not present in the "latest" directory) or just invalid for the version you asked for (0.8 and 0.9 didn't ship with xz-compressed tarballs, only gzip-compressed ones).

Let me know if you prefer that, and I can force-push that to the branch for this PR.

Thanks for any feedback on this.

@shadowspawn
Copy link
Collaborator

Reference: https://en.wikipedia.org/wiki/Xz

No mention of xz in issues prior to now, but it is smaller. I looked at nvm and nvs which both default to xz if they detect it is available, using different mechanisms for detection.

Still thinking about preferred behaviour... Automatic is nice so everyone gets benefit but means we need to get it right!

@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 20, 2019

FWIW xz is known as a slower algorithm vs. gzip. But I don't think these tarballs are big enough to make it noticeable. I think (for a machine that comfortably runs node, npm, etc.) we can expect that decompressing these tarballs takes less than a second, regardless of whether we use xz or gzip.

Theoretically there is a trade-off, but practically I think xz is an improvement for little to no cost.

The other thing to think about, IMO, is if all platforms that have a gzip decompressor also have an xz decompressor. On debian/Ubuntu this is xz-utils. It's installed by default even on some slim/minimal debian images I have used (from Docker). I have no idea about Alpine, Arch, etc. or macOS.

Seems like a good idea to try to deduce xz support. Perhaps run a tar -Jx, and detect if that fails to infer xz decompression support.

@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 20, 2019

We already have is_ok() to check if a given URL returns HTTP 200 (i.e. the resource at that URL exists on the server). I would re-use that to check if [path]/[tarball].xz exists.

EDIT: But if a better way can be learned from nvm, nvs, etc then I don't object to that.

@shadowspawn
Copy link
Collaborator

shadowspawn commented Jun 22, 2019

Eventually, I would like to make the use of xz automatic. And I didn't have it on my wish list until you raised it, thanks. But that is a riskier change and there other things I want to do first (especially improve error handling, and switch to using index.tab), so I don't want to look at automatic xz yet.

So what about opt-in xz now, like this Pull Request? Well, maybe!

  1. I think it is more appropriate to specifically support xz than to allow specifying an arbitrary extension when there aren't any others.
  2. I think it is important longterm to use environment variables names that are less likely to conflict with other packages. (Sadly, you don't get any examples in n but you can see I did that in nvh!)
  3. I don't want to change the program flow or the error messages as part of xz. That is a whole separate challenge.

So if you want to rework your Pull Request to test for (say) N_USE_XZ and switch the tarball extension and tar flags based on that, I'll take another look?

@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 22, 2019

N_USE_XZ

sounds good. I did think EXTENSION was a bit too vague and general, but nothing else came to mind and I wanted to get the idea down first and refine later.

@DeeDeeG DeeDeeG force-pushed the xz-compression-option branch from 8423228 to e7c4cdb Compare June 23, 2019 13:51
To download an xz-compressed tarball, set the environment variable
"N_USE_XZ" to be non-empty before running n.
@DeeDeeG DeeDeeG force-pushed the xz-compression-option branch from e7c4cdb to 41400d6 Compare June 23, 2019 14:07
@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 23, 2019

A quick note: Without more-sophisticated error mechanisms, this will simply give

Error: invalid version '0.10.41'

for everything below (earlier than) node 0.10.42 when using xz. (Whereas n still supports all the way back to 0.8.6 if using the default gzip-compressed tarballs.)

@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 23, 2019

I made two versions of this feature to respond to feedback. Both have restored the error messages to be the same as before this PR. And both versions use the environment variable N_USE_XZ instead of EXTENSION.

  • The first implementation I have made is already pushed to this PR's branch. It tries to mimic the style of surrounding lines in the script. However, it arguably still changes the program flow a bit.

  • The second (simpler/minimal diff) implementation I made for this feature is here: DeeDeeG@07d57b9

It uses two if-then-else constructions to preserve the exact lines that would be run from before this PR if n_USE_XZ is unset. Other than putting two existing lines inside of if-then-elses, program flow is un-altered with this implementation.

I am open to feedback on either approach.

@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Jun 25, 2019

I suppose the tar flag could be reduced to simply -x ("extract mode"). This way tar doesn't care the compression format of the archive you give it; It will extract regardless.

Pros:

  • Simpler implementation for n
  • No need for branching paths in install()

Cons:

  • Trusts tarball_url() slightly more to pick a reasonable tarball (doesn't double-check that we have the expected tarball compression format)

Neutral:

  • Uploaders at nodejs.org could trick us into extracting a tarball with a given extension (e.g. .tar.gz) that had actually been compressed with another algorithm (eg actually an xz-compressed tarball). Not sure if/why this would be a problem?

@shadowspawn
Copy link
Collaborator

I prefer first implementation (in this Pull Request). Thanks for preparing two implementations. :-)

I wondered about plain -x. I think for backwards compatibility, safest to stick with symmetrical -J for now. I did a bit of reading and didn't find good coverage on how long ago auto -x got added or how well supported it is compared with -J. I was slightly alarmed that -J not documented for tar on macOS X, but it works at least on Mojave!

@shadowspawn
Copy link
Collaborator

(I laughed when I saw you added 👀 to my comment. I have been doing that as a quiet way of indicating I saw someones comment, and you are the first person to do it back. 👍)

@shadowspawn shadowspawn merged commit fe5f808 into tj:develop Jul 20, 2019
@shadowspawn
Copy link
Collaborator

Shipped in v5.0.0

Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants