Usage of uncompressed tarballs #541
A few arguments from an internal discussion:
That said, we can't deny the speed improvement when extracting non-compressed tars, so there may be a reason to consider this feature.
+1
You can explicitly follow the git history on this directory to figure out which dependencies were upgraded and when, i.e. ...with quick and easy access to the backup. With shrinkpack, the diffs in GitHub closely reflect the commit message and the actual changes being made. Committing and pushing the result of a new shrinkpack is a better experience, IMO, than doing the same after a yarn pack because, as mentioned, changes are handled at the package-version level rather than the repository-version level. So you're only pushing up individual
@joncursi, we have an offline mirror feature that does what you want: https://yarnpkg.com/blog/2016/11/24/offline-mirror.
@bestander very cool, thank you for sharing that blog post. I didn't catch this capability when reading the CLI docs; it would be a lovely addition to https://yarnpkg.com/en/docs/cli/config.

I use shrinkpack locally in each project, rather than globally across multiple projects. I would like to do the same with yarn, which would require old tar files to be removed when packages are upgraded. I only care about maintaining the latest working version of each package; if I need to dig up an older package version, it's always there in the git history, but I don't need or want to store it directly in the mirror forever.

My use case is to implement the mirror less for offline purposes and more for maintaining a concise list of package backups in case packages are suddenly unpublished from npm. Risk control. As far as I know, that was largely the intent behind shrinkpack in the first place.

Is there a smarter way to automate package removal from the mirror when a new package version is added? Perhaps a config option in
Also, the same issue presents itself when removing a package from use in the repo entirely...
@joncursi, this is a bit off-topic for this issue; it would be better to open an RFC discussion of what is needed. As for the cleanup, it can be a 10-line JS/bash script you run alongside yarn until we implement it.
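To illustrate the kind of side-script meant here, the following is a hypothetical sketch (not a yarn feature): it assumes the mirror directory and lockfile paths are passed in, and that the `resolved` URLs in `yarn.lock` end with the tarball file names.

```shell
# Hypothetical cleanup helper: delete offline-mirror tarballs that are
# no longer referenced by yarn.lock.
prune_mirror() {
  mirror="$1"
  lockfile="$2"
  for tarball in "$mirror"/*.tgz; do
    [ -e "$tarball" ] || continue          # glob matched nothing
    name=$(basename "$tarball")
    # yarn.lock "resolved" entries contain the tarball file name
    if ! grep -qF "$name" "$lockfile"; then
      echo "pruning $name"
      rm -- "$tarball"
    fi
  done
}
```

You could run something like `prune_mirror npm-packages-offline-cache yarn.lock` after each `yarn upgrade` or `yarn remove` (directory name here is illustrative).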
This issue is specifically for switching from compressed (
From an implementation standpoint, what sort of risks and level of effort would you foresee in simply making this a flag that you can pass to the CLI? Shrinkpack is written so that uncompressed tarballs are the default, but you can opt into compressed packages with a flag. What would the impact be of simply implementing the inverse behavior (opt in to uncompressed with a flag)?

It seems like this would address the issue of potentially unpleasant changes for those already using the offline mirror to commit modules locally, while allowing the uncompressed behavior for those who don't mind aliasing a couple of yarn commands.

Edit: Even more simply, the flag could just be defined in the .yarnrc.

This is actually the main thing preventing us from switching to yarn, as it already admirably solves the determinism issue, and the offline mirror feature (thanks for the link, btw!) takes care of the rest. However, it leaves us with the undesirable (from our perspective) situation of committing binary packages. In our experience, Git does very well with plain tar, as most updated packages are recognized as renamed with tiny deltas, and the compression does all the rest. Thus, the actual bandwidth used is dramatically lower.
Yarn puts the same tarballs that it downloads from the registry into the offline mirror folder. To allow non-compressed tarballs you would need to unzip it first and then zip it again. Also, the tarballs have versions in the file names, so git won't be able to track version updates as small diffs.
You wouldn't need to unzip and then zip again; you'd simply need to decompress the tarball. The inner .tar can stay the same, it just won't be compressed.

Not sure about Git, but Mercurial tracks copied files, so it could track new versions of dependencies as copies of old ones if they're similar enough.
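This "decompress only" point can be checked with standard tools. A sketch with a made-up package name: `gunzip` strips only the gzip layer and leaves the inner tar byte-for-byte intact.

```shell
# Build a stand-in for a registry tarball (a gzip-compressed tar).
printf 'module.exports = {}\n' > index.js
tar -czf demo-1.0.0.tgz index.js

# Strip the gzip layer; gzip maps the .tgz suffix to .tar.
# -k keeps the original file (gzip 1.6+).
gunzip -k demo-1.0.0.tgz

# The result is the inner archive, untouched: decompressing the .tgz
# yields exactly the bytes of the .tar.
gzip -dc demo-1.0.0.tgz | cmp -s - demo-1.0.0.tar && echo "inner tar identical"
```

No repacking step is involved, so checksums of the archive contents are unaffected.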
Thanks, Daniel, good to know. Although someone would need to demonstrate this advanced Mercurial/Git tracking on a real example before we consider this change, right?
Hi @bestander, we use Git with Bitbucket and npm + shrinkwrap on some projects. Here is what it looks like when a minor version of the tar changes: [screenshot] Here are the sample tar files for the package from the screenshot that was tracked as renamed: [attachment] Thanks
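For Git specifically, the rename-tracking behaviour is easy to reproduce in a throwaway repo (a sketch with made-up file names; `-M` enables similarity-based rename detection):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo

# Commit version 1 of an uncompressed package tarball.
printf 'module.exports = 1\n' > index.js
tar -cf pkg-1.0.0.tar index.js
git add pkg-1.0.0.tar
git commit -qm 'pkg 1.0.0'

# Replace it with version 2 under a new file name, as a mirror update would.
printf 'module.exports = 2\n' > index.js
tar -cf pkg-1.0.1.tar index.js
git rm -q pkg-1.0.0.tar
git add pkg-1.0.1.tar
git commit -qm 'pkg 1.0.1'

# With similarity detection on, git reports the delete+add as a rename,
# since the two plain tars are nearly identical byte-wise.
git show --stat -M50% HEAD
```

The diffstat shows `pkg-1.0.0.tar => pkg-1.0.1.tar` rather than a full delete and add, which is the behaviour the screenshot above demonstrates.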
I've been meaning to test it out, but I just haven't had time to do so.
Hey there! It's been a while, and since you're busy, I thought I'd make this as painless as possible. Check out this shrinkpack tar proof of concept.
This seems like a reasonable idea after all. So how would it work?
Results: So if A + B > C + D, then why not?
Bumpity bump! I can work on this if you guys want?
@bfricka, of course, give it a try.
Something to consider as a future enhancement, post-launch
Some people may want to store tarballs of all their dependencies in their source control repository, for example if they want a fully repeatable/reproducible build that does not depend on npm's servers. Storing compressed tarballs in Git or Mercurial is generally bad news: every update to a package results in a new copy of the entire file in the repo, which can make the repo very large. Every time you clone the repo, the full history is transferred, including every previous version of all the packages, so even deleting the binary files has a lasting effect until you rewrite history to remove them.
Instead, we should try storing uncompressed tarballs (i.e. `.tar` files). Since the tar files are mostly plain text, in theory Git/Mercurial should be able to more easily diff changes to the files if a new version of a module is added while an old version is removed, and just store the delta rather than an entirely new blob.

Related: this was implemented in Shrinkpack: JamieMason/shrinkpack#40 and JamieMason/shrinkpack@7b2f341#comments. According to the comments on the commit, this actually sped up `npm install` when `shrinkpack` implemented it, as npm no longer needed to decompress the archive every time. This makes sense, since it removes the overhead of `gzip` from the installation time.
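The delta-friendliness argument can be illustrated numerically. This is a rough sketch (byte-level difference is only a proxy for what Git's delta compression achieves, and the file names are made up): change one line in a package, rebuild both archive forms, and count differing bytes.

```shell
# Two versions of a small package, archived both uncompressed and compressed.
mkdir -p pkg
seq 1 200 > pkg/index.js
tar -cf v1.tar pkg
gzip -c v1.tar > v1.tgz

echo '201' >> pkg/index.js
tar -cf v2.tar pkg
gzip -c v2.tar > v2.tgz

# Count differing bytes between the two versions of each format.
tar_delta=$(cmp -l v1.tar v2.tar | wc -l)
tgz_delta=$(cmp -l v1.tgz v2.tgz | wc -l)
echo "tar: $tar_delta differing bytes; tgz: $tgz_delta differing bytes"
```

The plain tars differ only in a handful of bytes (a few header fields plus the appended line), while the gzip streams diverge almost entirely after the first changed byte, which is what defeats delta storage for compressed blobs.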