Skip to content
This repository has been archived by the owner on Jan 13, 2024. It is now read-only.

Compression with fix for #1191 & #1192 #1200

Merged
merged 9 commits into from
Jun 8, 2021

Conversation

erossignon
Copy link
Contributor

No description provided.

@erossignon erossignon requested a review from jesec May 31, 2021 20:52
@erossignon erossignon marked this pull request as draft May 31, 2021 20:52
@erossignon erossignon force-pushed the Feature/CompressionAgain branch 2 times, most recently from 7d045da to 31e408f Compare May 31, 2021 21:01
@erossignon erossignon marked this pull request as ready for review May 31, 2021 21:58
@erossignon erossignon force-pushed the Feature/CompressionAgain branch from b237f71 to 13d6fc4 Compare June 1, 2021 06:31
@erossignon erossignon marked this pull request as draft June 1, 2021 07:03
@erossignon erossignon force-pushed the Feature/CompressionAgain branch 11 times, most recently from 77d2914 to 3d37021 Compare June 1, 2021 20:44
@erossignon erossignon marked this pull request as ready for review June 1, 2021 21:32
@erossignon erossignon changed the title Feature/compression again Compression with fix for #1191 & #1192 Jun 1, 2021
@erossignon
Copy link
Contributor Author

fixes also #1202

@jesec
Copy link
Contributor

jesec commented Jun 2, 2021

I am doing a line-by-line review. It would take some time.

The change is giant. I think it would make it easier to review if we can split some less related changes to separate commits, and possibly get them merged first.

Additionally, I am thinking about a fd-mapping approach to replace the uncompressExternallyAndOpen.

@erossignon
Copy link
Contributor Author

The pr comes in 3 commit
A big one that reverts the revert of the 3 commits of compression that was already validated and 2 smaller commits that fixes #1191 and #1192.
I suggest to look at each individual commits .

@jesec
Copy link
Contributor

jesec commented Jun 2, 2021

I would like to re-review the compression change. A giant change is not reviewable, and it would be easier to review if we split it into several smaller changes.

We can also get some changes that are not directly related to compression merged first.

@erossignon
Copy link
Contributor Author

erossignon commented Jun 2, 2021

That makes sense to unsquash and reorganise the commit while we are here! Agreed

@jesec
Copy link
Contributor

jesec commented Jun 2, 2021

That makes sense to unsquash and reorganise the commit while we are here! Agreed

Great! Please review #1204 when you have time.

@drothmaler
Copy link

Nice work, I'm really looking forward to the compression feature. I'm using Globalize.js and it requires a lot of CLDR JSON data, that increases the size of the bundle a lot, so compression would come in very handy :-)

I think it would be good to also be able to configure the compression using package.json / x.config.json, like you can do for all the other parameters...

@Hypfer
Copy link
Contributor

Hypfer commented Jun 2, 2021

Unfortunately, the code as it is here now doesn't seem to work with my application.
When I try to request one of the static files served by expressjs, I get one of these messages on stdout

Decompression failed
    at BrotliDecoder.zlibOnError [as onerror] (node:zlib:190:17) {
  errno: -2,
  code: 'ERR_RESERVED',
  expose: false,
  statusCode: 500,
  status: 500
}
incorrect header check
    at Zlib.zlibOnError [as onerror] (node:zlib:190:17) {
  errno: -3,
  code: 'Z_DATA_ERROR',
  expose: false,
  statusCode: 500,
  status: 500
}

I'd provide more logs but there are none

@erossignon
Copy link
Contributor Author

Unfortunately though, I did discover two new issues. I've updated the example repo to feature these.

@Hypfer => fixed now

@ayosec
Copy link

ayosec commented Jun 5, 2021

I tried this patch with mathjax-node-cli and I didn't find any issue.

Something unexpected is how slow is Brotli:

Compression Time to build Final size
None 8 seconds 111 Mib
Gzip 15 seconds 62 Mib
Brotli 142 seconds 58 Mib
Steps to reproduce
  1. Download and build this patch.

    $ docker run --rm -ti node bash
    
    # git clone --depth 1 https://github.com/vercel/pkg.git /tmp/pkg
    
    # cd /tmp/pkg
    
    # git fetch origin refs/pull/1200/head:pr1200
    
    # git checkout pr1200
    
    # git log -1 --pretty=%H
    31c95f8eb86599f06d786c47cb47d7aef5a709c5
    
    # yarn install && yarn build
    ...  
    Done in 3.79s.
  2. Install mathjax-node-cli and build it against the compression methods.

    # npm install -g mathjax-node-cli
    
    # SRC=/usr/local/lib/node_modules/mathjax-node-cli/bin/tex2svg
    
    # time node lib-es5/bin.js -t node16-linux-x64 -o bin $SRC
    > pkg@5.2.1
    
    real        0m8.245s
    user        0m10.345s
    sys         0m1.404s
    
    # time node lib-es5/bin.js -C Gzip -t node16-linux-x64 -o bin-gz $SRC
    > pkg@5.2.1
    compression:  GZip
    
    real        0m14.755s
    user        0m17.924s
    sys         0m1.698s
    
    # time node lib-es5/bin.js -C Brotli -t node16-linux-x64 -o bin-br $SRC
    > pkg@5.2.1
    compression:  Brotli
    
    real        2m22.321s
    user        2m25.101s
    sys         0m2.064s
    
    # ls -1sh bin*
    111M bin
     58M bin-br
     62M bin-gz
  3. Test the generated binaries.

    # time ./bin '\frac{2}{x}' > 1
    
    real        0m0.707s
    user        0m1.225s
    sys         0m0.041s
    
    # time ./bin-gz '\frac{2}{x}' > 2
    
    real        0m0.852s
    user        0m1.355s
    sys         0m0.079s
    
    # time ./bin-br '\frac{2}{x}' > 3
    
    real        0m0.881s
    user        0m1.387s
    sys         0m0.056s
    
    # cmp 1 2 && cmp 2 3 && echo OK
    OK
    
    # head -c 50 1
    <svg xmlns:xlink="http://www.w3.org/1999/xlink" wi

Also, please consider adding LZ4. Since most of the compressed data are JavaScript files, with maximum compression level it is close to Gzip, but much faster.

Example:

# find -type f -name '*js' -exec cat {} + > /tmp/big.js

# ls -sh /tmp/big.js
90M /tmp/big.js

# time lz4 -9 < /tmp/big.js | wc -c | numfmt --to=iec
20M

real	0m2.814s
user	0m2.787s
sys	0m0.026s

# time gzip -9 < /tmp/big.js | wc -c | numfmt --to=iec
17M

real	0m5.421s
user	0m5.380s
sys	0m0.047s

@erossignon
Copy link
Contributor Author

@ayosec LZ4 is not built-in nodejs and would be difficult to implement. Feel free to propose a PR if you are inspired.

@Hypfer
Copy link
Contributor

Hypfer commented Jun 6, 2021

@erossignon Can confirm. Thank you!

Q: Since pkg is now using os.tmpdir() which often is a ramdisk to uncompress all/some(?) files, is there any rough estimate regarding memory overhead of using compression?

@erossignon
Copy link
Contributor Author

erossignon commented Jun 6, 2021

Q: Since pkg is now using os.tmpdir() which often is a ramdisk to uncompress all/some(?) files, is there any rough estimate regarding memory overhead of using compression?

  • a/ files that are accessed with fs.readFile or fs.readFileSync are uncompressed on the fly, in memory. This is the case for all embedded javascript script files.

  • b/ files that are accessed with fs.open/fs.openSync & fs.read/fs.readSync needs to be deflated in a temporary folder beforehand as they can potentially read by random access.

  • c/ files that are accessed with fs.createReadStream used to be uncompressed on the fly, in memory in my first attempt, At this time , fs.createReadStream fallback to the fs.openSync & fs.read scenario and deflated in the tempfolder.

  • d/ binary modules are always copied to a physical disc (regardless of compress flag)

  • scenario b/ & c/ are likely used when the application tries to access 'assets' files.

Note on c:
In my first attempt, createReadStream did uncompress on the fly, in memory. but the mechanism was brittle and convoluted.
On top of this, it did work when createReadStream was passed a non-zero start position and some variation of the internal implementation exists between versions of the node. I have another idea to improve this but I'll do this in a different PR as this one is quite big now.

  • if your packed application doesn't contain asset files, nor binary modules then the temp folder is likely not to be used.

test/test-12-compression/.gitignore Outdated Show resolved Hide resolved
test/test.js Outdated Show resolved Hide resolved
test/test-79-npm/homebridge/homebridge.js Outdated Show resolved Hide resolved
lib/packer.ts Outdated Show resolved Hide resolved
lib/index.ts Outdated Show resolved Hide resolved
prelude/bootstrap.js Show resolved Hide resolved
prelude/bootstrap.js Show resolved Hide resolved
prelude/.prettierrc.yml Outdated Show resolved Hide resolved
prelude/bootstrap.js Outdated Show resolved Hide resolved
prelude/bootstrap.js Show resolved Hide resolved
@erossignon erossignon force-pushed the Feature/CompressionAgain branch 7 times, most recently from b33efb7 to 8e4b63b Compare June 7, 2021 20:15
.eslintrc Outdated Show resolved Hide resolved
@erossignon erossignon force-pushed the Feature/CompressionAgain branch from 8e4b63b to 4c98af3 Compare June 8, 2021 05:47
@erossignon erossignon requested a review from jesec June 8, 2021 05:52
.eslintrc Outdated Show resolved Hide resolved
@erossignon erossignon force-pushed the Feature/CompressionAgain branch from b044246 to aecd42d Compare June 8, 2021 06:26
@jesec jesec merged commit cd5ce74 into vercel:master Jun 8, 2021
@jesec
Copy link
Contributor

jesec commented Jun 8, 2021

@leerob We can release v5.2.2.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants