zlib: inconsistent call to flush callback on oversized buffers #3782

Closed
jasnell opened this issue Nov 11, 2015 · 9 comments
Assignees
Labels
doc Issues and PRs related to the documentations. good first issue Issues that are suitable for first-time contributors. zlib Issues and PRs related to the zlib subsystem.

Comments

@jasnell
Member

jasnell commented Nov 11, 2015

When using an oversized buffer, the flush callback is only sometimes called.

'use strict';
const zlib = require('zlib');

// Note: new Buffer(size) is deprecated in current Node.js; it is kept here
// as written in the original report.
const buf = new Buffer(100000);
const def = zlib.createDeflate({
  highWaterMark: 5,
  level: 2
});
def.on('drain', () => console.log('drained'));
def.write(buf, () => {
  console.log('after write');
  // The 'flushed' line below is only printed on some runs.
  def.flush(zlib.Z_FULL_FLUSH, function(err) {
    console.log('flushed');
  });
  def.end();
});

bash-3.2$ ./node ~/test.js
drained
after write
bash-3.2$ ./node ~/test.js
drained
after write
flushed

Relevant IRC Chat

15:46 jasnell: not yet... the inconsistency seems to have something to do with the size of the buffer
15:46 jasnell: if I drop down to Buffer(10000), it works fine
15:46 chrisdickinson: that's < zlib window size
15:47 chrisdickinson: it might be in limbo waiting for a corresponding read
15:47 thealphanerd: adding a read does give a consistent flush
15:48 jasnell: ok, makes for a rather tricky inconsistency
15:48 jasnell: sometimes it works, sometimes it doesn't, with no indication as to why
15:50 thealphanerd: jasnell: I think you can check to see if write returns false?
15:51 chrisdickinson: ah
15:51 chrisdickinson: might see how many times the handle is written to per-buffer
15:51 chrisdickinson: https://github.com/nodejs/node/blob/v4.2.1/lib/zlib.js#L588-L593

Will be getting back to this to investigate. @chrisdickinson @thealphanerd

Related to: #3534

@jasnell jasnell self-assigned this Nov 11, 2015
@jasnell jasnell added zlib Issues and PRs related to the zlib subsystem. lts-watch-v4.x labels Nov 11, 2015
@jasnell
Member Author

jasnell commented Nov 12, 2015

Further discussion in IRC points to the following: when a large amount of data is passed into zlib, the readable state's highWaterMark may be reached. When this happens, the writable buffer cannot fully flush until the readable side has been read from.
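
A minimal sketch of that mechanism, based on the IRC observation that adding a read gives a consistent flush. The helper name `flushWithReader` and the repeated-'a' payload are assumptions made for illustration and are not part of the original report. The sketch attaches a 'data' listener so the readable side keeps draining, which should let the queued flush complete:

```javascript
'use strict';
const zlib = require('zlib');

// Hypothetical helper (not from the original report): writes `size` bytes,
// queues a full flush, and resolves once the stream has ended.
function flushWithReader(size) {
  return new Promise((resolve, reject) => {
    const def = zlib.createDeflate({ highWaterMark: 5, level: 2 });
    let outputBytes = 0;
    let flushed = false;

    // Consuming the output keeps the readable buffer below highWaterMark,
    // so the flush queued behind the write can be processed.
    def.on('data', (chunk) => { outputBytes += chunk.length; });
    def.on('error', reject);
    def.on('end', () => resolve({ flushed, outputBytes }));

    def.write(Buffer.alloc(size, 'a'), () => {
      def.flush(zlib.constants.Z_FULL_FLUSH, () => {
        flushed = true;
        def.end();
      });
    });
  });
}

flushWithReader(100000).then((result) => {
  console.log('flushed:', result.flushed, 'bytes out:', result.outputBytes);
});
```

Ending the stream from inside the flush callback, rather than right after calling `flush()`, is a design choice of this sketch so that the callback is observed before the stream finishes.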

@jasnell
Member Author

jasnell commented Nov 12, 2015

The fix for this is to improve the documentation and provide an example of dealing with large amounts of compressed data.

@MylesBorins
Contributor

@chrisdickinson / @jasnell please correct me if I am wrong, but doesn't this apply to Duplex / Transform streams in general? If so, should this be documented in the zlib docs or in the general stream documentation?

@chrisdickinson
Contributor

@thealphanerd I'd lean towards noting this in the zlib docs. .flush (as opposed to ._flush) is specific to zlib streams, and doesn't quite do what it says on the tin — it queues a flush (potentially behind many pending writes) instead of immediately flushing.

@MylesBorins MylesBorins added the good first issue Issues that are suitable for first-time contributors. label Mar 30, 2016
@jasnell
Member Author

jasnell commented Apr 12, 2016

@nodejs/documentation

@Knighton910

zlib docs

[Screenshot of the zlib docs, 2016-04-11]

But the docs already note this, don't they?

@addaleax
Member

@lordKnighton Not really… the current docs only refer to how .flush() affects the compression quality.

This is actually not very zlib-specific – the same mechanism is at work in code like this:

const stream = require('stream');
const s = new stream.PassThrough({
  highWaterMark: 5
});
s.write('Hello, World!', () => console.log('Wrote first chunk'));
s.write('', () => console.log('Wrote empty chunk'));

The second callback will not be invoked, because the Readable side’s highWaterMark does exactly what it is supposed to: it indicates that the stream does not want to produce more output once a given number of bytes has been buffered.
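
For contrast, a hedged sketch of the same PassThrough setup with a consumer attached. The function name `writeWithReader` and the `events` array are illustrative assumptions, not part of the original comment. With the readable side draining, both write callbacks are expected to run, in order:

```javascript
'use strict';
const stream = require('stream');

// Hypothetical helper: same writes as in the snippet above, but with the
// readable side being consumed.
function writeWithReader() {
  return new Promise((resolve) => {
    const s = new stream.PassThrough({ highWaterMark: 5 });
    const events = [];

    // resume() switches the stream into flowing mode and discards the data,
    // so the readable buffer does not stay at or above highWaterMark.
    s.resume();

    s.write('Hello, World!', () => events.push('first chunk'));
    s.write('', () => {
      events.push('empty chunk');
      resolve(events);
    });
  });
}

writeWithReader().then((events) => console.log(events));
```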

The only thing that’s a little specific to zlib here is the choice of the method name flush, as it is commonly associated with doing something more or less immediately. I don’t really know what a good addition to the docs could look like, but honestly, I think the description of highWaterMark as “The maximum number of bytes to store in the internal buffer before ceasing to read from the underlying resource” actually does pretty well here on its own…

@addaleax
Member

What @chrisdickinson was referring to was that flush takes effect only after the current set of pending writes has been processed (i.e. on drain). That may actually be worth noting in the documentation? Either way that is actually not related to the original issue here, where there are no pending writes.
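
A small sketch of that ordering, under the assumption that the writes are small enough not to hit backpressure. The function name `flushOrder` and the `order` array are hypothetical names used only for illustration. The flush is queued behind the two pending writes, so its callback should run last:

```javascript
'use strict';
const zlib = require('zlib');

// Hypothetical helper: records the order in which the write and flush
// callbacks are invoked.
function flushOrder() {
  return new Promise((resolve, reject) => {
    const def = zlib.createDeflate();
    const order = [];

    def.on('error', reject);
    def.resume(); // keep the readable side drained

    def.write('a', () => order.push('write 1'));
    def.write('b', () => order.push('write 2'));

    // flush() does not flush immediately; it is processed after the
    // writes that were queued before it.
    def.flush(() => {
      order.push('flush');
      def.end();
      resolve(order);
    });
  });
}

flushOrder().then((order) => console.log(order));
```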

@Knighton910

Nice response, 👍 for the explanation.

MylesBorins pushed a commit that referenced this issue Apr 20, 2016
Describe that `zlib.flush()` may wait for pending writes and
until output is being read from the stream.

Fixes: #3782
PR-URL: #6172
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Robert Jefe Lindstädt <robert.lindstaedt@gmail.com>
MylesBorins pushed a commit that referenced this issue Apr 21, 2016
joelostrowski pushed a commit to joelostrowski/node that referenced this issue Apr 25, 2016
jasnell pushed a commit that referenced this issue Apr 26, 2016
MylesBorins pushed a commit that referenced this issue May 17, 2016
MylesBorins pushed a commit that referenced this issue May 18, 2016
@sam-github sam-github added doc Issues and PRs related to the documentations. and removed doc Issues and PRs related to the documentations. labels Dec 1, 2016
6 participants