Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StringDecoder#end doesn't flush state #16564

Closed
jridgewell opened this issue Oct 27, 2017 · 5 comments
Closed

StringDecoder#end doesn't flush state #16564

jridgewell opened this issue Oct 27, 2017 · 5 comments
Labels
string_decoder Issues and PRs related to the string_decoder subsystem.

Comments

@jridgewell
Copy link
Contributor

jridgewell commented Oct 27, 2017

  • v8.6.0
  • Darwin Kernel Version 16.6.0: Fri Apr 14 16:21:16 PDT 2017; root:xnu-3789.60.24~6/RELEASE_X86_64 x86_64
  • StringDecoder

StringDecoder#end doesn't flush the state, allowing it to accept the completing byte afterwards:

const { StringDecoder } = require('string_decoder');
const decoder = new StringDecoder('utf8');

decoder.write(Buffer.from([0xE2, 0x82])); // => ''
// We're waiting for the final byte (currently at byte 2 of a 3 byte sequence)
// Now, end the buffer because, eg, the client disconnected.
decoder.end(); // => '�' (UTF-8 Replacement Character)

Now, there's our setup. If we've ended the buffer and it has outputted a replacement character, it kind of implies it's back at its initial state. But:

decoder.write(Buffer.of(0x61)); // => '�a'

Or, we could "finish" that 3 byte sequence from above (if we haven't already written the 'a'):

decoder.write(Buffer.of(0xAC)); // => '€'
@mscdex mscdex added the string_decoder Issues and PRs related to the string_decoder subsystem. label Oct 27, 2017
@addaleax
Copy link
Member

@jridgewell Do you have any suggestions on what to do here? Should .write() after .end() throw, like it does for streams?

@TimothyGu
Copy link
Member

TextDecoder resets its state after the equivalent of an end call, and functions as a new TextDecoder instance basically. That sounds fair to me.

@mscdex
Copy link
Contributor

mscdex commented Oct 27, 2017

Make sure to check benchmarks on this one because of how widely string_decoder is used.

@jridgewell
Copy link
Contributor Author

I think throwing is better than the current behavior, but I’d prefer it just reset and behave like a brand new instance.

@thefourtheye
Copy link
Contributor

Proposed a fix in #16594. I chose to reset silently instead of throwing.

jridgewell added a commit to jridgewell/node that referenced this issue Feb 1, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

Refs: nodejs#16594
Fixes: nodejs#16564
MylesBorins pushed a commit that referenced this issue Feb 20, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: #18494
Fixes: #16564
Refs: #16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this issue Feb 21, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: #18494
Fixes: #16564
Refs: #16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this issue Feb 21, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: #18494
Fixes: #16564
Refs: #16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this issue Mar 20, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: #18494
Fixes: #16564
Refs: #16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this issue Mar 28, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: #18494
Fixes: #16564
Refs: #16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this issue Mar 30, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: #18494
Fixes: #16564
Refs: #16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MayaLekova pushed a commit to MayaLekova/node that referenced this issue May 8, 2018
This resets the StringDecoder's state after calling `#end`. Further
writes to the decoder will act as if it were a brand new instance,
allowing simple reuse.

PR-URL: nodejs#18494
Fixes: nodejs#16564
Refs: nodejs#16594
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
abhishekumar-tyagi pushed a commit to abhishekumar-tyagi/node that referenced this issue May 5, 2024
Fixes: nodejs/node#16564

When StringDecoder's `end` is called, it is no longer supposed to wait
for the data. If a `write` call is made after `end`, then the decoder
has to be flushed and treated as a brand new write request.

This patch also introduces a new StringDecoder#reset method, which
simply resets all the internal data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
string_decoder Issues and PRs related to the string_decoder subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants