buffer: fix Buffer.isEncoding() return value for empty string #12847

DavidCai1111 · 2017-05-05T08:00:06Z

Currently Buffer.isEncoding('') returns true, which does not seem to be the preferred behavior:

> Buffer.isEncoding('')
true

This PR fixes that.

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
commit message follows commit guidelines

Affected core subsystem(s)

buffer

mscdex · 2017-05-05T08:40:25Z

lib/buffer.js

@@ -375,6 +375,7 @@ Buffer.compare = function compare(a, b) {

 Buffer.isEncoding = function(encoding) {
  return typeof encoding === 'string' &&
+         encoding !== '' &&


I'd prefer encoding.length > 0 &&

@mscdex Updated, PTAL :=)
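For readers who want the stricter behavior without patching core, here is a minimal userland sketch of the check discussed in this thread. The helper name isNonEmptyEncoding is hypothetical and not part of Node.js; it only combines the length guard suggested above with the existing Buffer.isEncoding() call.

```javascript
'use strict';

// Hypothetical helper, not part of Node.js core: rejects non-strings and
// the empty string up front, then defers to Buffer.isEncoding().
function isNonEmptyEncoding(encoding) {
  return typeof encoding === 'string' &&
         encoding.length > 0 &&
         Buffer.isEncoding(encoding);
}

console.log(isNonEmptyEncoding('utf8'));  // true
console.log(isNonEmptyEncoding(''));      // false
console.log(isNonEmptyEncoding('utf9'));  // false
```

Because the guard runs before Buffer.isEncoding(), the result for '' should not depend on how a given Node.js version treats the empty string.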

jasnell · 2017-05-05T08:56:07Z

Hmm... given that '' is generally equivalent to undefined, and the behavior is to default to utf-8 when encoding is undefined, I actually think this is expected behavior. It may not be the preferred behavior, tho. Either way, I think this would need to be a semver-major change.

lpinca · 2017-05-05T09:04:22Z

test/parallel/test-buffer-isencoding.js

@@ -17,7 +17,8 @@ const assert = require('assert');
    assert.strictEqual(Buffer.isEncoding(enc), true);
  });

-[ 'utf9',
+[ '',
+  'utf9',


Unrelated to this change but I think it makes sense to add undefined.

@lpinca Done :=)

and null, please

@sam-github Done, updated :=)

I also think that after line 35 the encoding should actually be used (perhaps with Buffer.from()?). It's important that isEncoding() agrees with the actual behaviour of APIs that use encodings, and it's not clear to me ATM that it does, so I think the assert should ensure consistency.

@sam-github So I think the problem we are facing now is that the APIs which use encodings identify invalid encodings inconsistently, and we can't tell whose behavior is actually right?

I have no idea, I just did a quick check, and '' was accepted as an encoding, but you said my node version was not master, so perhaps my check was wrong? I'm not entirely clear on what the purpose of the API is, but if it is intended to find encoding values that don't work, it should agree with usage.

So, are various Buffer APIs inconsistent? If so, that would be a good thing to fix! :-)
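One way to make the consistency question above concrete is to list the inputs on which Buffer.isEncoding() and Buffer.from() disagree. This is only a sketch: bufferFromAccepts and findDisagreements are hypothetical helper names, and which inputs disagree for '', null, or undefined may depend on the Node.js version in use.

```javascript
'use strict';

// Hypothetical helper: true if Buffer.from() accepts `enc` without throwing.
function bufferFromAccepts(enc) {
  try {
    Buffer.from('ab', enc);
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: returns the entries for which Buffer.isEncoding()
// and Buffer.from() give different answers.
function findDisagreements(encodings) {
  return encodings.filter(
    (enc) => Buffer.isEncoding(enc) !== bufferFromAccepts(enc)
  );
}

// Clearly valid and clearly invalid names should produce no disagreement.
console.log(findDisagreements(['utf8', 'hex', 'utf9']));  // []
```

Passing the edge cases from this thread, for example findDisagreements(['', null, undefined]), shows where the two APIs diverge on a particular build.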

cjihrig

LGTM. I thought this was by design, but I'm not sure. cc: @nodejs/buffer

DavidCai1111 · 2017-05-05T09:11:26Z

@jasnell Hum... so I've edited the PR description (s/expected/preferred/) and added a semver-major label to it 🤔

benjamingr · 2017-05-05T12:05:13Z

I think this behavior is by-design. isEncoding is described as:

Returns true if encoding contains a supported character encoding, or false otherwise.

Since '' can be passed as an encoding, it should return true.

benjamingr · 2017-05-05T12:06:16Z

I'm fine with this change if the CTC signs off on it and CITGM passes without incident. I think the current behavior is marginally better - but I have no strong feelings about that - and I want the change to be an informed decision by the project.

sam-github · 2017-05-05T12:45:35Z

I have the same questions as above; '' appears to be a valid encoding:

> Buffer.from("ab", '')
<Buffer 61 62>
> Buffer.from("ab", 'sam')
TypeError: "encoding" must be a valid string encoding
    at fromString (buffer.js:199:11)
    at Function.Buffer.from (buffer.js:104:12)
    at repl:1:8
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)
    at REPLServer.emit (events.js:191:7)
> Buffer.from("ab", '')
<Buffer 61 62>
> Buffer.from("ab", undefined)
<Buffer 61 62>
> Buffer.from("ab", null)
<Buffer 61 62>

lpinca · 2017-05-05T12:59:43Z

@sam-github so are null and undefined in that case but

> Buffer.isEncoding(undefined)
false
> Buffer.isEncoding(null)
false
>

DavidCai1111 · 2017-05-05T13:58:25Z

@sam-github And when it comes to buf.toString on the master branch, '' and null both become invalid:

> Buffer.from('foo').toString('')
TypeError: Unknown encoding:
    at stringSlice (buffer.js:558:9)
    at Buffer.toString (buffer.js:594:10)
    // ...

> Buffer.from('foo').toString(null)
TypeError: Unknown encoding: null
    at stringSlice (buffer.js:558:9)
    at Buffer.toString (buffer.js:594:10)
    // ...

> Buffer.from('foo').toString(undefined)
'foo'
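The REPL sessions in this thread can be gathered into one probe that reports, for each input, how Buffer.isEncoding(), Buffer.from(), and buf.toString() respond. This is a sketch with hypothetical helper names (succeeds, probe); no expected output is given for '', null, or undefined because the results vary between Node.js versions.

```javascript
'use strict';

const util = require('util');

// Hypothetical helper: true if calling `fn` does not throw.
function succeeds(fn) {
  try {
    fn();
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: records how three Buffer APIs treat `enc`.
function probe(enc) {
  return {
    isEncoding: Buffer.isEncoding(enc),
    from: succeeds(() => Buffer.from('foo', enc)),
    toString: succeeds(() => Buffer.from('foo').toString(enc)),
  };
}

// Print one row per candidate encoding for side-by-side comparison.
for (const enc of ['', null, undefined, 'utf8', 'utf9']) {
  console.log(util.inspect(enc), probe(enc));
}
```

Running this on the Node.js build under discussion gives one row per input, which makes any mismatch between the three APIs visible at a glance.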

DavidCai1111 · 2017-05-05T14:31:11Z

CI: https://ci.nodejs.org/job/node-test-pull-request/7897/

sam-github · 2017-05-05T14:50:34Z

test/parallel/test-buffer-isencoding.js

@@ -17,7 +17,8 @@ const assert = require('assert');
    assert.strictEqual(Buffer.isEncoding(enc), true);
  });

-[ 'utf9',
+[ '',
+  'utf9',


I also think that after line 35 the encoding should actually be used (perhaps with Buffer.from()?). It's important that isEncoding() agrees with the actual behaviour of APIs that use encodings, and it's not clear to me ATM that it does, so I think the assert should ensure consistency.

BridgeAR · 2017-08-27T00:44:09Z

@DavidCai1993 this needs a rebase and there is one open comment

shinnn · 2017-09-19T15:15:47Z

Note:

This PR makes almost the same change as #9661 (buffer: empty string is not a valid encoding), though #9661 is already closed.
It's probably better to first reach a consensus on the design of Buffer.isEncoding in #9654 (Buffer.isEncoding regards an empty string as a valid encoding). The goal of this PR seems different from the most supported suggestion in #9654:

Another alternative is to simply state in the documentation that Buffer.from() accepts them from legacy reasons but that Buffer.isEncoding() is the one source of truth. Not a bad option either.

BridgeAR · 2017-11-22T12:33:07Z

Closing due to long inactivity. There is actually little left to do besides rebasing and addressing the single open comment, so it would make sense to reopen this or to open a new PR if someone wants to follow up.

nodejs-github-bot added the buffer label (Issues and PRs related to the buffer subsystem.) May 5, 2017

mscdex reviewed May 5, 2017

DavidCai1111 force-pushed the buffer/is-encoding-empty-string branch from 3acad90 to 9e55470 May 5, 2017 08:59

DavidCai1111 added the semver-major label (PRs that contain breaking changes and should be released in the next major version.) May 5, 2017

lpinca reviewed May 5, 2017

lpinca approved these changes May 5, 2017

cjihrig approved these changes May 5, 2017

DavidCai1111 force-pushed the buffer/is-encoding-empty-string branch from 9e55470 to 539dd23 May 5, 2017 09:23

buffer: fix isEncoding result for empty string

78b41da

DavidCai1111 force-pushed the buffer/is-encoding-empty-string branch from 539dd23 to 78b41da May 5, 2017 14:26

sam-github suggested changes May 5, 2017

BridgeAR closed this Nov 22, 2017
