Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: clarify out-of-bounds behavior of buf[index] #11286

Merged
merged 1 commit into from
Mar 28, 2017
Merged

doc: clarify out-of-bounds behavior of buf[index] #11286

merged 1 commit into from
Mar 28, 2017

Conversation

seishun
Copy link
Contributor

@seishun seishun commented Feb 10, 2017

Fixes: #11244

Checklist
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

doc

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. doc Issues and PRs related to the documentations. labels Feb 10, 2017
@seishun
Copy link
Contributor Author

seishun commented Feb 10, 2017

cc @joliss @ronkorving

Since this operator is inherited from `Uint8Array`, its behavior on
out-of-bounds access is exactly the same - i.e. getting returns `undefined`,
setting does nothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I may suggest rewording a bit:

The value `undefined` will be returned when `index` is less than `0` or
greater than `buf.length`.

I'm not sure there's much value is calling out the inheritance from Uint8Array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems more wordy than necessary, and I'd have to repeat the same thing about setting. Also it would have to be "greater or equal".

I'm not sure there's much value is calling out the inheritance from Uint8Array

There are finer nuances in its behavior, for example how indexed access on a Buffer always ignores the prototype chain. Referencing Uint8Array means people know where to read about them if they're interested.

Copy link
Member

@Trott Trott Feb 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Don't use i.e. Too many people think it means "for example". (It means "that is" or "in other words".) Just spell out "for example" if that's what you mean or "in other words" if that's what you mean.

  • Don't use a - for a long stop. Just use a period and capitalize the next sentence.

Removing a few other unnecessary words yields something like this:

This operator is inherited from `Uint8Array`. Its behavior on out-of-bounds access is the
same as `UInt8Array`. For example,  getting returns `undefined` and setting does nothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nikolai uses i.e. correctly here as far as I can tell: "id est", not "exempli gratia."

Copy link
Member

@Trott Trott Feb 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nikolai uses i.e. correctly here as far as I can tell: "id est", not "exempli gratia."

Good to know, but the fact that I couldn't tell kind of proves my point, I think.

I'm concerned about the reader either thinking i.e. means "for example" or else knowing better but not being certain that the author does. That second one is my experience pretty much every time I see i.e. on the web, especially in open source documentation. I have to stop and figure out if the author really meant i.e. or if they meant "for example". And sometimes it's hard to tell. Here's an example from our own docs (addons.md):

Because the exact path to the compiled Addon binary can vary depending on how
it is compiled (i.e. sometimes it may be in ./build/Debug/), Addons can use
the [bindings][] package to load the compiled module.

Reading that, I'm not sure if ./build/Debug/ is the only alternative to the implied canonical ./build/Release/ or if it's possible that there are other locations too. Even if i.e. is being used correctly in this case, it results in confusion and ambiguity (for me, at least) because I'm not sure that it's being used correctly.

For that reason, I always suggest eliminating i.e. (and e.g.) and just writing out what you mean to say. With i.e., a good replacement depends on context. It may be nothing at all (just remove i.e.) or "that is" or "that is to say" or "in other words" or even something like "therefore".

That said, I'm OK with i.e. if people think it's the better option. I prefer we don't use it, and I've explained above, but reasonable people can certainly have a different opinion than mine on this. :-D

@gibfahn
Copy link
Member

gibfahn commented Feb 12, 2017

This operator is inherited from `Uint8Array`, so its behavior on out-of-bounds
access is the same as `UInt8Array`. That is, getting returns `undefined` and
setting does nothing.

@Trott I actually find this less clear than it was before, IMHO the - i.e. made the link between the clauses clear (although I guess you could use a colon instead if you felt really strongly about it).


Don't use i.e. Too many people think it means "for example". (It means "that is" or "in other words".) Just spell out "for example" if that's what you mean or "in other words" if that's what you mean.

Are you saying that it shouldn't be used because lots of people won't understand it if they read it, or because you think using it here is wrong? If it's the first, I find i.e. much clearer here, and if someone misreads it as e.g. then that's not the end of the world here. If it's the second, then as (@bnoordhuis said) @seishun used it correctly AFAICT.


Don't use a – for a long stop. Just use a period and capitalize the next sentence.

Why? It seems like a good way to split two related parts of the sentence.


We seem to have a lot of rules about writing good docs, many of which seem quite prescriptive to me (although I'm no expert in writing documentation for an international audience). If we're going to ban things like i.e., e.g., and apostrophes, then we should document that, probably in a writing-docs.md in doc/guides.

My feeling is that on the whole we should avoid complex grammar (as that can be almost impossible to decipher for non-native speakers), but that vocabulary is generally not as much of a problem (it's usually a dictionary lookup). However it'd be good to hear from non-native English speakers.

@sam-github
Copy link
Contributor

This operator is inherited from Uint8Array, so its behavior on out-of-bounds
access is the same as UInt8Array. That is, getting returns undefined and
setting does nothing.

I'm not an expert in English grammar, but I don't think that second sentence is actually grammatically correct, can you start a sentence like that, with "That is, "?

@crandmck is an expert in grammar, and tech writing, and deeply involved in Express.js docs, do you want to weigh in on this?

I do think that if we are going to be very prescriptive on the doc wording, not just on technical accuracy (which we should be very picky about), but stylistic issues, it would be good to have some kind of shared understanding of the style, though I'm not sure exactly how to achieve that.

@sam-github
Copy link
Contributor

And that's what I'd be most interested in, @crandmck do you have any suggestions on achieving a common "voice" among such a disparate set of authors PRing the nodejs docs? Should we even worry about it? What is express doing, relying on you to copy edit? ;-)

@Trott
Copy link
Member

Trott commented Feb 12, 2017

@Trott I actually find this less clear than it was before, IMHO the - i.e. made the link between the clauses clear (although I guess you could use a colon instead if you felt really strongly about it).a

@gibfahn I definitely prefer i.e. to a colon here if those are the only choices. Maybe there's another option, though. How about a Therefore to make the connection clear?

Are you saying that it shouldn't be used because lots of people won't understand it if they read it, or because you think using it here is wrong?

Maybe I'm extrapolating too much from my own personal experience, but if I'm reading docs not written by a professional technical writer, and I see i.e., I often have to stop and figure out if the person is using it correctly or not. The fact that you know it's being used correctly here is all well and good, but what about someone who is not as knowledgable and therefore can't be so sure?

That said, if i.e. makes it easier to understand in your opinion, I'll defer to it. I have an opinion here, but this is not the place I'm going to make a stand. :-D

Why? It seems like a good way to split two related parts of the sentence.

Again, this is a nit and I'm totally willing to let it go, but here's the reason I recommend avoiding -:

Using a - invites typographic bike-shedding, much of which is probably region-dependent. I personally dislike - and consider it "wrong" because to me, that's a hyphen being used as a dash. In my world, it ought to be an em dash and have no spaces around it. But that might be specific to the US or to academic writing in the US or whatever. (EDIT: Our own style guide linked below says to use em dash surrounded by spaces, so according to our style guide, I got it half right.)

I'd prefer we avoid it altogether and use . and , where appropriate. If you really want to connect two clauses with punctuation suggesting a closer relationship, then a semicolon is a safe choice. And if you want to set off an aside, parentheses at least have the advantage of unambiguous typography and mostly-agreed-upon rules for usage that seem to elude hyphens/dashes in many situations.

then we should document that, probably in a writing-docs.md in doc/guides.

We do have a style guide at https://github.com/nodejs/docs/blob/master/STYLE-GUIDE.md.

My feeling is that on the whole we should avoid complex grammar (as that can be almost impossible to decipher for non-native speakers), but that vocabulary is generally not as much of a problem (it's usually a dictionary lookup). However it'd be good to hear from non-native English speakers.

Definitely agree 100% with all of that paragraph.

@Trott
Copy link
Member

Trott commented Feb 12, 2017

I'm not an expert in English grammar, but I don't think that second sentence is actually grammatically correct, can you start a sentence like that, with "That is, "?

@sam-github Maybe Therefore would be better than That is?

it would be good to have some kind of shared understanding of the style, though I'm not sure exactly how to achieve that.

There is a style guide at https://github.com/nodejs/docs/blob/master/STYLE-GUIDE.md but it's not very easily discoverable there. I like @gibfahn's idea of having a style guide in doc/guides in the main repo. Wherever it lives, we should probably link to it in CONTRIBUTING.md and maybe a few other places.

@gibfahn
Copy link
Member

gibfahn commented Feb 12, 2017

@Trott

There is a style guide at https://github.com/nodejs/docs/blob/master/STYLE-GUIDE.md but it's not very easily discoverable there.

😆 Yeah I had no idea that existed. I'll PR moving it to doc/guides and adding some links.

Using . Therefore getting here seems like the best of both worlds (to me).


@crandmck is an expert in grammar, and tech writing, and deeply involved in Express.js docs, do you want to weigh in on this?

Whether or not we move the Style Guide into the main repo it'd be great to have @crandmck look at it.

@seishun
Copy link
Contributor Author

seishun commented Feb 12, 2017

I'm not a huge fan of "therefore". It implies a causal relationship where there isn't one. Here's a more obvious example: "The behavior of Buffer#indexOf when the element isn't found is the same as Array#indexOf. Therefore, it returns -1".

@crandmck
Copy link

crandmck commented Feb 13, 2017

... do you have any suggestions on achieving a common "voice" among such a disparate set of authors PRing the nodejs docs? Should we even worry about it? What is express doing, relying on you to copy edit? ;-)

Other than having an individual person who is a good editor, your best bet is to have a comprehensive style guide. What you have in https://github.com/nodejs/docs/blob/master/STYLE-GUIDE.md is a start, but needs lots of work.
In particular, it needs to be organized into logical sections (grammar and usage, terminology, formatting, and so on). Developing a good style guide is non-trivial and typically it evolves over time. That being said, it's been done before, so there are lots of examples out there.

Express does have a minimal documentation style guide. I know we could definitely improve it as well...

The reality is that even having an extensive style guide won't eliminate all controversy over wording. Oftentimes there is more than one "right" answer and it comes down to a matter of opinion.

In general, my approach with usage is to err on the side of caution. Even things that seem obvious to English-speakers may be problematic for non-English speakers (or non-native speakers). For this reason, in Express we recommend avoiding Latin abbreviations ("e.g.," "i.e.," "etc.") for clarity. While it may be true that one can easily look up the meanings of these, that just adds "friction" to comprehension. Better to just make the wording dead simple where possible.


This operator is inherited from Uint8Array, so its behavior on out-of-bounds
+access is the same as UInt8Array. That is, getting returns undefined and
+setting does nothing.

Not to beat a dead horse, but since you asked, I'll give you my two cents worth on this particular PR. To me, the most problematic thing here is the phrase "...getting returns.." which is awkward and confusing. It's also best to specify what "this" means, provide links for references, and use active voice when possible. So (IIUC) I'd rewrite as:

The [index] operator comes from Uint8Array, so its behavior on out-of-bounds access is the same as UInt8Array. That is, getting the value an out-of-bounds element returns undefined and setting the value an out-of-bounds element does nothing.

@Trott
Copy link
Member

Trott commented Feb 14, 2017

The [index] operator comes from Uint8Array, so its behavior on out-of-bounds access is the same as UInt8Array. That is, getting the value an out-of-bounds element returns undefined and setting the value an out-of-bounds element does nothing.

That seems 👍 to me. I'd prefer adding of in two places: getting the value of an out-of-bounds element and setting the value of an out-of-bounds element. But either way.

@crandmck
Copy link

@Trott

I'd prefer adding of in two places

You are absolutely right. That was a typo/omission on my part.

@ronkorving
Copy link
Contributor

Let me go and LGTM. I think the language used is completely understandable. Does every documentation addition go through this level of scrutiny? :)

@seishun
Copy link
Contributor Author

seishun commented Feb 14, 2017

That is, getting the value an out-of-bounds element returns undefined and setting the value an out-of-bounds element does nothing.

I agree that "getting returns" is a bit clunky, but repeating "out-of-bounds" three times is worse IMO. And there's technically no such thing as an "out-of-bounds element". It's called out-of-bounds because there are no elements there.

@Trott
Copy link
Member

Trott commented Feb 14, 2017

@ronkorving asked:

Does every documentation addition go through this level of scrutiny?

No. It's usually much, much worse.

@seishun wrote:

I agree that "getting returns" is a bit clunky, but repeating "out-of-bounds" three times is worse IMO.

FWIW, out-of-bounds three times doesn't bother me. At least it's easy to understand.

And there's technically no such thing as an "out-of-bounds element".

out-of-bounds index would seem to be correct, although it would require some additional re-wording. Maybe something like this?:

The [index] operator comes from Uint8Array, so its behavior on out-of-bounds access is the same as UInt8Array. Attempting to get the value at an out-of-bounds index will result in undefined. Attempting to set the value at an out-of-bounds index does nothing.

(I'm reconsidering my previous opposition to using a colon after UInt8Array. I contain multitudes.)

@TimothyGu
Copy link
Member

Can we reach a conclusion here?

For documentation I would err on the side of clarity, even if doing so sacrifices brevity. @Trott's wording SGTM.

@Trott
Copy link
Member

Trott commented Mar 20, 2017

FWIW, although I had a bunch of comments, I do not object to this just landing as-is. I would prefer the wording I suggested, of course, but not so strongly that I would stop this over it.

@addaleax
Copy link
Member

This still LGTM, with the wording in this PR or with @Trott’s wording.

@seishun I’d say you can feel free to pick either and land this?

@fhinkel
Copy link
Member

fhinkel commented Mar 26, 2017

@seishun Can you squash the commits and land this? Thanks!

@seishun
Copy link
Contributor Author

seishun commented Mar 26, 2017

I decided to bring back the dash following @gibfahn's and @sam-github's comments (it doesn't look right with a colon). If no one feels strongly about it, I'll merge with the current wording.

PR-URL: #11286
Fixes: #11244
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ron Korving <ron@ronkorving.nl>
@seishun seishun merged commit 01ffe30 into nodejs:master Mar 28, 2017
@seishun
Copy link
Contributor Author

seishun commented Mar 28, 2017

Landed in 01ffe30.

@seishun seishun deleted the doc-oob branch March 28, 2017 08:43
MylesBorins pushed a commit that referenced this pull request Mar 28, 2017
PR-URL: #11286
Fixes: #11244
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ron Korving <ron@ronkorving.nl>
@MylesBorins MylesBorins mentioned this pull request Mar 28, 2017
@italoacasas italoacasas mentioned this pull request Apr 10, 2017
2 tasks
@MylesBorins
Copy link
Contributor

is this applicable to lts?

@seishun
Copy link
Contributor Author

seishun commented Apr 19, 2017

I think it is.

addaleax pushed a commit that referenced this pull request Apr 19, 2017
PR-URL: #11286
Fixes: #11244
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ron Korving <ron@ronkorving.nl>
MylesBorins pushed a commit that referenced this pull request Apr 24, 2017
PR-URL: #11286
Fixes: #11244
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ron Korving <ron@ronkorving.nl>
MylesBorins pushed a commit that referenced this pull request Apr 24, 2017
PR-URL: #11286
Fixes: #11244
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ron Korving <ron@ronkorving.nl>
@MylesBorins MylesBorins mentioned this pull request Apr 29, 2017
andrew749 pushed a commit to michielbaird/node that referenced this pull request Jul 19, 2017
PR-URL: nodejs/node#11286
Fixes: nodejs/node#11244
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ron Korving <ron@ronkorving.nl>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. doc Issues and PRs related to the documentations.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Documentation: Unclear if Buffer buf[i] is bounds-checked