Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a --warn-on-late-sync flag for tracing synchronous calls #1674

Closed
benjamingr opened this issue May 11, 2015 · 22 comments
Closed

Provide a --warn-on-late-sync flag for tracing synchronous calls #1674

benjamingr opened this issue May 11, 2015 · 22 comments
Labels
feature request Issues that request new features to be added to Node.js.

Comments

@benjamingr
Copy link
Member

What

io.js should provide a --warn-on-late-sync that would detect whenever a *Sync method has been called after the first full round of the event loop and emit a warning to stderr.

setTimeout(function(){
     fs.readFileSync("/foo/bar");
})
$ io.js file.js --warn-on-late-sync

Warning fs.readFileSync (file.js 3:7) called after a full turn of the event loop has completed. This may cause performance bottlenecks in your code (option --warn-on-late-sync). Please see http://iojs.org/... for more details.

Comments about semantics

  • Warnings are only emitted after a full turn of the event loop has happened.
  • Warnings are not emitted in case the process is terminating (that is, in process.on("exit" code)
  • Warnings are emitted to the stderr stream, preferably with a stack trace

    Why

This flag would be very useful at finding out performance bottlenecks of hidden *Sync calls and alerting users as well as hopefully explaining what they can do about it (switch to non-*Sync) methods.

Breaking changes

This issue contradicts #1665 by making it opt-in and not opt out, the benefit is that it enables a lot of what that issue enables without breaking any existing code or forcing users to do anything about it.

Future PoV

It can also be a start for a future --pedantic flag that warns against the most common performance pitfalls node servers have that can't be alerted from the userland.

(I opened this issue per @Fishrock123 's request of opening a separate issue)

I'm open to naming suggestions of course, I'd like this to be brought up in the TC to see if the committee doesn't find any implementation limitations or compelling reasons not to do this before any work is initiated. This is of course in addition to any comments being welcome on this issue

@mikeal
Copy link
Contributor

mikeal commented May 11, 2015

could we do "warn on late sync" instead?

@benjamingr
Copy link
Member Author

@mikeal that was the idea, now that I read it the name is confusing I'll edit it.

@benjamingr benjamingr changed the title Provide a --no-late-sync flag for tracing synchronous calls Provide a --warn-on-late-sync flag for tracing synchronous calls May 11, 2015
@Fishrock123 Fishrock123 added the feature request Issues that request new features to be added to Node.js. label May 11, 2015
@ChALkeR
Copy link
Member

ChALkeR commented May 11, 2015

While this would be useful for debugging, it could never provide full coverage of various code paths (because of being a runtime flag). But that's probably inevitable.

An alternative to that that would allow static check would be to provide a *Sync0 (for example) versions of all *Sync methods that would work only on the first tick. That would be an opt-in then for all the module authors, and could be checked with a single grep. But that solution would be incompatible with previous runtime versions and too bloated. Any other ideas?

@benjamingr
Copy link
Member Author

An alternative to that that would allow static check would be to provide a *Sync0 (for example) versions of all *Sync methods that would work only on the first tick.

If we're adding methods it's just as easy to provide a userland solution to do this which does something like:

// just one example
var fs = require("fs")

var firstTick = true; // minus when required issue, ask to require on first go
setImmediate(function(){
    firstTick = false; // nest or wait more or whatever.
})
module.exports.readFileSync = function(path){
    if(!nextTick && processIsNotTerminating()) console.error("Warning......     ");
    return fs.readFileSync(path);
};

Or something along these lines. The only reason I'm asking for this is because such a solution would not be very practical in practice for people who're using fs.

@trevnorris
Copy link
Contributor

Would this work at the process.binding() level, or just at the require() level?

@benjamingr
Copy link
Member Author

@trevnorris sorry for the ignorance, but what difference does it make from a user PoV?

@trevnorris
Copy link
Contributor

@benjamingr process.binding() gives direct access to the C++ guts. We say it shouldn't be used by application developers, but it is sometimes. Basically your userland module example wouldn't have a clue that I'm doing sync read/write if I performed all operations via process.binding('fs')

@benjamingr
Copy link
Member Author

Thanks, from what I can tell process.binding("fs") (only checked FS) does not expose any synchronous methods. I'm probably not one to judge or make a call though since I don't understand internals well enough.

@trevnorris
Copy link
Contributor

@benjamingr you just omit the callback to process.binding('fs').read() and it's sync.

@benjamingr
Copy link
Member Author

Thanks, I didn't know that. That makes your question make sense to me now.

This means a balance between catching more use cases and touching the C++
code. Would it mean anything performance wise or code-complexity to do this
at the process.binding level?

On Mon, May 11, 2015 at 9:41 PM, Trevor Norris notifications@github.com
wrote:

@benjamingr https://github.com/benjamingr you just omit the callback to
process.binding('fs').read() and it's sync.


Reply to this email directly or view it on GitHub
#1674 (comment).

@trevnorris
Copy link
Contributor

not really. honestly we could just add an fprintf(stderr, "you're using sync!!!") to the C++ code if no callback is passed. though we'd also need to include the stack trace.

@benjamingr
Copy link
Member Author

Then it sounds like doing it at the process.binding level makes the most sense :) Are there any C++ variables that detail that an event loop turn has already passed?

@trevnorris
Copy link
Contributor

Nope. We'd have to implement the logic internally to call a callback after the first full run of the event loop and then do a check in the C++ call like env->first_loop_complete().

@vkurchatkin
Copy link
Contributor

I think it should be on js side. process.binding is private, no problem here

@trevnorris
Copy link
Contributor

@vkurchatkin Sure. I just want to make sure all participants in this discussion are aware that it's not a proper safety mechanism. More a convince check.

@ChALkeR
Copy link
Member

ChALkeR commented May 11, 2015

@trevnorris Well, nothing would ever be a «proper» safety mechanism here. :-) One can always make a native module that does sync stuff. But that's ok, isn't it? I am not even sure if process.binding needs to be taken into an account. Maybe just simply hooking to *Sync will do?

@trevnorris
Copy link
Contributor

@ChALkeR My definition of "proper" is that something cannot be performed in any way from just JS. It's a very clear line that's easy to check. :) And as noted in this thread @benjamingr didn't realize that a user could bypass the normal JS check. My mentioning this was really so everyone involved realized it was possible.

@CrabDude
Copy link

👍 My preference would be for this to be opt-out eventually, but --pedantic may be more pragmatic.

While preference should be given for userland implementations, that's not the current standard (despite claims otherwise). This is solved/solveable in userland, but so are http, https, url, parse, etc... etc...

In general, *Sync calls decrease performance by 100% for IO-bound tasks (See my comment on 1665) and are typically buried in package methods not properly labeled with *Sync.

@benjamingr
Copy link
Member Author

@CrabDude

This is solved/solveable in userland, but so are http, https, url, parse, etc... etc...

Well, If you create a userland implementation that solves the issues @trevnorris raised here (access to process.binding) as well as other core modules that expose sync modules (vs. just fs) it would definitely be interesting to see if this can be placed in the official documentation and/or gain some sort of official recognition. I'm pretty sure this was discussed before (organization adopted modules) and it's definitely something that can be further discussed.

I think no-one disagrees that *Sync undermines the concurrency model of a node/iojs server but it is also as clear that parts of the community use the *Sync methods (even experienced users) and we can't break their code (and we don't even all agree that we should).

The documentation currently says:

In busy processes, the programmer is strongly encouraged to use the asynchronous versions of these calls. The synchronous versions will block the entire process until they complete--halting all connections.

If you can come up with a way that says that in a more convincing tone then you should consider opening a docs PR. I think it would definitely make for an interesting PR and issue, and I'd also definitely be interested in seeing statistics in the wild that people actually do this for servers.

@CrabDude
Copy link

@benjamingr Will do. Thanks for the feedback.

@ChALkeR
Copy link
Member

ChALkeR commented May 15, 2015

Cross-linking: #1707

trevnorris added a commit to trevnorris/node that referenced this issue May 17, 2015
Use the --warn-on-sync flag to print a stack trace whenever a sync
method is used. (e.g. fs.readFileSync()) It does not track if the
warning has occurred at a specific location in the past and so will
print the warning every time.

This does not print the warning for the first synchronous execution of
the script or during the process 'exit' event. Allowing all the
require(), etc. statements to run that are necessary to prepare the
environment.

Fixes: nodejs#1674
PR-URL: nodejs#1707
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Petka Antonov <petka_antonov@hotmail.com>
trevnorris added a commit to trevnorris/node that referenced this issue May 18, 2015
Use the --warn-on-late-sync flag to print a stack trace whenever a sync
method is used after the first tick, excluding during the process exit
event. (e.g. fs.readFileSync()) It does not track if the warning has
occurred at a specific location in the past and so will print the
warning every time.

Reason for not printing during the first tick of the appication is so
all necessary resources can be required. Also by excluding synchronous
calls during exit is necessary in case any data needs to be logged out
by the application before it shuts down.

Fixes: nodejs#1674
PR-URL: nodejs#1707
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Petka Antonov <petka_antonov@hotmail.com>
trevnorris added a commit to trevnorris/node that referenced this issue May 18, 2015
Use the --trace-sync-io flag to print a stack trace whenever a sync
method is used after the first tick, excluding during the process exit
event. (e.g. fs.readFileSync()) It does not track if the warning has
occurred at a specific location in the past and so will print the
warning every time.

Reason for not printing during the first tick of the appication is so
all necessary resources can be required. Also by excluding synchronous
calls during exit is necessary in case any data needs to be logged out
by the application before it shuts down.

Fixes: nodejs#1674
PR-URL: nodejs#1707
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Petka Antonov <petka_antonov@hotmail.com>
trevnorris added a commit to trevnorris/node that referenced this issue May 18, 2015
Use the --trace-sync-io flag to print a stack trace whenever a sync
method is used after the first tick, excluding during the process exit
event. (e.g. fs.readFileSync()) It does not track if the warning has
occurred at a specific location in the past and so will print the
warning every time.

Reason for not printing during the first tick of the appication is so
all necessary resources can be required. Also by excluding synchronous
calls during exit is necessary in case any data needs to be logged out
by the application before it shuts down.

Fixes: nodejs#1674
PR-URL: nodejs#1707
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Petka Antonov <petka_antonov@hotmail.com>
trevnorris added a commit to trevnorris/node that referenced this issue May 18, 2015
Use the --trace-sync-io flag to print a stack trace whenever a sync
method is used after the first tick, excluding during the process exit
event. (e.g. fs.readFileSync()) It does not track if the warning has
occurred at a specific location in the past and so will print the
warning every time.

Reason for not printing during the first tick of the appication is so
all necessary resources can be required. Also by excluding synchronous
calls during exit is necessary in case any data needs to be logged out
by the application before it shuts down.

Fixes: nodejs#1674
PR-URL: nodejs#1707
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Petka Antonov <petka_antonov@hotmail.com>
@trevnorris
Copy link
Contributor

This should be fixed by c1de6d2.

andrewdeandrade pushed a commit to andrewdeandrade/node that referenced this issue Jun 3, 2015
Use the --trace-sync-io flag to print a stack trace whenever a sync
method is used after the first tick, excluding during the process exit
event. (e.g. fs.readFileSync()) It does not track if the warning has
occurred at a specific location in the past and so will print the
warning every time.

Reason for not printing during the first tick of the appication is so
all necessary resources can be required. Also by excluding synchronous
calls during exit is necessary in case any data needs to be logged out
by the application before it shuts down.

Fixes: nodejs/node#1674
PR-URL: nodejs/node#1707
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Petka Antonov <petka_antonov@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Issues that request new features to be added to Node.js.
Projects
None yet
Development

No branches or pull requests

7 participants