process: introduce codeGenerationFromString event #35157
This still feels like the wrong abstraction level. What if we did |
A flag won't be usable on many serverless environments, being able to set it at runtime (like this PR proposes) seems more appropriate. |
My main concern here is that some random library has no business knowing what i'm passing to How about this:
const release = v8.setCodeGenerationFromStringHandler((code) => {});
// ...
release();
Once called, it can't be called again until |
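A minimal sketch of the single-handler idea above, written in plain JavaScript so it runs without the proposed API. `createHandlerSlot`, `setHandler` and `notify` are hypothetical names, and the rule that a second handler can only be installed after `release()` has been called is an assumption about how such an API might behave.

```javascript
// Hypothetical sketch: a single-slot handler registry with a release function.
// None of these names are Node.js or V8 APIs.
function createHandlerSlot() {
  let current = null;

  function setHandler(handler) {
    if (typeof handler !== 'function') {
      throw new TypeError('handler must be a function');
    }
    if (current !== null) {
      throw new Error('a handler is already installed; release it first');
    }
    current = handler;
    let released = false;
    // Only the caller that installed the handler holds this function.
    return function release() {
      if (released) return; // a stale release must not remove a newer handler
      released = true;
      current = null;
    };
  }

  // Stand-in for the point where the runtime would report generated code.
  function notify(code) {
    if (current !== null) current(code);
  }

  return { setHandler, notify };
}

const slot = createHandlerSlot();
const seen = [];
const release = slot.setHandler((code) => seen.push(code));
slot.notify('1 + 1');

// A second install attempt fails while the first handler is active.
let secondInstallError = null;
try {
  slot.setHandler(() => {});
} catch (err) {
  secondInstallError = err;
}

release();
slot.notify('2 + 2'); // no handler installed, nothing is recorded
const releaseAgain = slot.setHandler(() => {}); // allowed again after release
releaseAgain();
```

Because only the installer holds `release`, other code in the process cannot silently replace or observe the handler, which is the property the comment is asking for.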
This is an overall problem with the events we have on |
Agreed it would be great to see this kind of stuff moved out into loaders or a similar API as a higher level execution context. We previously discussed disabling high-level permissions on process, I would still like to see us deprecate more and more features from the process global over time. Even a new builtin module import for these features could allow policy scoping for who has access to it. But however it is cut, likely does seem to require a new API away from process and deprecating the old on process. |
@mmarchini do you have a constraint that this needs to be a process event? If not, I think we should go with the single-callback api. |
No constraints, as long as the handler can be defined during runtime I'm fine with it. |
That's a very fair point. As of today, almost everything in Node.js can be monkeypatched and leak implementation details. For instance. This is true for core modules, but also for built-in such as I am not sure I understand your secondary concern. This current PR does not change the |
@vdeturckheim are you saying you want random libraries to be able to tune into all evals and whatnot in the main context? |
@devsnek well right now, any library can spy on most of the things other pieces of code do. That's pretty much how APMs work. |
@vdeturckheim I can't tell if you're saying you need the event model or not. can you be a bit more direct? afaict the api I proposed above should work. |
@devsnek gotcha, sorry I misunderstood the question ^^. I only need to place one callback for my use case. But I am afraid there might be multiple libraries needing it (say, one RASP for security, one enterprise compliance tool and an APM), and a single callback would make the use of them exclusive. That's why I used an event emitter here. |
I guess in my ideal world I would like to see this:
setCallback((code) => {
  security(code);
  compliance(code);
  apm(code);
})
do people think this is unworkable? |
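The composition suggested above could be sketched as follows. `setCallback` here is a local stand-in rather than a real API, the three handlers are placeholders, and the choice to keep running the remaining handlers when one throws is an assumption made for this example.

```javascript
// Hypothetical stand-in for the proposed single-callback API.
const installed = { callback: null };
function setCallback(fn) {
  installed.callback = fn;
}

// Placeholder handlers standing in for a RASP, a compliance tool and an APM.
const calls = [];
const security = (code) => calls.push(['security', code]);
const compliance = () => {
  throw new Error('compliance check failed');
};
const apm = (code) => calls.push(['apm', code]);

// Combine several handlers into one callback. Each handler runs even if an
// earlier one throws; the errors are collected and returned to the caller.
function composeHandlers(...handlers) {
  return (code) => {
    const errors = [];
    for (const handler of handlers) {
      try {
        handler(code);
      } catch (err) {
        errors.push(err);
      }
    }
    return errors;
  };
}

setCallback(composeHandlers(security, compliance, apm));
const errors = installed.callback('item.foo++');
```

This keeps the wiring in the end user's hands, which is exactly the trade-off debated in the replies: it only works if every vendor agrees to be composed rather than installing its own callback.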
It would require the end user to set this up and assume no library does it without the end user knowing. Overall, I found out that in the scope of instrumentation, end users don't really want to know about implementation and vendors often do their own things without considering the other ones really. |
Well I won't block on this, but it is certainly not ideal. |
@cjihrig thanks for the review, it should be good :) |
@nodejs/process |
I don't understand the use case for this - why would I use this? |
@benjamingr there are multiple reasons why one could want to listen to |
Adding WIP flag as I need to fix a build issue (warning emitted when compiling the callback) |
process.on('codeGenerationFromString', common.mustCall((code) => {
  assert.strictEqual(code, 'item.foo++');
}));
As @vdeturckheim pointed out in a comment on this PR, there's something not working as intended in this callback:
If an error is thrown inside of this callback, it doesn't always get picked up. In this example code, the callback will get called, but the error will be silenced:
process.on('codeGenerationFromString', (code) => {
console.log('Hello from callback!')
throw new Error('boom!!!') // error is silenced
})
eval('1+1')
However, this code will not silence the error:
process.on('codeGenerationFromString', (code) => {
console.log('Hello from callback!')
throw new Error('boom!!!') // error is not silenced
})
console.log('result of eval:', eval('1+1'))
I found that queueing a microtask after the call to eval is also sufficient:
process.on('codeGenerationFromString', (code) => {
console.log('Hello from callback!')
throw new Error('boom!!!') // error is not silenced
})
eval('1+1')
queueMicrotask(() => {})
However, process.nextTick(() => {}) is not.
On top of this, if the listener is removed before the microtask is queued, the error is once again swallowed:
process.on('codeGenerationFromString', (code) => {
console.log('Hello from callback!')
throw new Error('boom!!!') // error is silenced
})
eval('1+1')
process.removeAllListeners('codeGenerationFromString')
queueMicrotask(() => {})
My initial guess was that the error was handled asynchronously and hence if the program ended or the listener was removed before Node.js had a chance to handle the error then it would be lost. However, if I swap the removeAllListeners and the queueMicrotask calls in the last code-example above, the error is actually retained, which I don't think it should according to that theory. So the very act of queueing the microtask - not actually waiting for it to be executed - seems to be enough for it to retain the error before the listener is removed.
New discovery: If I add a line of invalid code after removing the listeners, I don't get any error complaining about that line either. So it looks like it doesn't get to that line at all:
process.on('codeGenerationFromString', (code) => {
console.log('Hello from callback!')
throw new Error('boom!!!') // error is silenced
})
eval('1+1')
process.removeAllListeners('codeGenerationFromString')
bad_code
I tried to compile Node.js with debug mode enabled (./configure --debug && make -j8) and ran it via lldb with breakpoints set to _exit and exit. And it looked like the program just ended normally like a regular program (which also explains the zero exit code).
Ok, with a lot of help from @addaleax I finally understand why the error was "silenced":
If V8 doesn't expect a callback given to it to call into JS-land, any JS exception thrown will not be properly handled. So an explicit try-catch has to be added to the C++ code that's calling into JS to ensure that the exception is handled. We have the same problem here:
Lines 153 to 159 in fa7cdd6
  // V8 does not expect this callback to have a scheduled exceptions once it
  // returns, so we print them out in a best effort to do something about it
  // without failing silently and without crashing the process.
  if (try_catch.HasCaught() && !try_catch.HasTerminated()) {
    fprintf(stderr, "Exception in PromiseRejectCallback:\n");
    PrintCaughtException(isolate, env->context(), try_catch);
  }
I've updated this PR accordingly.
However, this unfortunately means that Node.js will not exit with an uncaught exception and so this isn't easily tested in our tests (except if we check the output to stderr I guess). And it also can be quite confusing for users if they expect Node.js to crash.
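Checking stderr from a test, as suggested above, could look roughly like this. The child script is a stand-in that prints a caught exception and exits normally; it does not use the proposed event, and the message text is an assumption for illustration.

```javascript
const { spawnSync } = require('node:child_process');

// Stand-in child: an exception is caught and printed to stderr, and the
// process still exits with code 0, mirroring the behavior described above.
const childScript = `
  try {
    throw new Error('boom!!!');
  } catch (err) {
    console.error('Exception in callback:', err.message);
  }
`;

// Run the child with the same Node.js binary and capture its output as text.
const child = spawnSync(process.execPath, ['-e', childScript], {
  encoding: 'utf8',
});

// A test can then assert on the exit code and on the stderr contents.
const exitedNormally = child.status === 0;
const reportedOnStderr = child.stderr.includes('boom!!!');
```

Asserting on both values covers the two observable effects: the process does not crash, and the exception is still reported somewhere.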
Would process.on('uncaughtException') still fire?
Thanks a lot for digging into this bug. This is an amazing finding!
@vdeturckheim No it will not. The exception is now technically handled
Should dynamic import of |
@guybedford maybe 🤔 tbh, this mostly mimics the hook in V8 right now. In theory, we could try to align on trusted types sinks, but it depends on what V8 exposes. It is already a good first step in its current form imho. |
I think v8 should be updated to handle these exceptions rather than the current hack. Looking at the code I don't think it would be very complex and we should be able to backport it. |
@devsnek that would be preferable sure (though we already use this hack in other parts of core - so I'm not sure how important it is??). Do you know what the turnaround time for this sort of thing to be implemented, released and backported in V8 would normally be? |
not sure how the thanksgiving holiday changes things but for a small change usually a day or two to get it into v8 and then you just have to backport it into node. |
@devsnek haha that's great. I was just hoping we were not talking several months |
The 'codeGenerationFromString' event is emitted when a call is made to `eval` or `new Function`. Co-authored-by: Thomas Watson <w@tson.dk>
@devsnek what's the process for getting that work started in V8 and do you know if there's anything I can do to help? |
Small discovery which might just be related to crossing the JS/C++ boundary: If I want to get the filename, line and column number from where
process.on('codeGenerationFromString', (code) => {
// print the frame in the stack trace where `eval` was called
console.log('Detected code generation from string', new Error().stack.split('\n')[3].trim())
})
eval('1 + 1') // should be line 6, col 1 - this is also what is printed
someFn(eval('1 + 1')) // should be line 7, col 8 - is actually line 7, col 1
someFn(
eval('1 + 1') // should be line 9, col 3 - is actually line 8, col 1
)
someFn(42, eval('1 + 1')) // should be line 11, col 12 - is actually line 11, col 1
someFn(
42,
eval('1 + 1') // should be line 14, col 3 - is actually line 12, col 1
)
Am I correct in my assumption that this is just an unfortunate side effect of crossing the JS/C++ boundary and that nothing can be done about it? For reference, this is what the complete stack trace looks like for the first
Update: I found that this is only an issue for |
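The stack-inspection technique used in the comment above can be tried without the proposed event. This is a sketch with hypothetical helper names (`callerFrame`, `parseLocation`); it relies on V8's default `Error.prototype.stack` text format, which is not a stable API, and the frame index depends on how many functions sit between the helper and the frame of interest.

```javascript
// Return the stack frame `depth` levels above the function that calls this helper.
function callerFrame(depth = 0) {
  const lines = new Error().stack.split('\n');
  // lines[0] is the "Error" header, lines[1] is callerFrame itself,
  // lines[2] is the function that called callerFrame.
  const frame = lines[2 + depth];
  return frame ? frame.trim() : null;
}

// Extract the trailing "line:column" pair from a V8-style frame string.
function parseLocation(frame) {
  const match = /:(\d+):(\d+)\)?$/.exec(frame);
  return match ? { line: Number(match[1]), column: Number(match[2]) } : null;
}

function handler() {
  return callerFrame(1); // the frame that called handler()
}

function caller() {
  return handler();
}

const frame = caller();
const location = parseLocation(frame);
console.log('Called from:', frame, location);
```

In the thread's example the listener uses index 3 because extra frames sit between the listener and the `eval` call site; the right index has to be found empirically for each setup.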
if you want to wait for someone from v8 to do this it probably will be several months. it should be a pretty simple change if you want to do it though, you just basically need to go around adding |
@vdeturckheim the issue is just that, for a security property, ensuring the property is exhaustive is important, and it could be possible to write a custom interception that shares the same hook, as they are the same thing from a user perspective. |
Can I step this back just a bit further and ask why we'd want to do this? What are the use cases here outside of logging or preventing eval by throwing in the handler? |
@jasnell in our case we want to audit code generation from strings as simply using |
Ok. Do you actually need access to code string being evaluated for that? Or just a stack to get the calling frame? |
@guybedford That's what I meant by mentioning trusted types. There are multiple sinks for code generation from strings and the API should/could cover them all, so +1 :D @jasnell in my use case, I need to know the code to identify its origin, for instance, this is used to detect RCEs by comparing the argument in the evil |
@jasnell In my case all I need is the calling stack frame |
Co-authored-by: James M Snell <jasnell@gmail.com>
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
This is an alternative to #34863 following a suggestion from @addaleax.
As for #34863, the goal is to provide a way to listen to unsafe code generation from strings as it is not possible to monkeypatch eval.
The event will only be emitted if there is at least one listener on it, and removing all listeners on this event will result in the handler in V8 never being called. In other words, if there is no listener on this event, there should be no performance impact on calling eval or the Function constructor.
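The "only active while there is a listener" behavior described above can be approximated in plain JavaScript with the `'newListener'` and `'removeListener'` events of `EventEmitter`. This is a sketch, not the PR's implementation: `hookInstalled` stands in for the V8-side callback being registered, and `simulateCodeGeneration` is a hypothetical stand-in for `eval` or the `Function` constructor reaching the hook.

```javascript
const { EventEmitter } = require('node:events');

const EVENT = 'codeGenerationFromString';
const emitter = new EventEmitter();
let hookInstalled = false; // stand-in for the native callback being registered

// 'newListener' fires before a listener is added: enable the hook on demand.
emitter.on('newListener', (name) => {
  if (name === EVENT) hookInstalled = true;
});

// 'removeListener' fires after a listener is removed: disable the hook once
// the last listener for the event is gone.
emitter.on('removeListener', (name) => {
  if (name === EVENT && emitter.listenerCount(EVENT) === 0) {
    hookInstalled = false;
  }
});

// Hypothetical stand-in for the runtime reporting code generation.
function simulateCodeGeneration(code) {
  if (!hookInstalled) return false; // no listener, so no work is done
  emitter.emit(EVENT, code);
  return true;
}

const received = [];
const listener = (code) => received.push(code);

const beforeListener = simulateCodeGeneration('1 + 1');
emitter.on(EVENT, listener);
const withListener = simulateCodeGeneration('2 + 2');
emitter.off(EVENT, listener);
const afterRemoval = simulateCodeGeneration('3 + 3');
```

The early return in `simulateCodeGeneration` corresponds to the claim that code generation pays no cost when nobody is listening.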