Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify "report an exception" #958

Closed
Ms2ger opened this issue Mar 29, 2016 · 32 comments
Closed

Clarify "report an exception" #958

Ms2ger opened this issue Mar 29, 2016 · 32 comments

Comments

@Ms2ger
Copy link
Member

Ms2ger commented Mar 29, 2016

the user agent must report the error for the relevant script

What is "the relevant script"?


Edit by @domenic: the current plan to be executed in order to fix this bug is #958 (comment). Comments between here and there are by now somewhat misleading and historical

@Ms2ger
Copy link
Member Author

Ms2ger commented Mar 29, 2016

In particular in relation to the tests in https://bugzilla.mozilla.org/show_bug.cgi?id=1259784

@domenic
Copy link
Member

domenic commented May 11, 2016

I at first thought this was best solved by having "report an exception" explicitly specify which global the error goes to. However, I realize that doesn't really work, as we actually need the information on which script caused the exception, for filename (and line/column number) purposes.

I am pretty sure that the relevant script should generally be GetActiveScriptOrModule's [[HostDefined]] value (i.e. its corresponding HTML "script").

As discussed in the new #1189, there are scenarios where this returns null. Maybe this means that #1189 needs to expand to have a "backup incumbent script stack", not a "backup incumbent settings object stack".

/cc @bzbarsky.

@domenic
Copy link
Member

domenic commented May 11, 2016

Dang, nope, that makes no sense, because of cases like the user clicking on a button with throwing error handlers. In that case there's no GetActiveScriptOrModule() and no backup possible while inside the "dispatch an event" algorithm. Nevermind...

@bzbarsky
Copy link
Contributor

The way this was meant to work, which everyone agreed on back when this was last discussed and when Hixie wrote the spec for this, before the later changes to it was like so:

  1. When invoking callbacks the entry settings represent the thing being invoked and the incumbent settings represent the entity that added the callback.
  2. Error reporting in the simple window1.setTimeout(functionFromWindow2) case should happen on window2, not window1. Conveniently, that corresponds to the entry settings object for the call in that case.

At the time, it seemed to me that the "invoking callbacks" section in HTML made this all fairly clear. It looks like that's all been removed since then, so now nothing specifies things... It doesn't help (but is far for the course) that some UAs agreed on all the above at the time and then never actually changed their behavior to align with the resulting spec.

@domenic
Copy link
Member

domenic commented May 12, 2016

So, looking through the old https://html.spec.whatwg.org/commit-snapshots/5fa33f072011f29ed56adb0f2f63bb4404c92aae#calling-scripts, I don't see how this was made clear. Error reporting has always been ambiguous and said

When the user agent is to report an exception E, the user agent must report the error for the relevant script, with the problematic position (line number and column number) in the resource containing the script, using the global object specified by the script's settings object as the target.

As I said over in whatwg/webidl#113 (comment) (sorry for splitting the discussion), I see two options for specifying this. The one not based on entry settings objects, but instead based on the settings object of the script that declared the function that threw an exception, seems simpler to me, since we need that script for its filename/line number/column number anyway.

@bzbarsky
Copy link
Contributor

The function that threw an exception may not have been declared in a script at all. It can be a built-in function.

@bzbarsky
Copy link
Contributor

And I guess it's possible that after the various discussion about this stuff things never made it back into the spec after all. :( I distinctly recall there being language proposed that made all this much clearer than what you link to. :(

@domenic
Copy link
Member

domenic commented May 12, 2016

For posterity:

@bzbarsky and I worked out a plan for fixing this in whatwg/webidl#113 (comment) (steps 2-4) plus @bzbarsky's subsequent reply (on which I agree with all points).

I've explained where this fits in my priority queue of "fix script execution" at the bottom of whatwg/webidl#113 (comment).

@domenic
Copy link
Member

domenic commented Jun 15, 2016

The plan:

Fix "report the exception"/"report an error"

Happening in #10404.

The goal is to consolidate both of the existing algorithms "report the exception" and "report an error" into a single new algorithm, and update all call sites.

  • The new algorithm's name will be "report an exception". (This is intentionally slightly different from both old algorithms. We will break spec cross-references in this way, to help move the ecosystem, but we can preserve the IDs so that actual links are not broken.)
  • Its arguments will be an exception value, a global object, and an optional script
  • We can get rid of the "handled" and "not handled" concepts, and instead include the line about reporting the error to the developer console directly inside the algorithm.

The contents of this algorithm will be based on "report an error", with the following notable differences:

  • message, line, col, and filename are derived from the exception value, in an unspecified way. This allows the better results that in practice browsers exhibit for e.g. eval code. In the future, this may become specified in detail when the JavaScript spec specifies error stack traces in detail.

  • The optional script will be used to determine muting (and not to determine the filename). If it is not supplied, assume the error is not muted.

  • As noted above, incorporate the handled/not handled action directly into the algorithm.

Then, update all call sites of these two algorithms. Only do this for HTML for now, and it would be done in the same PR that updates the algorithm.

Port these changes to rejected promises

Now tracked in #10515.

This might be best done as part of the previous PR. Essentially, get rid of the "handled" and "not handled" concepts, and incorporate the developer console part directly into the algorithm.

Change how web developer code invocation works in Web IDL

Now tracked in whatwg/webidl#1423.

We add a new optional named parameter rethrowExceptions (a boolean) to the following three algorithms in the Web IDL spec:

It defaults to false. When it is false, we introduce the following new behavior: any exceptions thrown by web developer code get caught and routed to HTML's new "report the exception". ("Web developer code" is roughly everything between the "prepare" and "cleanup" steps, including both argument conversion and the actual code invocation.)

The exact global and script to pass for these is unclear at this time. We can initially leave them as <span class="XXX">unclear which global</span>, <span class="XXX">unclear which script</span> and work on that in later steps.

Fix all call sites

Tracked as part of whatwg/webidl#1423 and #10516.

Now we need to audit all the call sites of the algorithms we've changed:

In most cases, the pattern is that another spec is using invoke (or friends), then catching exceptions and routing them to "report the exception". After the above changes, this can be simplified to just calling invoke; the exception will be reported automatically through the shared infrastructure.

If another spec is using invoke and not catching exceptions, this might be an unintentional bug. In that case, we can leave it as-is. Or it might be intentional that they want the exception to bubble out, or be caught with a different reaction than reporting the exception. In that case the spec will need to be updated to pass the new rethrowExceptions parameter.

If another spec is directly using "report the exception", it will need to be given the appropriate new arguments, and possibly update its return value handling.

Further work on nailing down the specifics

Tracked in #10514 for muting and to-be-filed for which global.

At this point we've got good infrastructure, centralized in HTML and Web IDL. But a few core specifics are still unclear:

  • Which global do we report exceptions to?
  • How does error muting work?

The way to resolve these involves writing web platform tests, and/or code inspection of browser codebases. There are likely interop problems, especially for muting as discussed downthread. A test suite to illustrate the problems is step 1; after that we can discuss the desired resolution.

Any work on this sub-task feels like a bonus to me. If we accomplish the mostly-editorial refactoring above, we can split this work into new issues.

@bzbarsky
Copy link
Contributor

The muted flag situation is hard: in reality the important thing is whether the script that threw the exception is muted, not the script that we're immediately invoking.

So if we load a cross-origin (muted) script with a function named f that throws, then have function g() { f() }, directly in our page, and pass g to a callback, then the exception ought to be muted, I would think. Not sure what UAs do here; in practice I suspect their muted exception stuff doesn't really match the spec's very well. For example, Gecko stores the muted state directly on the exception (and only for some types of exception values, afaict).

I'd be quite interested in what UAs do in practice with the muted error thing in various situations...

@domenic
Copy link
Member

domenic commented Jul 19, 2018

@TimothyGu pointed out that the confusion here, especially about which global to use, also applies to unhandledrejection.

@bzbarsky
Copy link
Contributor

bzbarsky commented Feb 4, 2020

in reality the important thing is whether the script that threw the exception is muted

To be clear, this is only important in the spec's current conception of muting. I in practice, for the sort of cases where you call a function that calls another function that throws, muting is pointless because you can just catch the exception and get all the info out of it.

One case where muting is really important, and the reason it was added, is during initial script execution. At that point the only thing on the stack is the cross-site script (assuming you're not just reporting a SyntaxError in it), and the page should not be able to extract information via "error" listeners that it can't get otherwise from the cross-site file.

Arguably muting is also relevant when browser code directly invokes callbacks from a cross-site script, for similar reasons...

Anyway, it would be good if the spec actually clearly defined muting, ideally in a way that does not involve dynamic introspection of the "scripted stuff" stack.... Defining it in terms of the way script is being entered, for example, would address the actual use cases without requiring that sort of introspection.

@pshaughn
Copy link
Contributor

It seems like it might be more consistent to only mute errors from compiling the script, and not from running it.

@bzbarsky
Copy link
Contributor

Consider a "script" that looks like this:

Doe,John,"Munitions expert"
Doe,Jane,"Demolition expert"

That will compile fine, thanks to the wonders of ASI and the comma operator, but leak "Doe" in the ReferenceError that then gets thrown. So you need to deal with muting inital-run errors to avoid this very common attack on CSV files.

@pshaughn
Copy link
Contributor

Thank you for the example.

@annevk
Copy link
Member

annevk commented Mar 28, 2020

One thing we should clarify about filename is that it's from before redirects. Additionally, if it's from before redirects we wouldn't need to mute it.

@evilpie
Copy link

evilpie commented Apr 7, 2020

One thing we should clarify about filename is that it's from before redirects. Additionally, if it's from before redirects we wouldn't need to mute it.

This ties into the current behavior in Firefox that I think we should specify. Firefox only exposes the filename/URL before any redirect. This means it should be okay to expose that URL even for cross-origin scripts. (You don't get any additional information, you can already tell which specifc script failed to load and you can obviously read .src)

This testcase shows that Firefox always uses the initial (pre-redirect) URL and doesn't censor it in script error events for cross-origin scripts. Chrome censors the filename and uses the final (post-redirect) URL.

@jeremyroman
Copy link
Collaborator

I've made a little headway here (spec code health rotation) according to @domenic's plan above. Mentioning this only to avoid duplicate work (not near ready).

jeremyroman added a commit to jeremyroman/html that referenced this issue Jun 7, 2024
This algorithm directly includes the error propagation and fallback
behavior, and requires callers to supply the global scope to be used,
rather than magically inferring it.

Call sites within HTML are replaced, but there is more work yet to be
done.
@domenic
Copy link
Member

domenic commented Jun 21, 2024

Here is an idea on how to reduce the hand-waving for line number, column number, URL, and script. And maybe muting, if muting is a property of script.

  • The JavaScript spec gets a host hook that is called whenever a throw completion is created. The host hook can attach data to the throw completion.
    • Alternate strategy: if attaching data to completion records or making room inside them for a [[HostDefined]] field is icky, we can store the data in a side table.
  • We can use that hook to store the appropriate script, line number, column number, and URL in the completion record. The appropriate script can probably be derived via GetActiveScriptOrModule(), and from it, the URL. The line and column number might need hand-waving. (Or maybe we can try to figure out the current Parse Node using some JavaScript spec machinery magic?)
    • Alternate strategy: no host hook. The JS spec just directly stores the ScriptOrModule + Parse Node on all throw completions.
  • We change some of the relevant infrastructure (most of which is in HTML, I think) to thread the whole abrupt completion to "report an exception", instead of just the exception value.

I think this is less of a priority than figuring out an interoperable story for muting that everyone agrees is ideal, though. It might be a useful technique for specifying that interoperable story, once we got there.

So to recap the current plan:

  • Fix the mess of the spec having two algorithms, the Web IDL integration, etc. @jeremyroman is making great progress. Keep hand-waving on URL, line number, column number. And, per my latest proposal in Create a 'report an exception' algorithm per #958 #10404 (review), start hand-waving on script and muting.

  • Write a bunch of WPT test cases for muting, as well as for URLs (filename). E.g. pre- or post-redirect.

  • Agree on what we want those test case results to be, then commit them as .tentative WPTs.

  • Figure out a spec for muting + URLs, and maybe lineno + colno, that matches those test cases. Maybe using something like what I describe above.

@annevk
Copy link
Member

annevk commented Jul 2, 2024

@ljharb @codehag @littledan @Ms2ger @syg thoughts on #958 (comment)? Would be nice to have exception handling detailed a bit better.

@ljharb
Copy link
Contributor

ljharb commented Jul 2, 2024

@annevk the proposed hook seems plausible to me as far as providing extra data, but any behavior hosts alter about error stack traces risks making https://github.com/tc39/proposal-error-stacks even more difficult to advance, and should be done very carefully.

@codehag
Copy link

codehag commented Jul 4, 2024

I haven't worked on this for a while so, please excuse any dumb thoughts here. cc @smaug---- because you may have some insights here.

  1. It sort of sounds like incumbent realm might be involved here, especially if the error comes from an on-click handler. In which case the target global is probably going to be the incumbent? In which case are async contexts going to be potentially useful for this?

  2. This is wobbly and I can be convinced otherwise but: "Alternate strategy: no host hook. The JS spec just directly stores the ScriptOrModule + Parse Node on all throw completions." -- feels weird because a) it affects all throw completions and b) this will make throw completions have additional fields compared to other completions, whereas now I think they are all the same per this table? The host hook is more annoying but I would worry about wide reaching effects and that is what I would want to check but don't have time for right now.

overall agree that we could spec this better. Would like to hear from @littledan and @syg and since I am probably not the best person to do a deep dive right now, cc @dminor for that potentially if he has time.

@Ms2ger
Copy link
Member Author

Ms2ger commented Jul 4, 2024

I appreciate the work being done here, but have no time to look in detail at the proposed changes, unfortunately.

@domenic
Copy link
Member

domenic commented Jul 25, 2024

Since we're coming close to completely a first milestone in the plan here, I'm starting to want to split this into multiple issues. I've created a new topic: error reporting label and broken out:

@domenic
Copy link
Member

domenic commented Jul 29, 2024

@jeremyroman, I'm happy to merge this once you file the two issues for the cases where the global interop is not clear.

Draft commit message:

Overhaul exception reporting algorithms

This carries out the first portion of the plan in https://github.com/whatwg/html/issues/958#issuecomment-226309871, by creating a new "report an exception" algorithm that replaces the previous "report an error" and "report the exception".

The new algorithm directly includes the error propagation and fallback behavior, and requires callers to supply the global scope to be used, rather than magically inferring it. It no longer takes a script, since in most cases the script was not rigorously determined. Follow-up work on determining the correct muting behavior (which the script argument was used for) is tracked in #10514.

All call sites within HTML are updated. #10516 tracks updating call sites in other specifications. In most cases the global used for HTML is now specified rigorously and matching implementations. TODO and TODO track the remaining cases with interop issues.

Closes #958, with the rest of the plan there tracked via the issues mentioned above, https://github.com/whatwg/webidl/issues/1423, and the https://github.com/whatwg/html/labels/topic%3A%20error%20reporting label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

9 participants