cross-origin iframe load events vs window.postMessage #4730

bzbarsky · 2019-06-24T22:39:59Z

Consider this testcase:

<script>
  onmessage = function(e) {
    console.log("MESSAGE: " + e.data);
  }
</script>
<iframe src="some-url"
        onload="console.log('LOADED')"></iframe>

where the some-url file looks like this:

<script>
  onload = function() {
    parent.postMessage("hello", "*");
  }
</script>

What should happen? What I observe in browsers, including Chrome in site-per-process mode with the URLs on different sites, is that "LOADED" is logged before the message event listener fires in the parent document. But per spec, what should happen?

1. We land in https://html.spec.whatwg.org/multipage/parsing.html#the-end for the child document.
2. Step 7 queues a task to fire the load event on that child document.
3. Step 12 queues a task to mark the child document as completely loaded.
4. Task from (2) runs, calls postMessage, queues a task for the message event.
5. Task from (3) runs, because of https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:iframe-load-event-steps-2 calls into https://html.spec.whatwg.org/multipage/iframe-embed-object.html#iframe-load-event-steps and fires the load event on the iframe element.
6. The message event fires.
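
As a rough illustration, the sequence above can be modelled with a single FIFO task queue. This is only a toy model of one event loop (the names `tasks`, `queueTask`, and `log` are made up for the sketch), not how any browser implements it:

```javascript
// Toy model: one event loop with a FIFO task queue, run to completion.
const log = [];
const tasks = [];
const queueTask = (fn) => tasks.push(fn);

// Task queued by step 7 of "the end": fire load on the child document.
queueTask(() => {
  // The child's onload handler calls parent.postMessage("hello", "*"),
  // which queues a task to fire the message event in the parent.
  queueTask(() => log.push("MESSAGE: hello"));
});

// Task queued by step 12: mark the child document as completely loaded.
queueTask(() => {
  // The iframe load event steps run inside this same task.
  log.push("LOADED");
});

// Drain the queue in order.
while (tasks.length > 0) tasks.shift()();

console.log(log.join("\n"));
```

In this model "LOADED" is logged before "MESSAGE: hello", because the message task is queued behind the already queued "completely loaded" task.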

That matches the browser behavior, but step 5 is a process-crossing mess in site-per-process mode, no? In particular, the task from (3) runs in the child's process, but then the load event needs to fire in the parent process. If an async message to do that is sent at that point, it will lose the race to the message event that's already been queued up.
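
The race can be sketched with two task queues joined by one FIFO pipe. This assumes, purely for illustration, that every cross-process effect is sent as an IPC message at the moment the child task requests it; `ipcPipe`, `childTasks`, and `parentTasks` are hypothetical names:

```javascript
// Toy model: two processes, each with a FIFO task queue, plus one FIFO pipe.
const parentLog = [];
const childTasks = [];
const parentTasks = [];
const ipcPipe = []; // messages travelling child -> parent, in send order

// Child task: fire load on the child document.
childTasks.push(() => {
  // onload calls parent.postMessage(); the IPC message is sent right away.
  ipcPipe.push(() => parentLog.push("MESSAGE: hello"));
});

// Child task: mark the document as completely loaded.
childTasks.push(() => {
  // Firing load on the iframe element must happen in the parent process,
  // so an async IPC message is sent at this point (the naive approach).
  ipcPipe.push(() => parentLog.push("LOADED"));
});

// Run the child process to completion.
while (childTasks.length > 0) childTasks.shift()();
// Each IPC message becomes a task in the parent, in arrival order.
while (ipcPipe.length > 0) parentTasks.push(ipcPipe.shift());
while (parentTasks.length > 0) parentTasks.shift()();

console.log(parentLog.join("\n"));
```

Under that assumption the parent logs "MESSAGE: hello" before "LOADED", the reverse of the single-process order.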

So how does Chrome manage its behavior here in this case? Is it explicitly queueing multiple tasks, on different processes, at https://html.spec.whatwg.org/multipage/parsing.html#the-end step 12 instead of queuing a task in one process that will then send a message to another process?

@annevk @mystor @smaug---- @csreis @rniwa @zetafunction


csreis · 2019-06-24T23:38:07Z

I haven't had time to look in depth, but I think this is because Chrome posts a task in the sender's renderer process, which then sends the IPC to the browser process to be delivered to the correct renderer process. This was to preserve ordering when a postMessage was sent and then something like focus() was supposed to happen before the message arrived, per this change from alexmos:
https://chromium-review.googlesource.com/c/chromium/src/+/1012472/
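
A toy model of the behavior described here, assuming the only difference from the naive approach is that postMessage defers its IPC to a later task in the sender's process. It is not based on the Chromium source, and `senderTasks`, `receiverTasks`, and `pipe` are invented names:

```javascript
// Toy model: postMessage posts a local task first; that task sends the IPC.
const received = [];
const senderTasks = [];
const receiverTasks = [];
const pipe = []; // IPC messages in send order

// Sender task: fire load on the child document.
senderTasks.push(() => {
  // parent.postMessage(): do not send the IPC now, post a local task instead.
  senderTasks.push(() => {
    pipe.push(() => received.push("MESSAGE: hello"));
  });
});

// Sender task: mark the document as completely loaded.
senderTasks.push(() => {
  // The iframe load notification is sent immediately from this task.
  pipe.push(() => received.push("LOADED"));
});

// Run the sender to completion, then deliver the IPC messages in order.
while (senderTasks.length > 0) senderTasks.shift()();
while (pipe.length > 0) receiverTasks.push(pipe.shift());
while (receiverTasks.length > 0) receiverTasks.shift()();

console.log(received.join("\n"));
```

Because the deferred task runs after the "completely loaded" task, the load notification enters the pipe first and "LOADED" is logged before "MESSAGE: hello" in this model.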

bzbarsky · 2019-06-25T00:01:08Z

@csreis interesting! That commit definitely talks about this case, but also looks like it changed the behavior described in #3506 (comment) -- now Chrome is claiming true even with site-per-process there.

Did Chrome also change the behavior of MessageChannel postMessage, or just Window postMessage? See https://bugzilla.mozilla.org/show_bug.cgi?id=1440754#c1

It would be really good to get this actually specced in a sane way instead of UAs coming up with different probably-incompatible workarounds...

csreis · 2019-06-26T00:13:00Z

Sorry, I don't know whether that change affected MessageChannel, and alexmos is OOO at the moment. @zetafunction, do you know? I imagine we would want to update MessageChannel if not.

I agree that it would be useful to find a consistent way to define how this works.

smaug---- · 2019-06-26T09:33:38Z

FWIW, MessagePort.postMessage and window.postMessage do not use the same task sources, so at least in Gecko they are handled differently.

rniwa · 2019-06-27T05:46:00Z

@cdumez : what is/was our plan here??

annevk · 2019-06-28T14:14:24Z

Can we do the same thing as I proposed in the other issue? Keep a queue of things that need to go across the boundary. Move them across the boundary end-of-task, and then process the queue on the other side when it arrives?
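
A minimal sketch of such a queue, assuming one batch is moved across the boundary at the end of each task. `BoundaryQueue`, `runTask`, and the operation names are hypothetical and only illustrate the batching:

```javascript
// Buffers operations that must cross the boundary during a task and
// hands them over as one batch when the task ends.
class BoundaryQueue {
  constructor(deliver) {
    this.pending = [];
    this.deliver = deliver; // called with each non-empty batch
  }
  enqueue(op) {
    this.pending.push(op);
  }
  flush() {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.deliver(batch);
  }
}

const arrivals = []; // batches as seen by the other side, in arrival order
const outbox = new BoundaryQueue((batch) => arrivals.push(batch));

// Run one task, then move everything it enqueued across the boundary.
function runTask(task) {
  try {
    task();
  } finally {
    outbox.flush();
  }
}

runTask(() => {
  outbox.enqueue("postMessage:hello");
  outbox.enqueue("focus");
});
runTask(() => outbox.enqueue("iframe-load"));
runTask(() => {}); // nothing crosses the boundary, so no batch is sent

// The receiving side processes batches in order once they arrive.
const processed = arrivals.flat();
console.log(JSON.stringify(arrivals));
```

Operations keep their relative order both within a task and across tasks, which is the property the proposal relies on.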

gterzian · 2019-07-02T16:32:18Z

> Keep a queue of things that need to go across the boundary. Move them across the boundary end-of-task, and then process the queue on the other side when it arrives?

You could do that, but it would require "enqueuing" the cross-process operation of firing the load event on the iframe (in the parent) at step 12 of the-end (for the child). If you wait until the task that marks the child document as fully loaded, the postMessage call made in the task firing the load event on the child document will already have "enqueued" its own cross-process operation first.

Reading https://chromium-review.googlesource.com/c/chromium/src/+/1012472/, it looks like that problem was solved by having the postMessage call queue a task on the "local" event loop, effectively spinning it, and only sending the IPC message from that subsequent task.

I think the problem can be described more generally: you can't fire events in another process "sync" from within a task. You need to enqueue some sort of IPC message (and probably enqueue a task on an event loop to do the actual DOM manipulation, upon receipt of that IPC message in some sort of router thread separate from the event loop). That effectively adds an event-loop tick to the operation. Steps that are currently specced as happening "in the same task" end up gaining a tick due to the IPC (and the task queuing upon receipt in the other process), and so lose their ordering advantage over operations like postMessage, where the event is fired from a subsequently queued task.

Example: iframe.contentWindow.focus() should run the "window focus steps" immediately, and fire the "focus" event in the same task. If the iframe is in another process, you have to send an IPC message and queue a task on the event loop in the other process.

Or, marking the document of an iframe as fully loaded should fire the "load" event on the containing iframe element, in the same task that marks the document as fully loaded. If the child document is running in a different process however, you need to communicate with the process of the parent, and effectively queue a task to fire the "load" event on the iframe element.

So both operations, in a cross-process scenario, find themselves losing their ordering advantage over an operation like postMessage, which even in the same-process scenario fires the event in a subsequent task, not in the task where the call to postMessage occurs.

And in terms of implementations, I assume that receiving an IPC message is not done "on the event loop" directly. Rather, it happens on an in-process IPC router thread, which then has to queue a task on an event loop in response, since running "on the event loop" steps on such a router thread would break the event loop's processing model by executing tasks in parallel. In other words, I think IPC communication implies queuing a task on an event loop in the receiving process (hence the loss of ordering, since an additional task is introduced on the receiving end).

I think the problem could perhaps be solved by reusing the parallel queue concept: event loop A would enqueue steps that consist of enqueuing a task on event loop B, with that task performing the operation that would normally be performed "sync" in a task on event loop A (but that now requires a cross-process operation running on event loop B).

(What I like about a parallel queue is that while it could involve crossing process boundaries, it doesn't have to, and it could cross more than one boundary, leaving UAs plenty of flexibility with regard to the level of process isolation. Also, enqueuing steps doesn't mean they need to be handled immediately, just eventually and in order, leaving UAs plenty of room to "wait" until layout finishes and whatnot, which seems to be a consideration discussed in https://chromium-review.googlesource.com/c/chromium/src/+/1012472/)

Roughly something like:

Let task be Steps {} of the currently running task, and let "this" window be null. Abort the currently running task.
Let remoteWindowProxyId be the identifier for the WindowProxy corresponding to the cross-origin window.
Enqueue the following steps to the cross-event-loop-queue (a unique parallel queue):
1. Let browsing context be null.
2. If there is a browsing context corresponding to remoteWindowProxyId, set browsing context to it.
3. Let "this" window be the Window corresponding to the currently active document of browsing context.
4. Enqueue task, using the "cross-event-loop-task-source", to the window event loop of the similar-origin window agent where "this" window is currently to be found.

So the idea is that you take certain steps from the currently running algorithm, stop the currently running task, and send an IPC message containing enough info to continue those steps on a different event loop. That IPC message is received either directly in the process where the cross-site window is running or via some central broker that re-routes it, but likely on a different thread, some sort of IPC router thread. From that thread you enqueue a task on the relevant event loop (which by now is in the same process), and that task consists of running the steps from the original task that you suspended in the other process.

So, for example, in the case of a cross-origin window.postMessage, you'd let task be the window-post-message-steps, and you would enqueue steps on the IPC queue that set "this" window to the "right" one, enqueue task on the relevant event loop, and run task (effectively the "window-post-message-steps") in the context of "this" window.

And you'd retain ordering, since the task running the "window-post-message-steps" would itself queue a task to fire the message event, whereas a task running something like the "window focusing steps" would not. And all cross-process operations would consistently incur one additional queued task on the receiving end.
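
A rough sketch of that ordering argument, applied to the testcase from the first comment. It models the cross-event-loop queue as a plain array processed in order; `crossLoopQueue`, `loopB`, and the two step functions are illustrative names, not spec or implementation terms:

```javascript
// Toy model of the proposal: steps go onto a parallel queue, and each set
// of steps becomes exactly one task on the target event loop (loop B).
const events = [];
const loopB = [];          // task queue of the parent's event loop
const crossLoopQueue = []; // steps waiting to be routed to loop B, in order

const enqueueCrossLoopSteps = (steps) => crossLoopQueue.push(steps);

// Window post message steps: they queue a further task to fire the event.
const windowPostMessageSteps = (data) => () => {
  loopB.push(() => events.push("MESSAGE: " + data));
};
// Iframe load event steps: the event fires within the task itself.
const iframeLoadEventSteps = () => events.push("LOADED");

// Loop A (child), first task: onload handler calls parent.postMessage().
enqueueCrossLoopSteps(windowPostMessageSteps("hello"));
// Loop A (child), second task: document marked as completely loaded.
enqueueCrossLoopSteps(iframeLoadEventSteps);

// Router: turn each set of steps into one task on loop B, preserving order.
while (crossLoopQueue.length > 0) loopB.push(crossLoopQueue.shift());
// Loop B runs its tasks to completion.
while (loopB.length > 0) loopB.shift()();

console.log(events.join("\n"));
```

Both operations pay one extra task on the receiving side, but only the post message steps queue a second one, so "LOADED" still precedes "MESSAGE: hello" in this model.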

Also, given that the iframe load in progress flag should be set on the child document when it's marked as fully loaded, and unset once the load event has fired, it looks like that part would require sending an IPC message back to the child's process in order to unset the flag.

fred-wang mentioned this issue Jun 28, 2019

Algorithms triggered by user activation and window.postMessage / MessageChannel / BroadcastChannel #4741

Closed

gterzian mentioned this issue Nov 12, 2019

MessagePort underspecified? #5078

Open
