Add runtime.getContexts() proposal #358

Merged
merged 6 commits into from
Mar 29, 2023
Changes from 3 commits
233 changes: 233 additions & 0 deletions proposals/runtime_get_contexts.md
@@ -0,0 +1,233 @@
# New API: runtime.getContexts()

## Background / Summary

Chromium currently has the
[`extension.getViews()`](https://developer.chrome.com/docs/extensions/reference/extension/#method-getViews)
API method that allows an extension to get information about the "views" that
are active for the extension. A "view" in this context is any HTML frame in the
extension's process that commits to the extension origin; this may not include
every frame the extension owns, such as incognito-mode frames. The returned
values are a set of JavaScript `Window` objects, which the extension has
permission to reach directly into (as they are same-origin).

Some common reasons for calling this method are determining whether a toolbar
popup, tab, or options page is open, and directly interacting with those pages
by reaching into their `Window` objects.

Chromium's implementation of Manifest V2 (MV2) allows for
[background pages](https://developer.chrome.com/docs/extensions/mv2/background_pages/)
to call this API. This is possible because these pages are themselves an
extension frame (albeit one that isn't visibly rendered) and run on the main
thread of the renderer; this allows them easy access to the JavaScript
[`Window`](https://developer.mozilla.org/en-US/docs/Web/API/Window#instance_properties) objects provided by
`extension.getViews()`.

## Problem

With the migration to Manifest V3 (MV3), background pages
[no longer exist](https://developer.chrome.com/docs/extensions/mv3/migrating_to_service_workers/);
instead, an extension's background context is
[service worker](https://developer.chrome.com/docs/workbox/service-worker-overview/)-based.
For technical reasons<sup>[1](#footnotes)</sup>, service workers cannot access
the [`Window`](https://developer.mozilla.org/en-US/docs/Web/API/Window#instance_properties) objects that
`extension.getViews()` provides, and supporting that access is not feasible
with Chromium's current design. Because of this, we cannot provide access to
the JavaScript context for these views, but we can allow an extension to query
for them (determining if they exist) and target them for messaging purposes.

## Solution

Considering the above situation, we'd like to propose a new extension API
method, `runtime.getContexts()`, to asynchronously provide metadata about
associated contexts that is still useful for an extension. This will allow
extension background scripts to identify the active contexts in the extension.
For example, this can be used to target messages sent using
[`runtime.sendMessage()`](https://developer.chrome.com/docs/extensions/reference/runtime/#method-sendMessage).
Introducing this API will allow an easier migration from MV2.

This also obviates the need for introducing multiple one-off APIs, such as
separate APIs to query for the extension popup, offscreen documents, etc.
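
As a rough illustration of the "is the popup open?" use case, the sketch below queries for popup contexts through an injected object. The `ContextsApi` interface and the `makeFakeApi` helper are hypothetical and exist only so the example can run outside a browser; the context shape is a trimmed copy of the type proposed in this document, and nothing here should be read as a shipped API.

```typescript
// Shapes taken from the proposal text, trimmed to the fields used here.
type ContextType = 'TAB' | 'POPUP' | 'BACKGROUND' | 'OFFSCREEN_DOCUMENT';

interface ExtensionContext {
  contextType: ContextType;
  contextId: string;
  tabId: number; // -1 when the context is not hosted in a tab
  documentUrl?: string;
}

type ContextFilter = Partial<ExtensionContext>;

// Hypothetical seam (an assumption, not part of the proposal). In an
// extension, the object passed in would be the browser's `runtime` namespace.
interface ContextsApi {
  getContexts(filter?: ContextFilter): Promise<ExtensionContext[]>;
}

// Resolves to true if at least one toolbar popup context currently exists.
async function isPopupOpen(api: ContextsApi): Promise<boolean> {
  const popups = await api.getContexts({ contextType: 'POPUP' });
  return popups.length > 0;
}

// In-memory stand-in for the browser API, for illustration only.
// It only understands the `contextType` filter field.
function makeFakeApi(contexts: ExtensionContext[]): ContextsApi {
  return {
    async getContexts(filter: ContextFilter = {}): Promise<ExtensionContext[]> {
      return contexts.filter(
        (c) => filter.contextType === undefined || c.contextType === filter.contextType,
      );
    },
  };
}
```

Because the result is only a snapshot, a caller would still need to tolerate the popup closing before any follow-up message is delivered.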

## API Proposal

This method will return an array of matching contexts, represented by a new
`ExtensionContext` type. This will be defined as:

```
runtime.ExtensionContext = {
  // Context type -- tab, popup, etc.
  contextType: ContextType,
  // A unique identifier for this context.
  contextId: string,
  // ID of the tab for this context, or -1 if this context is not hosted in a
  // tab.
  tabId: int,
  // ID of the window for this context, or -1 if this context is not hosted in a
  // window.
  windowId: int,
  // ID of the DOM document for this context, or undefined if this context is
  // not associated with a document.
  documentId?: string,
  // ID of the frame for this context, or -1 if this context is not hosted in a
  // frame.
  frameId: int,
  // The current URL of the document, or undefined if this context is not
  // hosted in a document.
  documentUrl?: string,
Review comment (Member):

When used as a filter parameter, is it possible to use a match pattern for the value of ContextFilter.documentUrl?

Reply (Contributor Author):

Currently, no. I've thought about if we'd want to support a different filter object (so you could specify e.g. array of types, match patterns, etc), but I wasn't sure if it was worth it. It seems like you'd normally either a) be able to specify an explicit URL or b) identify the contexts in different ways (say, by context type).

If folks feel strongly that we should support a richer filter from the beginning, I'm amenable to introducing a new ContextFilter type. Otherwise, we could potentially do this in the future (by supporting choices for the appropriate fields). Lemme know what y'all think.

  // The current origin of the document, or undefined if this context is not
  // hosted in a document.
  documentOrigin?: string,
Review comment (Member), on lines +78 to +83:

Do documentUrl and documentOrigin refer to the document that the extension context is inside (e.g. top level main frame of a tab), the extension context itself, or something else? Depending on the answer, I may have other questions.

The "Sandboxed Pages" section notes that this method will not return sandboxed pages as they have "a separate origin ("null") and do not have access to extension APIs." I find this somewhat surprising as I had assumed that the documentOrigin property referred to the origin of the extension context and therefore this property would either match the extension's origin (i.e. the value returned by evaluating new URL(browser.runtime.getURL("")).origin) for normal extension pages or null for sandboxed pages.

Reply (Contributor Author):

> Do documentUrl and documentOrigin refer to the document that the extension context is inside (e.g. top level main frame of a tab), the extension context itself, or something else? Depending on the answer, I may have other questions.

They refer to the document the extension context is executing in, so it depends on the context:

  • Anything frame-based: correspond to the extension (e.g. chrome-extension:///popup.html)
  • Service workers: undefined
  • Content scripts: document the script is injected within.

I've added a new section on this in "Additional Considerations"

> The "Sandboxed Pages" section notes that this method will not return sandboxed pages as they have "a separate origin ("null") and do not have access to extension APIs." I find this somewhat surprising as I had assumed that the documentOrigin property referred to the origin of the extension context and therefore this property would either match the extension's origin (i.e. the value returned by evaluating new URL(browser.runtime.getURL("")).origin) for normal extension pages or null for sandboxed pages.

This was inline with Rob's suggestion on the old PR to not include sandboxed pages as contexts here, which I agree with (they are reasonably separate from the rest of the extension). We could always change that in the future by adding a new context type.

  // Whether the context is for an incognito profile.
  incognito: boolean,
}
```

`ContextType` will indicate the type of context retrieved. It is an enum
defined as:

```
runtime.ContextType = {
Review comment (Member):

Is this enum meant to be exhaustive? Can other browsers extend it? For example, a number of browsers support sidebars. Some extensions for Firefox and Vivaldi currently struggle with messaging with these contexts (since sidebars do not have a meaningful tabId and frameId for messaging). Developers can create persistent ports, but it is ugly.

Reply (Contributor Author):

As with all APIs discussed in this community group, it's non-binding, and browser vendors can deviate when it makes sense. If a browser supports a different type of context, they can add it in here, and if they don't support a given context, they can remove the enum entry.

FWIW, Chrome is also going to support side panels, so we will add a SIDE_PANEL context type when that happens. (I was tempted to include it here, but didn't want to confuse the proposal by having unshipped concepts).

Review comment (Member):

@rdcronin Could you list all types that you're thinking of for Chrome? If you'd like, prefaced by a comment // (Chrome-specific), so that we can see where we could converge towards a common type if desired.

E.g. I would expect at least devtools and devtools panel to have their distinct type here (as these do not fit in any of the other existing categories).

Reply (Contributor Author):

Thanks for the call-out! Devtools are interesting -- in Chromium, they commit to a different origin (even though from the extension's POV, they are just the extension page). They are also not currently included at all in extension.getViews(). I'd like to tackle these separately, but I've added an explicit section for them in future work.

I also added in a type for SIDE_PANEL (though it's still under development in Chrome)

  // Tabs the extension is running in.
  TAB: 'TAB',
  // Toolbar popups created by the extension.
  // TODO: Should this be `TOOLBAR_POPUP` to avoid ambiguity with web popup
  // windows? Or perhaps `ACTION_POPUP` to avoid tying it to a particular UI
  // surface?
  POPUP: 'POPUP',
  // The background context for the extension (in Chromium, the extension
  // service worker).
  BACKGROUND: 'BACKGROUND',
  // Offscreen documents for the extension.
  OFFSCREEN_DOCUMENT: 'OFFSCREEN_DOCUMENT',
};
```

This enum will be expanded in the future as more context types are added.

The method signature will be defined as:

```
runtime.getContexts(
  filter?: ContextFilter
): Promise<ExtensionContext[]>;
```

The `filter` argument will be used to filter down to a particular type of
context. It will share the same properties as `ExtensionContext`, but will
have all fields be optional. Any omitted field matches all available contexts.
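
The matching rule described above can be sketched as a pure function. This is only an illustration of the stated semantics (every provided field must match exactly, omitted fields match everything); `matchesFilter` is a hypothetical helper name and not part of the proposed API, and exact equality on `documentUrl` reflects the review discussion that match patterns are not currently supported.

```typescript
type ContextType = 'TAB' | 'POPUP' | 'BACKGROUND' | 'OFFSCREEN_DOCUMENT';

// Mirrors the fields listed in the proposal.
interface ExtensionContext {
  contextType: ContextType;
  contextId: string;
  tabId: number;
  windowId: number;
  documentId?: string;
  frameId: number;
  documentUrl?: string;
  documentOrigin?: string;
  incognito: boolean;
}

// Same properties as ExtensionContext, with every field optional.
type ContextFilter = Partial<ExtensionContext>;

// Hypothetical helper: true when every field present in `filter` equals the
// corresponding field of `context`. Omitted (undefined) fields match anything.
function matchesFilter(context: ExtensionContext, filter: ContextFilter = {}): boolean {
  const keys = Object.keys(filter) as (keyof ExtensionContext)[];
  return keys.every(
    (key) => filter[key] === undefined || filter[key] === context[key],
  );
}
```

Under this reading, an empty filter returns every context, and combining fields (for example `contextType` and `incognito`) narrows the result to contexts that satisfy all of them.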

### Additional Considerations

#### Context ID

Each extension context will have a unique context ID, represented by a string.
This is necessary to uniquely identify a context, since other fields may be
non-unique (such as URL) or absent (such as documentId).

Like `documentId`s, the extension `contextId` will update on (non-same-page)
navigation.

#### Incognito mode

Review comment (Member):

This section currently describes split mode but does not directly address spanning mode. It seems reasonable to assume that in spanning mode getContexts() will return both private and non-private contexts, but I'd prefer to be explicit about that.

Reply (Contributor Author):

spanning mode is... weird.

Try embedding an iframe in an incognito window in a spanning mode extension: you get a pseudo-split-mode-but-not-really extension.

I didn't really want to spec out what happens here. I've taken a page from the fetch API and said "it's left as an exercise to the implementor."


[Split-mode](https://developer.chrome.com/docs/extensions/mv3/manifest/incognito/#split)
extensions will _not_ have access to the contexts from their counterpart
profile. That is, the incognito extension process will not be able to access
contexts from the non-incognito extension process, and vice versa.

#### Sandboxed Pages

Sandboxed pages will not be included in the returned contexts. They have
a separate origin (`"null"`) and do not have access to extension APIs.

#### TOCTOU

Time-of-Check vs Time-of-Use (TOCTOU) issues are an unavoidable aspect of this
API, given its asynchronous nature. An extension may call `getContexts()` and,
by the time it receives the result, the values (such as available contexts or
URLs of those contexts) may be different.

This is something extension developers already need to worry about with any
asynchronous API. API calls should handle references to non-existent contexts
gracefully, and extension developers can leverage concepts like `documentId`
(which is updated when a frame navigates, allowing extensions to identify if a
given context is the same as when they queried it).
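
One way an extension might guard against stale results is to re-query and compare identifiers before acting. The sketch below assumes the caller already holds an earlier result and a fresh list of contexts; `ContextSnapshot` and `isStillCurrent` are hypothetical names used only for illustration.

```typescript
// Only the identifying fields from the proposed ExtensionContext are needed.
interface ContextSnapshot {
  contextId: string;
  documentId?: string; // undefined for contexts without a document
}

// Hypothetical helper: true if `earlier` still appears in `latest` and is
// still hosting the same document.
function isStillCurrent(earlier: ContextSnapshot, latest: ContextSnapshot[]): boolean {
  const match = latest.find((c) => c.contextId === earlier.contextId);
  if (match === undefined) {
    return false; // The context closed or navigated since it was queried.
  }
  // Per the proposal, contextId already changes on navigation, so this
  // comparison is a defensive double check rather than a requirement.
  return match.documentId === earlier.documentId;
}
```

A check like this narrows the window for a race but cannot close it, so calls that use the identifiers still need to handle a missing context.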

#### Naming

* *ContextType*: While other "Context" references already exist in APIs
(such as `ContextType` on the `contextMenus` API), the namespace (`runtime`)
helps differentiate these.
* *`documentUrl` et al*: We use `documentUrl` on contexts (as opposed to
`url`) to indicate the URL is that of the document associated with the
context. This is relevant for cases where there may not be a document URL
(such as service workers) and allows for potential future expansion where
other URLs (such as script URLs) may exist.

#### Default / Absent Values

There is an inconsistency in how we represent a value for a context that doesn't
have the associated trait, such as a `tabId` or `documentId` for an extension's
background service worker context (which has neither an associated tab nor
document). Some of these values -- `tabId`, `windowId`, and `frameId` -- use
constant values to indicate "no state", such as `-1` for `tabId`. Others --
`documentId`, `documentUrl`, and `documentOrigin` -- use undefined to indicate
this.

This is an artifact of existing APIs and precedent. Since many existing APIs
use the constant integer values, we want to be consistent with those. However,
for newly-introduced fields, we use the more intuitive `undefined` state.
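
Callers that want one uniform "absent" representation could normalize the two conventions themselves. The helpers below are a minimal sketch, assuming the `-1` sentinel and `undefined` conventions described above; the names `NO_ID`, `toOptionalId`, `isHostedInTab`, and `hasDocument` are hypothetical. An explicit comparison is used because `0` is a valid frame ID (the top-level frame) in existing APIs, so a truthiness check would misclassify it.

```typescript
// Subset of the proposed ExtensionContext fields that can be "absent".
interface ContextHostInfo {
  tabId: number;       // -1 when not hosted in a tab
  windowId: number;    // -1 when not hosted in a window
  frameId: number;     // -1 when not hosted in a frame
  documentId?: string; // undefined when there is no document
}

// Sentinel used by the integer ID fields (name chosen for this sketch).
const NO_ID = -1;

// Converts a sentinel-style ID into an optional value.
function toOptionalId(id: number): number | undefined {
  return id === NO_ID ? undefined : id;
}

// True when the context is hosted in a tab.
function isHostedInTab(context: ContextHostInfo): boolean {
  return context.tabId !== NO_ID;
}

// True when the context is associated with a document.
function hasDocument(context: ContextHostInfo): boolean {
  return context.documentId !== undefined;
}
```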

## Future Work

### Messaging APIs Support `ContextId`s as Target

In practice, many extension messages are meant for a single target, but are
broadcast to all extension contexts. With the ability to uniquely identify a
single extension context, we will modify messaging APIs (such as
`runtime.sendMessage()` and `runtime.connect()`) to allow specifying specific
targets that should receive a message.

### Multi Page Architecture Fields (`DocumentLifecycle`, `FrameType`, `parentDocumentId`, and `parentFrameId`)

[Multi Page Architecture](https://docs.google.com//1NginQ8k0w3znuwTiJ5qjYmBKgZDekvEPC22q0I4swxQ#heading=h.w1qo2n6sr8wn)
caused
[multiple changes](https://developer.chrome.com/blog/extension-instantnav/) to
Chromium's
[tabs API](https://developer.chrome.com/docs/extensions/reference/tabs/),
[scripting API](https://developer.chrome.com/docs/extensions/reference/scripting/),
and
[web navigation API](https://developer.chrome.com/docs/extensions/reference/webNavigation/).

Among these changes are the additions of `DocumentLifecycle`, `FrameType`, and
`parentDocumentId`. If there is sufficient demand, we can consider adding these
fields to the `ExtensionContext` type.

### runtime.getCurrentContext()

We would like to provide an additional API, `runtime.getCurrentContext()`, to
return the calling context.

### Content Script Contexts

[Content scripts](https://developer.chrome.com/docs/extensions/mv3/content_scripts/)
run in a separate
[Renderer](https://developer.chrome.com/blog/inside-browser-part3/) (and
process) from the extension process. In this first version of the API, we will
not include these contexts due to the complexity it entails. However, we would
like to add these contexts in the future.

With the content script additions, we may add new fields to `ExtensionContext`,
such as `scriptUrl` (to indicate the content script's source).

## Footnotes

<sup>1</sup>: Non-main threads in a
[Renderer](https://developer.chrome.com/blog/inside-browser-part3/) (where
service workers run in Chromium) cannot access DOM concepts directly (they are
only accessible from the main renderer thread). Service workers thus cannot
synchronously access the JavaScript
[`Window`](https://developer.mozilla.org/en-US/docs/Web/API/Window#instance_properties) objects provided by
`extension.getViews()`. Supporting this access would take years of engineering
effort, and is likely undesirable due to the complexity and considerations it
would introduce (threading and locking, slowing down main thread execution).