Replacing `navigationId` with `getRelevantNavigation()` #12

tunetheweb · 2023-01-24T12:59:55Z

As discussed in #10 the navigation details are needed to get both the startTime of the navigation (so the metrics can be reported relative to the soft navigation time), and the URL (so the events can be used to record analytics for the appropriate URL rather than the current one).

At present this is achieved with the new navigationId attribute of the event, which can then be mapped to the navigation details in one of two ways:

By using the PerformanceEntry API.
By recording the soft-navigation entries via a PerformanceObserver and using this array to look up the event.

The first option is easier, but depends on making certain assumptions on the navigationId (for example >= 2 means it's a soft-navigation) and looking up based on this:

navEntry = performance.getEntries("soft-navigation")[navigationId - 2]
           || performance.getEntries("navigation")[0]

"Magic numbers" are generally to be avoid on the web platform, and this assumption may then be broken in the future if new navigation entries appear.

A perhaps better suggestion is to replace the navigationId with a getRelevantNavigation() method on the events that provides direct access to the associated navigation entry to avoid making any of those assumptions.

If this direction is taken, we may need to provide an overlap period where both navigationId and getRelevantNavigation() both exist to give anyone experimenting with this API time to migrate to this new proposal.

The text was updated successfully, but these errors were encountered:

yoavweiss · 2023-01-24T13:03:57Z

^^ @clelland @noamr @mmocny

clelland · 2023-01-24T16:01:42Z

Agreed that navigationId has some ergonomics issues. You probably shouldn't be assuming that it matches array index here -- it's a counter right now, but there had been talk of making it a UUID instead. Also, with the possibility of bfcache restorations in addition to SPA navigations, you might find a particular navigation id in either buffer.

With navigationId, I think that the future-proof way of getting an associated navigation timeline entry would be to get (or accumulate with a performanceObserver, if it might overflow the buffer) navigation entries of all types, and then scan them for an entry with a matching ID.

var navs = []
new performanceObserver(entries => {
  navs.push(entries.getEntries());
}).observe(types: ['navigation','soft-navigation','back-forward-cache-restoration']);

navEntry = navs.find(el => navigationId == el.navigationId);

getRelevantNavigation is better, but I'm worried that it's not going to be performant when your goal is to cluster, say, all resource timing entries by their navigation -- or to get all performance entries associated with a given navigation. In those cases, you'd have to call getRelevantNavigation on each entry in turn and cache the results.

What about just making navigation a property on each timeline entry, being a reference to the appropriate navigation timeline entry object? That would give you direct access to it if you need it, and you can also use object equality testing to group related entry objects by navigation, when you want to do that.

tunetheweb · 2023-01-25T12:55:50Z

With navigationId, I think that the future-proof way of getting an associated navigation timeline entry would be to get (or accumulate with a performanceObserver, if it might overflow the buffer) navigation entries of all types, and then scan them for an entry with a matching ID.

This is quite annoying to get and link up to the performance events (FCP, LCP...etc.). We need to observe the navigations, store them, then process the performance events to match them, then potentially finalise some of them based on the new navigations we saw earlier. It also feels like we're moving the complexity of this from the browser, to the developer. So the browser memory and buffers are handled better, but the application memory and browser buffers are potentially much worse! I feel like this should be handled by the browser to take away that pain and ensure it's implemented consistently.

getRelevantNavigation is better, but I'm worried that it's not going to be performant when your goal is to cluster, say, all resource timing entries by their navigation -- or to get all performance entries associated with a given navigation. In those cases, you'd have to call getRelevantNavigation on each entry in turn and cache the results.

What about just making navigation a property on each timeline entry, being a reference to the appropriate navigation timeline entry object? That would give you direct access to it if you need it, and you can also use object equality testing to group related entry objects by navigation, when you want to do that.

A navigation property works for me too!

noamr · 2023-01-25T12:59:46Z

getRelevantNavigation is better, but I'm worried that it's not going to be performant when your goal is to cluster, say, all resource timing entries by their navigation -- or to get all performance entries associated with a given navigation. In those cases, you'd have to call getRelevantNavigation on each entry in turn and cache the results.

What about just making navigation a property on each timeline entry, being a reference to the appropriate navigation timeline entry object? That would give you direct access to it if you need it, and you can also use object equality testing to group related entry objects by navigation, when you want to do that.

What's the difference between the two? We can choose either API based on ergonomics and choose whether we cache/store the result or not based on performance

clelland · 2023-01-26T14:27:14Z

I think the difference is primarily the overhead of a function call any time you want to compare to performance entries, to see if they are part of the same navigation. With either navigationId or navigation, you can just compare the values directly; with getRelevantNavigation you need to do two extra function calls per comparison.

noamr · 2023-01-26T15:13:07Z

I think the difference is primarily the overhead of a function call any time you want to compare to performance entries, to see if they are part of the same navigation. With either navigationId or navigation, you can just compare the values directly; with getRelevantNavigation you need to do two extra function calls per comparison.

The function can return a cached value, and OTOH entry.navigation can be a getter function. There is no perf overhead connected with the shape of the API here.

paulirish · 2023-02-17T23:27:08Z

-1 on exposing a navigation object. I've definitely come to expect all perfEntries are serializable and standalone. Referencing any parent/child relationship feels weird, and we'd have to 'clean' entries before serializing them to beacon.

I don't think the developer ergonomics here are much of a problem..

Here's my quick take on organizing all entries according to navigationid..

// collect all the things.
const allEntries = [];
const obs = new PerformanceObserver(list => allEntries.push(...list.getEntries()));
for (const type of PerformanceObserver.supportedEntryTypes) {
  obs.observe({type, buffered: true, includeSoftNavigationObservations: true});
}

const navigationEntries = allEntries.filter(e => ['navigation','soft-navigation','back-forward-cache-restoration'].includes(e.entryType));

const navigationIdToEntries = allEntries.reduce((map, curr) => map.set(curr.navigationId, [...map.get(curr.navigationId) || [], curr]), new Map());

And in action:

And if you want the pageUrl to associate... something like:

const pageUrlToEntries = [...navigationIdToEntries.entries()].map(([navId, entries]) => ([navigationEntries.find(e => e.navigationId === navId).name, entries]));

noamr · 2023-02-18T04:43:32Z

@paulirish I think that's a good argument.
We could still have a better solution than a magic number though:

A UUID, with perhaps an empty string or "root" or something for the first navigation
Saving the navigationTime which would match the startTime of the navigation entry (and perhaps expose getEntriesByStartTime(startTime, types)?
Be explicit about what that number mean, even call it an index instead

mmocny · 2023-02-19T15:51:08Z

GetEntriesByStartTime doesn't work for event timing where start isn't the time origin (same for user timings). But I like that direction a lot. Thanks for the feedback Paul.

…

On Fri, Feb 17, 2023, 22:43 Noam Rosenthal ***@***.***> wrote: @paulirish <https://github.com/paulirish> I think that's a good argument. We could still have a better solution than a magic number though: - A UUID, with perhaps an empty string or "root" or something for the first navigation - Saving the navigationTime which would match the startTime of the navigation entry (and perhaps expose getEntriesByStartTime(startTime, types)? — Reply to this email directly, view it on GitHub <#12 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AADTZKXP5JEIZU34SFXWJWLWYBHP5ANCNFSM6AAAAAAUFBY474> . You are receiving this because you were mentioned.Message ID: ***@***.***>

tunetheweb · 2023-02-27T16:14:00Z

I still feel that, while it's possible, to observe the individual events as per @paulirish 's code above, it still moves a lot of the onus on to the developer.

At the moment that code builds an ever-increasing allEntries array and we're leaving it to the developer to decide if/when to clear that down to prevent what is basically a memory leak. That feels like something we should be discouraging, rather than encouraging. And if it's too difficult for the browser to track this appropriately, then why do we feel developers will do this better?

Contrast that to most current performance observer methods usage IMHO (for example, in web-vitals.js library) where the observation is dealt with and then no longer needs to be referenced anymore so it's discarded. The observation includes everything needed to deal with it appropriately (with some small exceptions like INP, where the store the last 10 interactions - specifically to reduce memory limit).

With some SPAs being very long lived, and potentially interaction-heavy, the memory implications of storing all the old observations is not small.

noamr · 2023-03-01T05:24:54Z

I have another suggestion:

Use the name property of the soft navigation entry. It would be soft-navigation-${navStartTimestamp}
Give the other entries a navigationName attribute referring to it
Get the navigation using getEntriesByName()
If we really want to, we can still have getRelevantNavigation() helper that does this, but it does not replace doing this with plain objects.
WDYT?

yoavweiss · 2023-03-14T13:59:00Z

@clelland - thoughts on @noamr's suggestion?

clelland · 2023-04-18T20:18:34Z

Catching up here --

I expect we'd need to do a similar thing with back-forward-cache restores; with a name of back-forward-cache-restore-${timestamp}, and use it the same way. We'd be creating an implicit requirement that any future 'navigation-ish' entries have a name defined.

The problem, I expect, is that you'd have to special-case the navigationName for entries relating to the initial (hard) navigation. We can't use this scheme for the performancenavigationtiming entry's name, since that's already defined to be the URL of the current page. (And I'm not sure how likely it is that other entries -- resourcetiming, e.g. -- will have the same name). I think we'd need to have to leave it out, or set it to an empty string, or use some other flag that directs you to not use getEntriesByName, but instead getEntriesByType('performanceNavigationTiming')[0]

clelland · 2023-04-19T20:04:58Z

I think that the main thing @noamr's suggestion in #12 (comment) allows is the reuse of getEntriesByName, rather than having to expose a new API (previously proposed as getEntriesByStartTime). The trouble is that we can't uniformly use name, since it's already defined for PerformanceNavigationTiming objects.

At the same time, @tunetheweb and @paulirish's point that referencing actual timeline event objects easily leads to memory leaks, and changes the independent nature of the objects, is well-taken.

If we just remove the requirement that the navigation id match the name property of some other entry, then it's about equivalent to the suggestion of making the navigation id into a UUID (or something slightly more semantic, like ${entry-type}-${timestamp}). A getEntriesByNavigationId would be ~ as much new API surface as getEntriesByStartTime.

yoavweiss · 2023-04-26T09:39:47Z

getEntriesByNavigationId would also have the added ergonomic advantage of grabbing all the relevant entries for a certain navigation from the timeline.

tunetheweb · 2023-09-06T12:03:09Z

Just to update on this. This is the code I'm suggesting in the article (and is basically what I have implemented in web-vitals):

const softNavEntry =
  performance.getEntriesByType('soft-navigation').filter(
    (entry) => entry.navigationId === navigationId
  )[0];
const hardNavEntry = performance.getEntriesByType('navigation')[0];

const navEntry = softNavEntry || hardNavEntry;

const pageUrl = navEntry?.name;
const startTime = navEntry?.startTime;

This is:

reasonably ergonomic
is backword compatible with the original integer and the newer UUID/string implementation
let's the browser handle the entries (so avoiding the memory leak issue of developers not considering that)

The one downside is that getEntriesByType is limited by the max buffer size, but since we should be processing the entries as they come in, I'm not sure that's a real issue. And the reason they are limited is presumably to avoid memory leak issue anyway so the alternative of observing these and adding to an unlimited entry is not appealing to me!

Also this doesn't use the original hard navigation UUID (though it could, but would need an extra fallback anyway).

I'm not sure we need a getEntriesByNavigationId to be honest - I can't really think of a use for it.

A getNavigationEntryById to wrap the above code might be useful however. It's obviously not a lot of code, but would allow this to be more easily expanded (e.g. if/when back/forward navigations come along) rather than each implementation knowing which types to include. Then again, they likely need to know which "navigation" types they handle so maybe that's less of an issue.

mmocny · 2023-09-06T13:07:26Z

but since we should be processing the entries as they come in, I'm not sure that's a real issue

Because there could be a new Perf observer from any new script injected at any time, buffers are never flushed or reset, so it really doesn't matter when you start observing. If there is a chance of filling, they will fill and then the only way to get the data is to have a PerformanceObserver already registered.

the reason they are limited is presumably to avoid memory leak

Actually buffer limits are there because this is memory which is allocated on each and every page load even in cases where there is never any observer. So, we limit the amount of resource dedicated to this speculative feature. As soon as an Observer is registered, we can save more data, but the global buffer doesn't actually grow when that happens (since its a static array), instead it gets places into FIFO queue for each observer.

I'm not sure we need a getEntriesByNavigationId
A getNavigationEntryById to wrap the above code might be useful however

Interesting. Besides the ergonomics, I guess the buffering of entries does imply you should already be observing them and doing a reverse lookup as you suggest...

yoavweiss · 2023-11-10T13:22:41Z

I think we can close this in favor of w3c/performance-timeline#182

clelland mentioned this issue Apr 18, 2023

Add navigationId to PerformanceEntry w3c/performance-timeline#192

Merged

clelland mentioned this issue Jun 9, 2023

API shape for monitoring new navigation-like entries such as App History, BFCache, etc w3c/performance-timeline#182

Open

yoavweiss closed this as completed Nov 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replacing `navigationId` with `getRelevantNavigation()` #12

Replacing `navigationId` with `getRelevantNavigation()` #12

tunetheweb commented Jan 24, 2023 •

edited

Loading

yoavweiss commented Jan 24, 2023

clelland commented Jan 24, 2023

tunetheweb commented Jan 25, 2023

noamr commented Jan 25, 2023 •

edited

Loading

clelland commented Jan 26, 2023

noamr commented Jan 26, 2023 •

edited

Loading

paulirish commented Feb 17, 2023

noamr commented Feb 18, 2023 •

edited

Loading

mmocny commented Feb 19, 2023 via email

tunetheweb commented Feb 27, 2023

noamr commented Mar 1, 2023

yoavweiss commented Mar 14, 2023

clelland commented Apr 18, 2023

clelland commented Apr 19, 2023

yoavweiss commented Apr 26, 2023

tunetheweb commented Sep 6, 2023

mmocny commented Sep 6, 2023

yoavweiss commented Nov 10, 2023

Replacing navigationId with getRelevantNavigation() #12

Replacing navigationId with getRelevantNavigation() #12

Comments

tunetheweb commented Jan 24, 2023 • edited Loading

yoavweiss commented Jan 24, 2023

clelland commented Jan 24, 2023

tunetheweb commented Jan 25, 2023

noamr commented Jan 25, 2023 • edited Loading

clelland commented Jan 26, 2023

noamr commented Jan 26, 2023 • edited Loading

paulirish commented Feb 17, 2023

noamr commented Feb 18, 2023 • edited Loading

mmocny commented Feb 19, 2023 via email

tunetheweb commented Feb 27, 2023

noamr commented Mar 1, 2023

yoavweiss commented Mar 14, 2023

clelland commented Apr 18, 2023

clelland commented Apr 19, 2023

yoavweiss commented Apr 26, 2023

tunetheweb commented Sep 6, 2023

mmocny commented Sep 6, 2023

yoavweiss commented Nov 10, 2023

Replacing `navigationId` with `getRelevantNavigation()` #12

Replacing `navigationId` with `getRelevantNavigation()` #12

tunetheweb commented Jan 24, 2023 •

edited

Loading

noamr commented Jan 25, 2023 •

edited

Loading

noamr commented Jan 26, 2023 •

edited

Loading

noamr commented Feb 18, 2023 •

edited

Loading