Consider adding finalResponseHeaders{Start|End} times #345

Open
noamr opened this issue Aug 31, 2022 · 37 comments

@noamr
Contributor

noamr commented Aug 31, 2022

Since we've added early hints, TTFB in RUM, which maps to responseStart, now means something slightly different than it did before (though technically it represents the same thing).

For services that rely on final response timing, perhaps we should also expose when the final headers started arriving, and when they ended and the body started?

@LPardue
Contributor

LPardue commented Aug 31, 2022

What is the definition of final that we're using here?

From the HTTP RFC, a server can respond to a single request with multiple interim responses (1xx) and a single final response message (2xx through 5xx); see https://www.rfc-editor.org/rfc/rfc9110.html#section-15.2-3

A client MUST be able to parse one or more 1xx responses received prior to a final response, even if the client does not expect one. A user agent MAY ignore unexpected 1xx responses.

And do we care about measuring the arrival times of trailer field sections on the final response message?

@noamr
Contributor Author

noamr commented Aug 31, 2022

What is the definition of final that we're using here?

From the HTTP RFC, a server can respond to a single request with multiple interim responses (1xx) and a single final response message (2xx through 5xx); see https://www.rfc-editor.org/rfc/rfc9110.html#section-15.2-3

This definition of final :)
The last response, which is the first response that's not a 1xx or a 3xx

And do we care about measuring the arrival times of trailer field sections on the final response message?

Perhaps this is a separate conversation, since trailers are not currently widely supported.

@LPardue
Contributor

LPardue commented Aug 31, 2022

This definition of final :)
The last response, which is the first response that's not a 1xx or a 3xx

Great! I support adding such timing markers.

Perhaps this is a separate conversation, since trailers are not currently widely supported.

Happy for that to be spun off

@colinbendell

We should also consider a finalResponseBodyStart time, to distinguish the end of the headers from the start of the data payload.

@nicjansma
Contributor

Some points from the 9/1 W3C WebPerf call:

  • Could also be useful to know the timing difference between finalResponseHeadersEnd and another new field like finalResponseBodyStart, AKA if headers are flushed early, when does the response body content begin? This could help with both Early Hint and regular transfer cases.
  • Should be TAO protected, like responseStart
  • One question to resolve: "Clarify how PerformanceTiming.responseStart relates to Early Hint responses" (navigation-timing#181)
  • Noam is looking for any additional feedback or concerns

@tunetheweb
Member

tunetheweb commented Sep 1, 2022

To @colinbendell 's point in #181, instead of adding finalXXX timings, we could add the following:

  • interimResponseStart (between secureConnectionStart and responseStart) and change responseStart to mean final response start, as many may have assumed it has until now.

And then also add:

  • responseHeadersEnd (just after responseStart)
  • responseBodyStart (just before responseEnd)

The only thing I don't like about that is that we don't have the equivalent opposites, e.g. responseHeadersStart (which is basically responseStart) and responseBodyEnd (which is basically responseEnd if you ignore the little-used HTTP Trailers), so it might be confusing.
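
For illustration, the ordering described in this proposal can be written down as a small TypeScript check. This is only a sketch: the interface and helper names are made up here, and the field names are the ones floated in this comment, not a shipped API.

```typescript
// Hypothetical shape: field names follow the proposal in this comment.
interface ProposedResponseTimes {
  secureConnectionStart: number;
  interimResponseStart?: number; // absent when no 1xx response was received
  responseStart: number;         // start of the final response
  responseHeadersEnd?: number;
  responseBodyStart?: number;
  responseEnd: number;
}

// Lists the marks that are present, in the order the proposal expects them.
function presentMarks(t: ProposedResponseTimes): Array<[string, number]> {
  const ordered: Array<[string, number | undefined]> = [
    ["secureConnectionStart", t.secureConnectionStart],
    ["interimResponseStart", t.interimResponseStart],
    ["responseStart", t.responseStart],
    ["responseHeadersEnd", t.responseHeadersEnd],
    ["responseBodyStart", t.responseBodyStart],
    ["responseEnd", t.responseEnd],
  ];
  return ordered.filter((m): m is [string, number] => m[1] !== undefined);
}

// True when every present mark is at or after the one before it.
function isChronological(t: ProposedResponseTimes): boolean {
  const marks = presentMarks(t);
  return marks.every(([, time], i) => i === 0 || time >= marks[i - 1][1]);
}
```

Optional fields are simply skipped, so an entry without any interim response is still checked against the remaining marks.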

@noamr
Contributor Author

noamr commented Sep 1, 2022

We don't always need equivalent opposites, like with secureConnectionStart. I think that's OK

@tunetheweb
Member

We don't always need equivalent opposites, like with secureConnectionStart. I think that's OK

Yeah I know, but there we don't have secureConnectionEnd at all (though nothing to stop it being added in future), but here we DO have responseHeadersStart - we just call it something else (responseStart).

Anyway, I'll get over it. Just looks a little odd.

@colinbendell

I agree with @tunetheweb's proposal. It keeps responseStart and responseEnd focused on the whole of the "final" response and also preserves legacy interpretations. The added metrics optionally increase the resolution for newer RUM providers without burdening these vendors with releasing patches.

@LPardue
Contributor

LPardue commented Sep 1, 2022

Please don't prefix anything with early. If we want to support interim HTTP responses, call it that. Expect HTTP to define more 1xx codes, because they have been a part of HTTP since forever.

If we only want to record the time of the first interim response, that's fine. But let's be clear about what these events measure. Maybe today the first and last interim responses are measured at the same time due to fetch. In the future, if fetch changes to support multiple interim responses, our first and last markers would be ready.

@tunetheweb
Member

We don't always need equivalent opposites, like with secureConnectionStart. I think that's OK

Yeah I know, but there we don't have secureConnectionEnd at all (though nothing to stop it being added in future), but here we DO have responseHeadersStart - we just call it something else (responseStart).

Anyway, I'll get over it. Just looks a little odd.

Though I guess you could in theory have a responseBodyEnd before responseEnd (e.g. if considering Trailers), and we're just choosing not to include that now as there is little need for it, since responseEnd is sufficient, in the same way as we're not including secureConnectionEnd. And similarly, responseHeadersStart could theoretically be after responseStart, but again we just won't include it now as there is little need for it.

@tunetheweb
Member

Please don't prefix anything with early. If we want to support interim HTTP responses, call it that. Expect HTTP to define more 1xx codes, because they have been a part of HTTP since forever.

If we only want to record the time of the first interim response, that's fine. But let's be clear about what these events measure. Maybe today the first and last interim responses are measured at the same time due to fetch. In the future, if fetch changes to support multiple interim responses, our first and last markers would be ready.

Updated my comment to interimResponseStart instead.

@noamr
Contributor Author

noamr commented Sep 1, 2022

Please don't prefix anything with early. If we want to support interim HTTP responses, call it that. Expect HTTP to define more 1xx codes, because they have been a part of HTTP since forever.

HTTP 103 is defined as "Early Hints", that's where early comes from.
1xx is interim in HTTP lingo, but 103 is early in HTML/fetch/browser lingo.
I don't think we want to start measuring other interim responses.

Perhaps earlyHintsStart ?

@tunetheweb
Member

Perhaps earlyHintsStart ?

I'm against being that specific. Who knows what the future will bring.

@noamr
Contributor Author

noamr commented Sep 1, 2022

Perhaps earlyHintsStart ?

I'm against being that specific. Who knows what the future will bring.

Perhaps when we have new features in the future we can find new names for their metrics and be specific about them?
If what this feature does currently is early hints, and it's defined in HTTP, perhaps we should be specific?

@colinbendell

The HTTP spec calls these "Informational" responses (1xx). Should we converge on this language?

@tunetheweb
Member

Perhaps when we have new features in the future we can find new names for their metrics and be specific about them?
If what this feature does currently is early hints, and it's defined in HTTP, perhaps we should be specific?

Possibly. Do we measure WebSockets this way? Should interimResponseStart measure when that 101 response comes in, and before the first bytes sent on the WebSocket?

Similarly HTTP/2 also had the upgrade option with 101 before the "final response" was sent, but that was removed from the latest version.

The HTTP spec calls these "Informational" responses (1xx). Should we converge on this language?

I like this.

@LPardue
Contributor

LPardue commented Sep 1, 2022

Only h2c, as defined in RFC 7540, uses 101. The latest revision of HTTP/2 is RFC 9113, which as Barry points out no longer mentions this feature, in large part because browsers do not implement h2c. We should ignore this.

However, 101 for WebSocket upgrade over HTTP/1.1 is an actual thing; we should not ignore it, and should consider whether WebSockets and their timings are anything we care about here.

103 Early Hints is defined and is seeing use. And there are use cases for other status codes. For example, 104 is provisionally allocated for resumable upload - newly adopted in the IETF HTTP WG. That specific provisional allocation might fizzle out, but let's be open to supporting the well-defined HTTP extension point of the 1xx status ranges; if we fail to do that job, we damage its future potential.

I disagree with calling things informational responses. RFC 9110 doesn't use that term; instead it defines interim responses, which use informational status codes. Terminology needs to be precise and consistent unless there is a good reason to diverge.

Case in point: HTTP as defined in RFC 9110 has moved away from the terms payload and body, and now consistently uses the term "content". Ideally, W3C would align, but I believe that, due to legacy, such a change would be disruptive. However, the new things that we are discussing here have no such legacy.

@colinbendell

I disagree with calling things informational responses. RFC 9110 doesn't use that term; instead it defines interim responses, which use informational status codes. Terminology needs to be precise and consistent unless there is a good reason to diverge.

I'm not sure I follow. Section 15.2 of rfc9110 calls 1xx "informational" class of response codes.

@LPardue
Contributor

LPardue commented Sep 1, 2022

Status codes are only one piece of a response message; the other pieces include metadata (in the form of field sections) and content. RFC 9110 does not use the term "informational response" anywhere. In contrast, it uses the term "interim response" 4 times to describe a non-final response that includes a 1xx (informational) class of status code.

@dotjs

dotjs commented Sep 1, 2022

I'm in favour of using interim as it expresses the non-final and non-authoritative nature of the response. As RFC 9110 states

A single request can have multiple associated responses: zero or more "interim" (non-final) responses with status codes in the "informational" (1xx) range, followed by exactly one "final" response with a status code in one of the other ranges.
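
To make the quoted RFC 9110 sentence concrete, here is a minimal TypeScript sketch that splits the responses to one request into interim and final. The `ResponseMessage` type and the function names are invented for this example; redirects and multi-request fetches are deliberately not modelled.

```typescript
// Hypothetical record of one response message as seen by the client.
interface ResponseMessage {
  status: number;
  receivedAt: number; // milliseconds, any monotonic clock
}

// Interim responses are those with an informational (1xx) status code.
function isInterim(status: number): boolean {
  return status >= 100 && status <= 199;
}

// Splits the responses to a single request into zero or more interim
// responses and the one final response (undefined if none arrived yet).
function splitResponses(responses: ResponseMessage[]): {
  interim: ResponseMessage[];
  final: ResponseMessage | undefined;
} {
  const interim = responses.filter((r) => isInterim(r.status));
  const final = responses.find((r) => !isInterim(r.status));
  return { interim, final };
}
```

With a 103 followed by a 200, `interim` holds the 103 and `final` is the 200, which is the distinction the naming discussion here is trying to capture.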

@noamr
Contributor Author

noamr commented Sep 2, 2022

Only h2c, as defined in RFC 7540, uses 101. The latest revision of HTTP/2 is RFC 9113, which as Barry points out no longer mentions this feature, in large part because browsers do not implement h2c. We should ignore this.

However, 101 for WebSocket upgrade over HTTP/1.1 is an actual thing; we should not ignore it, and should consider whether WebSockets and their timings are anything we care about here.

We currently don't record resource timing for WebSockets at all, but I can see the other points.

The main issue I have with interim/informational is that we might not have interim responses at all (most cases...), and also in some cases we'd have more than one, and this measures only the first one.

How about firstResponseStart (which could be interim or the final one) and responseStart?
Otherwise perhaps firstInterimResponseStart and have it be zero if there aren't any. (I don't like having too many zeroes in the resource timing sequence, so not sure it's a favorite)

noamr added a commit to noamr/fetch that referenced this issue Sep 2, 2022
See w3c/resource-timing#345

Since early hints have landed, there are additional useful timestamps
that are currently not exposed:

- First interim response start (e.g. when we received a 103)
  as a different timestamp from the final response start
  (e.g. when we received the 200)

- Final headers received (when the last header has been received and
  we're ready for the body)

- First bytes of the body received

The naming in whatwg#345 is not finalized yet, but it's clear that these
are the 3 interesting timestamps. This PR exposes those for later
use by resource timing.
@noamr
Contributor Author

noamr commented Sep 2, 2022

I posted a PR for the fetch spec. It shouldn't affect the naming bike shed, we can choose the names in the subsequent resource timing PR.

@annevk
Member

annevk commented Sep 26, 2022

One thing that I'm missing from this thread is a discussion of use cases.

@noamr
Contributor Author

noamr commented Sep 27, 2022

One thing that I'm missing from this thread is a discussion of use cases.

It's been discussed in the WebPerfWG call.

The main use case is that servers that serve early hints and might have latency before serving the final headers/body currently have no visibility into where this latency lies.

@colinbendell

There are a few use cases behind the above discussion:

  1. responseStart was previously understood by the industry as 'time-to-first-byte'. Many libraries and RUM products in the industry operate with this expectation. The strict definition of responseStart breaks the colloquial understanding in the ecosystem.
  2. For use cases where early hints are available, the presence of firstInterimResponseStart will help RUM providers and analytics cohort results effectively. This is a current challenge, since there is no way to know whether a result from a client reflects an experience where early hints were sent, where early hints were sent and there was >0ms before the final HTTP headers arrived, or where early hints were not sent.
  3. Many frameworks are increasingly returning to HTTP streaming to emit HTTP headers early while the backend continues to process and compose the body. Having responseBodyStart will help segment and cohort experiences in the same way as in use case 2.
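
The cohorting in use case 2 could look roughly like the TypeScript sketch below. It rests on two assumptions that this thread had not settled at this point: that firstInterimResponseStart reports 0 when no interim response was received, and that responseStart refers to the final response. The type and function names are invented for the example.

```typescript
type EarlyHintsCohort =
  | "unknown"              // the attribute is not exposed by this browser
  | "no-interim-response"  // early hints were not sent
  | "interim-then-delay"   // early hints sent, final headers arrived later
  | "interim-no-delay";    // early hints sent, final headers at the same time

// Minimal structural view of a timing entry (assumed shape, see lead-in).
interface TimingLike {
  firstInterimResponseStart?: number;
  responseStart: number;
}

function cohortFor(entry: TimingLike): EarlyHintsCohort {
  const interim = entry.firstInterimResponseStart;
  if (interim === undefined) return "unknown";
  if (interim === 0) return "no-interim-response"; // assumption: 0 means "none"
  return entry.responseStart > interim ? "interim-then-delay" : "interim-no-delay";
}
```

A RUM script would pass in the navigation entry it already collects; the "unknown" bucket keeps browsers without the attribute from being mixed into the "not sent" cohort.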

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 16, 2023
This adds an entry to PerformanceResourceTiming:
- firstInterimResponseStart: the time of the first early-hints header

It also changes the meaning of responseStart to be the first
non-informational header (non-103).

Implemented for Quic, Spdy and HTTP.

All behind a feature runtime flag (ResourceTimingInterimResponseTimes)

Spec issue: w3c/resource-timing#345

Bug: 1402089
Change-Id: I2f050788515959e3576f3cf2bd8df13ff848090a
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 17, 2023
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 18, 2023
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 19, 2023
aarongable pushed a commit to chromium/chromium that referenced this issue Jan 19, 2023
This adds an entry to PerformanceResourceTiming:
- firstInterimResponseStart: the time of the first early-hints header

It also changes the meaning of responseStart to be the first
non-informational header (non-103).

Implemented for Quic, Spdy and HTTP.

All behind a feature runtime flag (ResourceTimingInterimResponseTimes)

Spec issue: w3c/resource-timing#345

Bug: 1402089
Change-Id: I2f050788515959e3576f3cf2bd8df13ff848090a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4165825
Reviewed-by: Bence Béky <bnc@chromium.org>
Commit-Queue: Noam Rosenthal <nrosenthal@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1094571}
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Feb 1, 2023
…onseStart, a=testonly

Automatic update from web-platform-tests
Resource Timing: Expose firstInterimResponseStart

This adds an entry to PerformanceResourceTiming:
- firstInterimResponseStart: the time of the first early-hints header

It also changes the meaning of responseStart to be the first
non-informational header (non-103).

Implemented for Quic, Spdy and HTTP.

All behind a feature runtime flag (ResourceTimingInterimResponseTimes)

Spec issue: w3c/resource-timing#345

Bug: 1402089
Change-Id: I2f050788515959e3576f3cf2bd8df13ff848090a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4165825
Reviewed-by: Bence Béky <bnc@chromium.org>
Commit-Queue: Noam Rosenthal <nrosenthal@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1094571}

--

wpt-commits: e75f154bb894b0e2bf78cf1ac04e1cbecedebdc6
wpt-pr: 37984
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Feb 3, 2023
@LPardue
Contributor

LPardue commented Feb 16, 2023

Something that might be worth noting (if not already covered) is that in some scenarios, the checkpoints could appear to have the same values at the client, even if sent at different times on the server. This is especially noticeable if the client has large TCP receive windows and reads a "batch" of data after letting it accumulate for a little while.

@noamr
Contributor Author

noamr commented Apr 24, 2023

This is now implemented in Chrome behind a flag (ResourceTimingInterimResponseTimes):

  • firstInterimResponseStart is the time when the 103 response arrives
  • responseStart is the time when the final response arrives

It applies to resource timing, but is probably only useful for navigation timing entries.
I would like to enable this by default soon.

noamr added a commit to noamr/fetch that referenced this issue May 4, 2023
noamr added a commit that referenced this issue May 4, 2023
Add 3 response times:
- firstInterimResponseStart: the first 103
- responseHeadersEnd: All headers have been received
- responseBodyStart: The body started streaming

Closes #345
Depends on whatwg/fetch#1483
annevk pushed a commit to whatwg/fetch that referenced this issue May 8, 2023
@noamr noamr closed this as completed in 55724b9 May 9, 2023
@annevk
Member

annevk commented May 9, 2023

@noamr is there a follow-up bug for the other two timing channels or should this be reopened?

@noamr
Contributor Author

noamr commented May 9, 2023

@noamr is there a follow-up bug for the other two timing channels or should this be reopened?

Oh this shouldn't have been closed, I changed from "Closes" to "Bug".

@noamr noamr reopened this May 9, 2023
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this issue Jun 1, 2024
@tunetheweb
Member

tunetheweb commented Sep 26, 2024

As discussed at TPAC 2024 (slides, minutes) the implementation of firstInterimResponseStart has led to an interop issue between implementations that have implemented it, and those that have not.

This is not just related to browsers, but also other tooling (e.g. CrUX and Lighthouse on the Chrome side).

In hindsight, leaving responseStart where it was (as the Early Hints response, or the Document response where that did not exist, so always the first response "bytes") and implementing finalResponseHeadersStart (as originally proposed, as shown by the title of this issue) as an additional metric would have allowed this additional information to be exposed in a backwards-compatible way. Every implementation would continue to have the same definition of responseStart, but those wanting the additional information about the gap between the Early Hints and Document responses could measure it in browsers that implemented finalResponseHeadersStart.

So the new proposal of this would be to:

  • Reset responseStart back to the Early Hints response (as currently measured by firstInterimResponseStart) in Chrome
  • Remove firstInterimResponseStart from Resource Timing.
  • Add finalResponseHeadersStart to Resource Timing.
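
To show why the current split is awkward for tooling, here is a hedged TypeScript sketch of a helper that normalises the three situations discussed in this comment. `finalResponseHeadersStart` is only the name proposed above, the helper and type names are invented, and treating a firstInterimResponseStart of 0 as "no interim response" is an assumption.

```typescript
// Structural view of a timing entry; optional fields depend on the browser.
interface EntryLike {
  responseStart: number;
  firstInterimResponseStart?: number; // implementation described earlier in this issue
  finalResponseHeadersStart?: number; // proposed in this comment, hypothetical
}

interface NormalizedTimes {
  firstByte: number;                     // earliest response byte, interim or final
  finalHeadersStart: number | undefined; // undefined when it cannot be determined
}

function normalize(e: EntryLike): NormalizedTimes {
  // Proposed model: responseStart is the first byte, final response has its own mark.
  if (e.finalResponseHeadersStart !== undefined) {
    return { firstByte: e.responseStart, finalHeadersStart: e.finalResponseHeadersStart };
  }
  // firstInterimResponseStart model: responseStart is the final response.
  if (e.firstInterimResponseStart !== undefined) {
    const firstByte =
      e.firstInterimResponseStart > 0 ? e.firstInterimResponseStart : e.responseStart;
    return { firstByte, finalHeadersStart: e.responseStart };
  }
  // Legacy model: responseStart is the first byte; the final start is unknowable.
  return { firstByte: e.responseStart, finalHeadersStart: undefined };
}
```

The last branch is the interop problem in miniature: on a browser with neither attribute there is no way to recover when the final response started.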

There was also a separate, related suggestion to implement headEnd (or bodyStart) to indicate when the HTML <head> is finished, for those implementations that do an Early Flush of the head or use HTML streaming with a delay between the initial sending of the <head> and the full <body> contents being generated. This would allow sites using Early Hints to be compared more accurately with those using Early Flush for similar reasons. But I think the above bullets are more important, so we should spin headEnd/bodyStart off into a separate issue if there is interest in that.

Interested to hear feedback on this.

@Krinkle
Member

Krinkle commented Oct 12, 2024

In hindsight, implementing leaving responseStart where it was (as the Early Hints response or the Document response where that did not exist—so always the first response "bytes") […] would […] have been […] backwards-compatible […].

As a fairly well-informed web developer, I don't agree with this definition. I would have assumed that what you describe is a breaking change to the knowledge and assumptions I hold. I expect responseStart (and responseEnd for that matter) to describe the Document resource, not any redirect, HTTP/2 push, 103 Early Hints, or whatever we invent next.

Okay. I wrote the above intentionally before checking (or arguing about) what the spec says or should say. I wanted to capture what I remember as a web developer who has been interested in and actively using this data for 10 years. Of course, I have also read the spec numerous times in the past and, as a WebPerf WG member, attended meetings discussing these specs. If anything, that means I should know better, although my participation started after NT was finalised, and I don't think I was there when NT2/Resource Timing was discussed.

If I'm not alone in my understanding, do we need to consider this as de-facto required for web compat?

See also w3c/navigation-timing#181 (comment) where @adardesign clearly has the same understanding of responseStart:

TTFB is a broken metric at this point

Anyway, let's check the specs, starting with Navigation Timing 1 API (performance.timing) as described originally at https://www.w3.org/TR/navigation-timing/ anno 2012:

  • Immediately before a user agent starts sending request for the document, record the current time as requestStart.
  • Record the time as responseStart immediately after the user agent receives the first byte of the response.
  • Record the time as responseEnd immediately after receiving the last byte of the response.

[…]

  • responseStart attribute: This attribute must return the time immediately after the user agent receives the first byte of the response from the server, or from relevant application caches or from local resources.
  • responseEnd attribute: This attribute must return the time immediately after the user agent receives the last byte of the current document […].

There are a lot of mentions of "current document" around here. Although indeed, with hindsight, the spec leaves it out of the responseStart definition, and thus left room for ambiguity or future expansion.

https://www.w3.org/standards/history/navigation-timing-2/
https://www.w3.org/TR/2017/WD-navigation-timing-2-20170509/#processing-model

The famous diagram, at the version I remember:

[Image: timestamp-diagram]

  • Immediately before a user agent starts sending request for the document, record the current time as requestStart.
  • Record the time as responseStart immediately after the user agent receives the first byte of the response.
  • Record the time as responseEnd immediately after receiving the last byte of the response.

[…]

  • responseStart: This attribute must return the time immediately after the user agent receives the first byte of the response from the server, or from relevant application caches.
  • responseEnd: This attribute must return the time immediately after the user agent receives the last byte of the current document or immediately before the transport connection is closed, whichever comes first. […]

Of course, as part of Navigation Timing 2, the details were factored out to Resource Timing, which I must admit I haven't actually read in much detail.

https://www.w3.org/standards/history/resource-timing.html
https://www.w3.org/TR/2017/WD-resource-timing-2-20170329/

  • Immediately before a user agent starts sending the request for the resource, record the current time as requestStart.
  • Record the time as responseStart immediately after the user agent receives the first byte of the response.
  • Record the time as responseEnd immediately after receiving the last byte of the response.

[…]

On getting, the responseStart attribute MUST return as follows: The time immediately after the user agent's HTTP parser receives the first byte of the response (e.g. frame header bytes for HTTP/2, or response status line for HTTP/1.x) […] from the server if the last non-redirected fetch of the resource passes the timing allow check algorithm.

[…]
NOTE:
[…] For fetches composed of multiple requests (e.g. preflights, authentication challenge-response, redirects, and so on), the reported responseStart value is that of the last request. In the case where more than one response is available for a request, due to an Informational 1xx response, the reported responseStart value is that of the first response to the last request.
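
The rule in this note is easy to misread, so here is a small TypeScript model of it. It is a sketch only: the `TimedRequest` and `TimedResponse` types and the function name are invented for illustration and do not come from the spec.

```typescript
// Hypothetical model of a fetch: an ordered list of requests (redirects,
// preflights, and so on), each with its responses in arrival order.
interface TimedResponse {
  status: number;
  firstByteAt: number; // time the first byte of this response was received
}

interface TimedRequest {
  responses: TimedResponse[];
}

// responseStart as described by the 2017 note: the first response to the
// last request, whether that response is interim (1xx) or final.
function responseStart2017(fetchRequests: TimedRequest[]): number | undefined {
  const last = fetchRequests[fetchRequests.length - 1];
  return last?.responses[0]?.firstByteAt;
}
```

For a redirect followed by a request that receives a 103 and then a 200, this returns the arrival time of the 103, not the 200.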

So indeed, the Resource Timing spec has technically recognised 103 Early Hints for responseStart since at least 2017, very explicitly. Still, I do think there is a potential web compat argument here. I, for one, would not until now have attributed much significance to "frame header bytes". I'm vaguely aware that these frames exist, and aware of how 10x is sent by the server. Yet, I simply didn't make the connection before.

Do we know of cases where developers specifically expected responseStart to become the value of the Early Hints if their application (or, unbeknownst to them, their web host or CDN) starts sending one?

@tunetheweb
Member

Do we know of cases where developers specifically expected responseStart to become the value of the Early Hints if their application (or, unbeknownst to them, their web host or CDN) starts sending one?

It's not so much about what they expect that value to be, as about how we can ensure it's supported consistently across browsers.

When Early Hints came on the scene that meant that responseStart changed. That's the original sin. It didn't introduce a compat issue so much, as all browsers handled this the same way (even browsers that didn't support Early Hints IIRC), presumably because all were following the spec in the same way. So at that point you can argue developers' expectations of what responseStart meant were inconsistent with what it actually meant, but at least it was consistently inconsistent (that's a mouthful!) across implementations.

Could we have set up Early Hints better and changed all the specs and implementation in advance? I think the evidence with firstInterimResponseStart shows this could not have happened.

The "fix" was to introduce firstInterimResponseStart to allow both to be measured, and return responseStart back to "developer expectations". My argument is that in hindsight that was the wrong solution, despite it doing its best for what you wanted. Because at that point we had a break between browsers that supported firstInterimResponseStart (where responseStart did not include Early Hints) and those that didn't (where responseStart did include Early Hints). This is not just a compat issue of getting all browsers to support that; it includes older browsers too. And without a time machine we can't fix that compat issue.

The only way to fix this, that I can see, is to revert responseStart to the consistent implementation, remove firstInterimResponseStart, and then implement finalResponseHeadersStart. That still leaves the period that Chrome had changed responseStart but that can be worked around by taking max(firstInterimResponseStart, responseStart) in RUM tools going forward.

The alternative fix is, as you suggest, to try to ensure all browsers roll out firstInterimResponseStart. Given it's been over a year since that happened for Chrome, I think realistically we can look at at least another year of that being the case. And that won't fix older browsers in the wild, so all data collected from those forever more will have a wrong responseStart. And while browsers are upgraded more frequently, there are still plenty of older browsers hanging around. So even if you think that's the right way to do this, it's not an implementable solution IMHO unless we want to measure this solution in decades rather than years.

So while we can argue about what people think responseStart should represent (I think there are arguments for both definitions, and yes, I have heard developers argue for both definitions!!), if it can't be implemented in a non-breaking, backwards-compatible way, for all browsers going forward, then those arguments are not that useful.

By having both definitions, RUM tooling can show what they want, including having @adardesign's graph display either responseStart or finalResponseHeadersStart (for those browsers that implement it). It's a matter of labelling (and the labelling already looks off btw, since it says "CLS" but anyway). If we want to change the TTFB definition (which has no formal definition btw, as per my presentation), then that can be looked at, but IMHO only after the majority of browsers in use have implemented firstInterimResponseStart or finalResponseHeadersStart.

@noamr
Contributor Author

noamr commented Oct 12, 2024

@Krinkle re the spec, we specifically changed it to account for firstInterimResponseStart alongside the implementation change. It's here. Before that change, responseStart was measured at the first non-100 non-3xx response, which would be e.g. 103.
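
To make the difference between the two definitions concrete, here is a toy model (not the spec algorithm). It assumes a single request with no redirects, and represents each response only by its status code and the time its first byte arrived. The type and function names are made up for illustration.

```typescript
// Toy model: one request, several responses, each with an arrival time.
interface ResponseEvent {
  status: number;
  firstByteTime: number; // milliseconds, illustrative values only
}

// Older behaviour as described above: the first response that is neither
// 100 nor a 3xx redirect, so a 103 Early Hints response qualifies.
function legacyResponseStart(responses: ResponseEvent[]): number | undefined {
  const hit = responses.find(
    (r) => r.status !== 100 && !(r.status >= 300 && r.status < 400)
  );
  return hit?.firstByteTime;
}

// Final response in the RFC 9110 sense: the first response that is not 1xx.
function finalResponseStart(responses: ResponseEvent[]): number | undefined {
  const hit = responses.find((r) => r.status >= 200);
  return hit?.firstByteTime;
}

// Hypothetical timeline: Early Hints arrive well before the real headers.
const withEarlyHints: ResponseEvent[] = [
  { status: 103, firstByteTime: 120 },
  { status: 200, firstByteTime: 480 },
];

// Without Early Hints the two definitions agree.
const withoutEarlyHints: ResponseEvent[] = [
  { status: 200, firstByteTime: 480 },
];
```

In this model the two functions only disagree once a 103 is sent, which is the shift in responseStart being discussed.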

@Krinkle
Member

Krinkle commented Oct 12, 2024

@tunetheweb Thanks for explaining the cross-browser compat issue in more detail. I see now that preferring the dev expectation I described does not actually get us an ideal outcome for sites that already adopted Early Hints. Those sites will have had their data "change" in the past already, consistently across browsers, and have had to deal with that one way or the other.

There is an element here of potentially over-optimising for early adopters, though. We should think about future audiences as well. In a few more years, the duration from "adopt use of Early Hints (break responseStart)" to "fixed responseStart and implemented firstInterimResponseStart in baseline browser usage" will be very small in retrospect. For myself, that duration will be non-existent, since I haven't adopted it yet.

Thinking about when I might adopt it in the future, I'd value continuity of a long-standing metric like responseStart in my time series, going back 5+ years. Even if we shipped finalResponseHeadersStart today in baseline browsers, and I picked it up in metric collection ASAP, it'd have no history. Probably what I'd do, in my metric collection script, is ingest "responseStart": max(finalResponseHeadersStart || 0, responseStart) in order to maintain TTFB continuity. If I do that, however, I will of course run the risk of the browser compat issue you described. If by the time we ship Early Hints, we still have users on browsers that implement the old responseStart definition and lack finalResponseHeadersStart, it'll poison the dataset.

This could be mitigated, by updating our feature test for metric collection, to require support for finalResponseHeadersStart based on property existence, and ignore pageviews from browsers without it.
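
Roughly, the collection logic described above could look like the sketch below. finalResponseHeadersStart is only the name proposed in this issue, so treat the property as hypothetical; `TimingLike`, `supportsFinalHeaders` and `ttfbForCollection` are made-up names, and the function takes a plain object so it can be exercised outside a browser.

```typescript
// Subset of PerformanceNavigationTiming used here. finalResponseHeadersStart
// is the name proposed in this issue, so it is optional and may be absent.
interface TimingLike {
  responseStart: number;
  finalResponseHeadersStart?: number;
}

// Feature test by property existence.
function supportsFinalHeaders(entry: TimingLike): boolean {
  return "finalResponseHeadersStart" in entry;
}

// Returns the value to ingest as "responseStart", or null when the pageview
// should be ignored because the browser lacks the new property.
function ttfbForCollection(entry: TimingLike): number | null {
  if (!supportsFinalHeaders(entry)) return null;
  return Math.max(entry.finalResponseHeadersStart || 0, entry.responseStart);
}
```

In a browser, the entry would presumably come from `performance.getEntriesByType("navigation")[0]`. Dropping pageviews that return null trades sample size for a dataset that is not mixed across the two definitions.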

@tunetheweb
Member

As I say there are arguments from both sides as to what the “response start” or “first bytes” are. You clearly have one interpretation in mind (and it’s a perfectly valid interpretation and one that others share too!), but there are also arguments that the “first bytes” are the first opportunity for the browser to start work on loading the page. And Early Hints give you that.

Personally I’m less concerned about which name is used to represent which point, than of 1) reaching compat as soon as possible and for it to be consistently applied for all implementations (including older ones) and 2) being able to measure both. If we have two time points, no matter what they are called, then implementations that implement both can be measured as such. If we have only one point (because some implementations have not implemented the second point yet) then IMHO that should be consistent (and so consistent with the older spec). The current situation is… not great IMHO.

And it’s not just measuring this in browsers. Lots of other measurements of this were built on the original interpretation of the spec. For example, we in Chrome didn’t update CrUX when we implemented firstInterimResponseStart, a miss in hindsight. I’ve no idea about other tooling, or platforms (e.g. node) or programming languages, but I can see how changing the specced definition makes it more likely for this to happen somewhere. As we’ve seen. Unfortunately responseStart and TTFB are common in lots of places.

However I’m delighted to hear your input and look forward to hearing others too. That’s the entire point of asking for feedback in this issue. If lots of other people prefer to continue down the firstInterimResponseStart path (and in particular if other browser vendors agree to this, though in the discussion at TPAC the preference from others was to go the finalResponseHeadersStart route instead), then that’s less change for the Chrome team that I’m part of. But my fear is this is not a short term compat thing but a longer term one that’ll drag on for a while…

@Krinkle
Member

Krinkle commented Oct 12, 2024

@tunetheweb Yeah, I think you make a compelling case. We can't change what older browsers do, so anyone who has adopted it already or is going to adopt it in the next year or two, will, no matter what we agree on here, have to deal with the fact that responseStart shifts to match the Early Hints response. That decision was in the past and we can't change the old implementations.

The easiest way to prevent poisoning or mixing up data is to keep it consistent (apart from the most recent Chrome change). The flipside of this is that, usually, when new features are adopted, if you want telemetry on them, you have an incentive at that point in time to add to or update your metric collection code. Whereas here, you need to add or update it in order to keep even what you thought you already had. That's going to confuse folks a bit, and will be a breaking change to various metrics, alerts, calculations, visualisations and assumptions built on top of Navigation Timing. The only good news is that (apart from a generic web host or CDN doing it for you) you're likely in charge of adopting this feature on your sites. So it should be fairly easy to explain the change in relation to what you recently did, and then decide whether you like that, or whether you need to start collecting finalResponseHeadersStart and potentially use that as your TTFB metric going forward.

I value data being easy to explain and reason about, and you've convinced me that keeping responseStart unchanged and adding something like finalResponseHeadersStart, is the lesser evil.
