Instrument FederationStateIdsServlet
- /state_ids
#13499
Changes from 3 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Instrument `FederationStateIdsServlet` (`/state_ids`) for understandable traces in Jaeger. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,7 @@ | |
make_deferred_yieldable, | ||
run_in_background, | ||
) | ||
from synapse.logging.opentracing import start_active_span | ||
from synapse.util import Clock | ||
|
||
if typing.TYPE_CHECKING: | ||
|
@@ -110,6 +111,9 @@ def ratelimit(self) -> "Iterator[defer.Deferred[None]]": | |
def _on_enter(self, request_id: object) -> "defer.Deferred[None]": | ||
time_now = self.clock.time_msec() | ||
|
||
wait_span_scope = start_active_span("ratelimit wait") | ||
wait_span_scope.__enter__() | ||
|
||
# remove any entries from request_times which aren't within the window | ||
self.request_times[:] = [ | ||
r for r in self.request_times if time_now - r < self.window_size | ||
|
@@ -162,6 +166,7 @@ def on_wait_finished(_: Any) -> "defer.Deferred[None]": | |
|
||
def on_start(r: object) -> object: | ||
logger.debug("Ratelimit [%s]: Processing req", id(request_id)) | ||
wait_span_scope.__exit__(None, None, None) | ||
I'm a bit concerned that this approach could lead to us forgetting to exit the scope in some situations (particularly: on cancellation, or other exceptions such as the place where we
One approach we've taken elsewhere is to make the span cover the whole of the limited operation (i.e., the wait and the operation itself), and just emit a tracing event when the limiter completes (or have a nested span). That means you can just do a
Failing that, I'd suggest doing this in
We already have a span that covers the wait and the operation itself. I'm interested in exposing how long the wait is though.
Moved to this 👍
Doesn't emitting a tracing event do that though?
ok then :) |
||
self.current_processing.add(request_id) | ||
return r | ||
|
||
|
This can make a big difference in understanding where all of the time went. In the before, there is just a giant gap with no indication of what's taking up the time. In the after, you can see the `ratelimit wait` span 🎉
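For reference, a minimal sketch of the concern raised in the review thread: a scope opened with a manual `__enter__()` is only closed if the matching `__exit__()` call is actually reached, whereas a `with` block closes it on success and on exceptions alike. `fake_span`, `wait_for_slot`, and `events` are hypothetical stand-ins for illustration, not Synapse's `start_active_span` or the ratelimiter's real code, and the example is synchronous rather than Deferred-based.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List

# Hypothetical event log standing in for a tracing backend.
events: List[str] = []


@contextmanager
def fake_span(name: str) -> Iterator[None]:
    """Hypothetical stand-in for a tracing scope (not Synapse's API)."""
    events.append(f"enter {name}")
    try:
        yield
    finally:
        # Runs whether the body finished normally or raised.
        events.append(f"exit {name}")


def wait_for_slot(wait: Callable[[], None]) -> None:
    """Wrap only the wait in the span, so the span is always closed."""
    with fake_span("ratelimit wait"):
        wait()


def ok_wait() -> None:
    """A wait that completes normally."""


def failing_wait() -> None:
    """A wait that is interrupted, e.g. by a cancellation-style error."""
    raise RuntimeError("wait interrupted")


wait_for_slot(ok_wait)
try:
    wait_for_slot(failing_wait)
except RuntimeError:
    pass  # The span was still exited before the error propagated.
```

With this shape, every `enter` in `events` has a matching `exit`, including for the interrupted wait. In Deferred-based code the same guarantee would need the exit to be attached to both the success and failure paths.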