This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Draft: /messages investigation scratch pad1 #13440
Closed
MadLittleMods wants to merge 25 commits into madlittlemods/11850-migrate-to-opentelemetry from madlittlemods/13356-messages-investigation-scratch-v1
…lemods/13356-messages-investigation-scratch-v1 Conflicts: synapse/api/auth.py
MadLittleMods added the T-Task label (Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.) Aug 3, 2022
MadLittleMods commented Aug 3, 2022
Comment on lines +201 to +203
# It does not seem like the agent can keep up with the massive UDP load
# (1065 spans in one trace) so lets just use the HTTP collector endpoint
# instead which seems to work.
I wonder why this is the case? I was seeing this same behavior with the Jaeger opentracing stuff. Is the UDP connection being oversaturated? Can the Jaeger agent in Docker not keep up? We see some spans come over, but never the main overarching servlet span, which is probably the last to be exported.
But using the HTTP Jaeger collector endpoint seems to work fine for getting the whole trace.
…ittlemods/13356-messages-investigation-scratch-v1
4 tasks
MadLittleMods commented Aug 8, 2022
…ittlemods/13356-messages-investigation-scratch-v1 Conflicts: pyproject.toml synapse/logging/tracing.py
This was referenced Aug 10, 2022
MadLittleMods added a commit that referenced this pull request Aug 16, 2022
…ittlemods/13356-messages-investigation-scratch-v1 Conflicts: synapse/federation/federation_client.py synapse/handlers/federation.py synapse/handlers/federation_event.py synapse/logging/tracing.py synapse/storage/controllers/persist_events.py synapse/storage/controllers/state.py synapse/storage/databases/main/events_worker.py synapse/util/ratelimitutils.py
…ittlemods/13356-messages-investigation-scratch-v1 Conflicts: poetry.lock synapse/handlers/federation.py
…ittlemods/13356-messages-investigation-scratch-v1
@MadLittleMods Is this useful or have you gleaned everything you can from it?
…ittlemods/13356-messages-investigation-scratch-v1 Conflicts: synapse/handlers/federation.py synapse/handlers/relations.py
MadLittleMods added the A-Messages-Endpoint label (/messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill)) Apr 25, 2023
17 tasks
Part of #13356
Combine:
`/messages` `@trace` decorations, Instrument `/messages` for understandable traces in Jaeger #13368
So that I can run against Complement federation tests and see if there is more to add `@trace` to in the federation stack of things when `/messages` happens.
Optimization ideas
We load a lot of state (from 2. in #13356)
In `#matrixhq` there are 40k current members and I assume `get_current_state` is the root cause why we `Loaded 79277 events` (seems like that took 17s too). We only call `get_current_state` in order to get a list of likely domains to backfill from.
We could optimize this by:
`get_domains_from_state` so we don't have to `get_current_state` as much
`get_domains_from_state` in the background so it's ready by the time we fail with the first couple of domains.
Skip backfill
Skip backfill or kick it off in the background if it's not our first time and we have enough events.
We don't want to get stuck on the same unfetchable event over and over.
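A rough sketch of the two ideas above, for illustration only. It reads the first optimization as caching the domain list (the leading verb is missing in the text, so that reading is an assumption), and it uses `asyncio` rather than Synapse's Twisted machinery purely to stay self-contained. `DomainCache`, `load_domains`, and `should_block_on_backfill` are hypothetical names, not Synapse APIs.

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict, List

# Hypothetical stand-in for the expensive "load current state, derive domains" step.
LoadDomains = Callable[[str], Coroutine[Any, Any, List[str]]]


class DomainCache:
    """Caches the per-room list of candidate backfill domains."""

    def __init__(self, load_domains: LoadDomains) -> None:
        self._load_domains = load_domains
        self._tasks: Dict[str, "asyncio.Task[List[str]]"] = {}

    def prefetch(self, room_id: str) -> None:
        """Start computing the domain list in the background, at most once per room."""
        if room_id not in self._tasks:
            self._tasks[room_id] = asyncio.create_task(self._load_domains(room_id))

    async def get(self, room_id: str) -> List[str]:
        """Return the cached list, waiting on the background task if it is still running."""
        self.prefetch(room_id)
        try:
            return await self._tasks[room_id]
        except Exception:
            # Do not cache failures; the next caller retries the load.
            self._tasks.pop(room_id, None)
            raise

    def invalidate(self, room_id: str) -> None:
        """Drop the cached list, for example when room membership changes."""
        task = self._tasks.pop(room_id, None)
        if task is not None and not task.done():
            task.cancel()


def should_block_on_backfill(
    local_event_count: int, limit: int, first_attempt: bool
) -> bool:
    """Wait for backfill only on the first attempt or when the page cannot be filled locally."""
    return first_attempt or local_event_count < limit
```

When `should_block_on_backfill(...)` returns `False`, the caller would either skip backfill or start it as a background task and answer from local events, which is the behaviour described under "Skip backfill". How and when to invalidate the cache is left open here.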
Why is `/state_ids` slow to respond?
We can't control every bad network effect but maybe Synapse is slow to assemble a `/state_ids` response 🤔 Need to investigate `FederationStateIdsServlet`
(`FederationStateIdsServlet` - `/state_ids` #13499)
We should only care about `auth_event_ids`
We should only care about getting the `event_id` and `auth_event_ids` in `_get_state_ids_after_missing_prev_event(...)`
We shouldn't factor `state_event_ids` into whether
Dev notes
Jaeger max duration spans `213503982d 8h`, see #13440 (comment)
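A side observation on the `213503982d 8h` figure in the dev notes: it matches 2^64 microseconds expressed in whole days and hours. That would be consistent with a span duration wrapping around as an unsigned 64-bit microsecond value, but that cause is an inference from the arithmetic, not something confirmed in this PR. The check below uses a hypothetical helper, `days_and_hours`:

```python
US_PER_HOUR = 3_600 * 1_000_000
US_PER_DAY = 24 * US_PER_HOUR


def days_and_hours(duration_us: int) -> str:
    """Format a microsecond duration as whole days and whole hours."""
    days, remainder_us = divmod(duration_us, US_PER_DAY)
    hours = remainder_us // US_PER_HOUR
    return f"{days}d {hours}h"


print(days_and_hours(2**64))  # 213503982d 8h
```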