Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

many duplicate _get_state_groups_from_groups queries leading to OOMs #10301

Open
matrixbot opened this issue Dec 18, 2023 · 0 comments
Open

many duplicate _get_state_groups_from_groups queries leading to OOMs #10301

matrixbot opened this issue Dec 18, 2023 · 0 comments

Comments

@matrixbot
Copy link
Collaborator

matrixbot commented Dec 18, 2023

This issue has been migrated from #10301.


We had a federation sender instance which died. On restart, it rapidly consumed all available ram and OOMed again.

Inspection from postgres side shows it is doing many duplicate queries of the form

WITH RECURSIVE state(state_group) AS ( VALUES(3405820::bigint) UNION ALL SELECT prev_state_group FROM state_group_edges e, state s WHERE s.state_group = e.state_group ) SELECT DISTINCT ON (type, state_key) type, state_key, event_id FROM state_groups_state WHERE state_group IN ( SELECT state_group FROM state ) ORDER BY type, state_key, state_group DESC

This query is _get_state_groups_from_groups, which is called from _get_state_for_groups. Although the latter has a cache, if many threads hit it at the same time, all will find the cache empty and go on to hit the database. I think we need a zero-timeout ResponseCache on _get_state_groups_from_groups.

@matrixbot matrixbot changed the title Dummy issue many duplicate _get_state_groups_from_groups queries leading to OOMs Dec 21, 2023
@matrixbot matrixbot reopened this Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant