Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

state_groups (& state_groups_state & state_group_edges) are not fully purged alongside the rooms #12821

Open
ShadowJonathan opened this issue May 20, 2022 · 21 comments
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Occasional Affects or can be seen by some users regularly or most users rarely roadmap S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@ShadowJonathan
Copy link
Contributor

While investigating my database size, I found that state_groups_state and state_groups hold many stale entries that reference old rooms that have long since been removed from rooms and all related events tables.

It appears these are not purged alongside other tables, particularly via the DELETE _synapse/admin/v1/rooms/{room_id} API (with purge: true)

Version information

  • Version: Synapse 1.57.1

  • Install method: Docker image

  • Platform: Linux (Debian)

@reivilibre reivilibre changed the title state_groups (& _state) does not appear to be purged alongside the rooms state_groups (& state_groups_state) does not appear to be purged alongside the rooms May 20, 2022
@ShadowJonathan
Copy link
Contributor Author

ShadowJonathan commented May 20, 2022

Context: state_groups_state appears to be my biggest table in the database at this point, and I was really questioning why this is the case;

image

Related question: Can I just delete these rows (that reference rooms that my homeserver isnt a part of, and have been purged long ago) with no effect? Am I supposed to?

@reivilibre
Copy link
Contributor

Andrew noticed that there's a note about this in the code...

# "state_groups", # Current impl leaves orphaned state groups around.

@ShadowJonathan
Copy link
Contributor Author

After clearing the tables with non-referenced room names, it seems to have cleared up 4GB of disk space (from 44GB to 40GB, that specific table 12GB -> 8GB)

@clokep
Copy link
Member

clokep commented May 20, 2022

We do seem to delete it now though:

# ... and the state groups
logger.info("[purge] removing %s from state_groups", room_id)
self.db_pool.simple_delete_many_txn(
txn,
table="state_groups",
column="id",
values=state_groups_to_delete,
keyvalues={},
)

I wonder if the test would now pass if you check if those are gone?

(Maybe we need a background update or something to clean-up?)

@richvdh
Copy link
Member

richvdh commented May 23, 2022

possibly related to #3364 (comment)

@DMRobertson DMRobertson added the T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. label May 27, 2022
@squahtx
Copy link
Contributor

squahtx commented May 30, 2022

We do seem to delete it now though:

# ... and the state groups
logger.info("[purge] removing %s from state_groups", room_id)
self.db_pool.simple_delete_many_txn(
txn,
table="state_groups",
column="id",
values=state_groups_to_delete,
keyvalues={},
)

I wonder if the test would now pass if you check if those are gone?

(Maybe we need a background update or something to clean-up?)

I re-ran the tests with the following line uncommented:

# "state_groups", # Current impl leaves orphaned state groups around.

and a bunch of them failed with twisted.trial.unittest.FailTest: 1 != 0 : Rows not purged in state_groups:

  DeleteRoomTestCase
    ...
    test_purge_room_and_block ...                                        [FAIL]
    test_purge_room_and_not_block ...                                    [FAIL]
    ...
    test_shutdown_room_block_peek ...                                    [FAIL]
    test_shutdown_room_consent ...                                       [FAIL]
  DeleteRoomV2TestCase
    ...
    test_purge_room_and_block ...                                        [FAIL]
    test_purge_room_and_not_block ...                                    [FAIL]
    ...
    test_shutdown_room_block_peek ...                                    [FAIL]
    test_shutdown_room_consent ...                                       [FAIL]

So it looks like we're still missing something.

@MadLittleMods MadLittleMods added the A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db label Jun 3, 2022
@H-Shay
Copy link
Contributor

H-Shay commented Aug 12, 2022

It appears that deleting and purging a room can create an orphaned state group when a DELETE _synapse/admin/v1/rooms/{room_id} request arrives for a room where the dag is in a forked condition. This was discovered when Sean and I were debugging why some tests in the DeleteRoomTestCase class were failing on PR I had opened to batch up some of the events in room creation. All of the failing tests were due to an entry for the room being left in the table state_groups_state, which was causing the is_purged assertion in the test to fail. The following scenario illustrates this, and I used actual results from the test purge_room_and_block, setting breakpoints and pulling events/dumping tables to get examples.

Consider the following: a room is created. Three events are sent into the
room, one after another, each one referencing the previous one as it's previous event:

graph TD;
      A[m.room.create] --> B[m.room.member]
      B --> C[m.room.power_levels]
Loading

After this, several events are created and then batched up to be persisted. In the current scenario, these are
a join_rules event and a history_visibility event. These both point to the power_levels event as their
previous event. The room creation code has finished, and now the dag looks like this:

graph TD;
      A[m.room.create] --> B[m.room.member]
    B --> C[m.room.power_levels]
    C -->D[m.room.join_rules]
    C -->E[m.room_history_visibility]
Loading

At this point, the state_groups_state table looks like this:

[(2,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.create',
  '',
  '$212ssXuFh__z84_WwehiG_-HlgZxNCe95MDQ3le3VCs'),
 (3,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.member',
  '@user:test',
  '$o3Lw3gE42sh76kgHN1PNO_782N07D1dBFKEscBWBq3k'),
 (4,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.power_levels',
  '',
  '$ts5UVqqB33pUiYXTvolsNTaA657en2azS4yeLNC4nh0'),
 (5,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.join_rules',
  '',
  '$cGox6YgFZJSxz_je8VkMa6YDHEVhLV8h4zse7lCLaAs'),
 (6,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.history_visibility',
  '',
  '$hhcgaNTJ8zovP17DQESlBx8gurLt_hZp9XekhWawHAs')]

Each event has a separate state group, which makes sense as each event is a state event. Next the delete room code path begins. As part of the shutdown room code, a leave event is created, with two previous events, the current forward
extremities. The dag now looks like this:

graph TD;
   A[m.room.create] --> B[m.room.member]
    B --> C[m.room.power_levels]
    C -->D[m.room.join_rules]
    C -->E[m.room_history_visibility]
    E -->F[m.room.member]
    D -->F[m.room.member]
Loading

And the state_groups_state table looks like this:

[(2,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.create',
  '',
  '$212ssXuFh__z84_WwehiG_-HlgZxNCe95MDQ3le3VCs'),
 (3,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.member',
  '@user:test',
  '$o3Lw3gE42sh76kgHN1PNO_782N07D1dBFKEscBWBq3k'),
 (4,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.power_levels',
  '',
  '$ts5UVqqB33pUiYXTvolsNTaA657en2azS4yeLNC4nh0'),
 (5,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.join_rules',
  '',
  '$cGox6YgFZJSxz_je8VkMa6YDHEVhLV8h4zse7lCLaAs'),
 (6,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.history_visibility',
  '',
  '$hhcgaNTJ8zovP17DQESlBx8gurLt_hZp9XekhWawHAs'),
 (7,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.history_visibility',
  '',
  '$hhcgaNTJ8zovP17DQESlBx8gurLt_hZp9XekhWawHAs'),
 (8,
  '!DIbdgobrpTVoFAztzx:test',
  'm.room.member',
  '@user:test',
  '$lDvyNohshfEZwqE79ggqkc83sOVZWvdoLukoLDz_xMU')]

You can see that now the same history visibility group event is recorded in two different state groups.
Events have two state groups, one before the event and one after the event. When state events are in a linear chain, an event's "before" state group is the same as the previous event's "after" state group, but when the dag is forked, this is no longer the case. In the table above, state group 6 is the room history visibility event's "after" state group, and state group 7 is the leave event's "before" state group. Since the event_to_state_groups table only stores the "after" state groups, when the _purge_room_txn code fetches the state groups to be deleted, it leaves behind the second history_visibility state group.
The code in question, synapse/storage/database/main/purge_events lines 354-365,

# First, fetch all the state groups that should be deleted, before
        # we delete that information.
        txn.execute(
            """
                SELECT DISTINCT state_group FROM events
                INNER JOIN event_to_state_groups USING(event_id)
                WHERE events.room_id = ?
            """,
            (room_id,),
        )

        state_groups = [row[0] for row in txn]

results in state_groups resolving to [2, 3, 4, 5, 6, 8]. State group 7 is not selected to be deleted. This list is eventually returned and passed to purge_room_state on line 37 of synapse/storage/controllers/purge_events.py, but since state group 7 is not in the list, it is not deleted and thus becomes an orphan.

@rettichschnidi
Copy link
Contributor

After clearing the tables with non-referenced room names, it seems to have cleared up 4GB of disk space (from 44GB to 40GB, that specific table 12GB -> 8GB)

Any opinions on whether this is safe and not corrupting the database even more?

@richvdh
Copy link
Member

richvdh commented Aug 18, 2022

I think this is basically a duplicate of #3364 for the record.

Any opinions on whether this is safe and not corrupting the database even more?

provided you're sure that all other references to the room have been purged from the database, yes it should be safe to remove the unreferenced rows from state_groups and state_groups_state.

@tonkku107
Copy link

tonkku107 commented Aug 19, 2022

@richvdh the query you provide in that issue for checking if you are affected or not doesn't give me a count in the thousands (it's around 500).
I can see state groups for rooms which have been deleted with the admin api however, and one of them even had 530,636 rows in the state_groups_state table. (I compressed it without realizing it wasn't even meant to be there at first and it went down to 425,268)

@rettichschnidi
Copy link
Contributor

provided you're sure that all other references to the room have been purged from the database, yes it should be safe to remove the unreferenced rows from state_groups and state_groups_state.

How would I determine if all references have been purged?

Asking because there are many tables with a column named room_id:

$ sed -n '/CREATE TABLE/{:start /;/!{N;b start};/room_id/p}' synapse-schema.sql | grep "CREATE TABLE" | cut -c 21- | tr -d \( | grep -v -e state_groups_state -e state_groups
appservice_room_list
batch_events
blocked_rooms
current_state_delta_stream
current_state_events
destination_rooms
device_lists_changes_in_room
e2e_room_keys
event_auth
event_auth_chain_to_calculate
event_backward_extremities
event_edges
event_forward_extremities
event_json
event_labels
event_push_actions
event_push_summary
event_reports
event_search
event_txn_id
events
federation_inbound_events_staging
insertion_event_edges
insertion_event_extremities
insertion_events
local_current_membership
partial_state_events
partial_state_rooms
partial_state_rooms_servers
pusher_throttle
receipts_graph
receipts_linearized
room_account_data
room_aliases
room_depth
room_memberships
room_retention
room_stats_current
room_stats_earliest_token
room_stats_state
room_tags
room_tags_revisions
rooms
state_events
stream_ordering_to_exterm
user_directory
users_in_public_rooms
users_who_share_private_rooms

Would I need to check every single table?

@richvdh
Copy link
Member

richvdh commented Aug 30, 2022

@rettichschnidi check the events table. If there's nothing there for your room, you can remove the room from state_groups and state_groups_state.

@rettichschnidi
Copy link
Contributor

rettichschnidi commented Aug 30, 2022

@richvdh Thanks for the answer. Just did so, reduced the database from 75 to below 35 GB.

Follow up question: When deleting a room, not just state_groups and state_groups_state keep entries, but current_state_delta_stream and device_lists_changes_in_room too. Is it safe to delete those too?

@rettichschnidi
Copy link
Contributor

rettichschnidi commented Oct 29, 2022

So, two months later, still with retention enabled, the DB grew again to 76 GB. The room !YTvKGNlinIzlkMTVRl:matrix.org (Element Web/Desktop) had 184082350 (!) rows in state_groups_state, which seems an awful lot. Same goes for !xBRJadUtxNCiIKdwZT:matrix.org (Cybersecurity-General, 17632406 rows) and !HiMrQwoxwdnWgrTVly:matrix.org (FOSDEM, 13416609 rows). Are those rooms really THAT active/big?

In order to shrink my DB once gain, I did the following:

  1. Removed those rooms, e.g.:
curl 'http://localhost:8008/_synapse/admin/v1/rooms/!YTvKGNlinIzlkMTVRl:matrix.org' -X DELETE -H 'Accept: application/json' -H 'Referer: http://localhost:8080/' -H 'authorization: Bearer $TOKEN' --data '{ "purge": true, "message": "Sorry - kicking you out to clean up the database" }'
  1. Shut down Synapse and cleaned up the database like this:
DELETE FROM 
  state_groups_state 
WHERE 
  room_id IN (
    SELECT 
      DISTINCT(state_groups.room_id) AS room_id_gone 
    FROM 
      state_groups 
      LEFT JOIN events USING(room_id) 
    WHERE 
      events.room_id IS NULL 
    GROUP BY 
      room_id_gone
  );

I joined using state_groups instead of state_groups_state for performance reasons before cleaning it up as well:

DELETE FROM 
  state_groups 
WHERE 
  room_id IN (
    SELECT 
      DISTINCT(state_groups.room_id) AS room_id_gone 
    FROM 
      state_groups 
      LEFT JOIN events USING(room_id) 
    WHERE 
      events.room_id IS NULL 
    GROUP BY 
      room_id_gone
  );
  1. Reclaimed the space on the file system:
REINDEX (VERBOSE) DATABASE synapse; VACUUM FULL VERBOSE;

As a result of this, my DB shrunk to 16 GB (on the file system).

  1. Restarted Synapse

Open question: Some other tables also return small to big numbers, an I am wondering if it is save to delete the found entries?

synapse=# SELECT FROM device_lists_changes_in_room WHERE room_id IN (SELECT DISTINCT(device_lists_changes_in_room.room_id) AS room_id_gone FROM device_lists_changes_in_room LEFT JOIN events USING(room_id) WHERE events.room_id IS NULL GROUP BY room_id_gone);
--
(1230238 rows)

The other tables being:

  • current_state_delta_stream: 3405301 rows
  • e2e_room_keys: 71
  • event_reports: 1 row
  • receipts_graph: 2742 rows
  • receipts_linearized: 2742 rows
  • room_account_data: 4 rows
  • room_retention: 4 rows
  • room_tags_revisions: 61 rows

@Dan-Sun
Copy link

Dan-Sun commented Nov 1, 2022

@rettichschnidi
Are there any problems since your cleanup? We probably have the same problem and would also like to get rid of the deleted rooms. I'm just curious if there were any other problems after that?

@rettichschnidi
Copy link
Contributor

Are there any problems since your cleanup? We probably have the same problem and would also like to get rid of the deleted rooms. I'm just curious if there were any other problems after that?

I have not experienced any myself and also have not gotten any complains yet. Will update here if this changes.

@Dan-Sun
Copy link

Dan-Sun commented Nov 4, 2022

Are there any problems since your cleanup? We probably have the same problem and would also like to get rid of the deleted rooms. I'm just curious if there were any other problems after that?

I have not experienced any myself and also have not gotten any complains yet. Will update here if this changes.

Thank you for testing this. We did that too and so far so good. We even went from about 115 GB down to 13 GB.

@DMRobertson DMRobertson added S-Minor Blocks non-critical functionality, workarounds exist. O-Occasional Affects or can be seen by some users regularly or most users rarely labels Nov 10, 2022
@marvinwankersteen
Copy link

Very nice @rettichschnidi!
This reduced the database size from 10 to 4 GB.

@richvdh richvdh changed the title state_groups (& state_groups_state) does not appear to be purged alongside the rooms state_groups (& state_groups_state) are not fully purged alongside the rooms Dec 16, 2022
@richvdh richvdh changed the title state_groups (& state_groups_state) are not fully purged alongside the rooms state_groups (& state_groups_state & state_group_edges) are not fully purged alongside the rooms Dec 16, 2022
@boontifex
Copy link

There are also references of room_id in table current_state_delta_stream which held the reference stream_id to table cache_invalidation_stream_by_instance.

cache_invalidation_stream_by_instance has a column cache_func. Any ideas how this is been used and whats the correct way?

In addition to @rettichschnidi i made the following draft:

DELETE FROM
  cache_invalidation_stream_by_instance as inv
  INNER JOIN current_state_delta_stream as stream ON inv.stream_id = stream.stream_id
WHERE room_id IN (
    SELECT 
      DISTINCT(state_groups.room_id) AS room_id_gone 
    FROM 
      state_groups 
      LEFT JOIN events USING(room_id) 
    WHERE 
      events.room_id IS NULL 
    GROUP BY 
      room_id_gone
  )
AND cache_func = 'have_seen_event';

@richvdh
Copy link
Member

richvdh commented Feb 2, 2023

There are also references of room_id in table current_state_delta_stream which held the reference stream_id to table cache_invalidation_stream_by_instance.

I am far from convinced those columns are referring to the same "stream", though I haven't actually checked. Both of those tables can be periodically cleaned out anyway; see #5888 and #13456.

@boontifex
Copy link

@richvdh I didn't really find the comprehensive documentation how synapse stores data and why, but i've give some thoughts.

Events (joining, leaving, messages etc) are stored in the tables events and events_json.

IMO every entry in the following tables, that doesn't have a related event_id in events is orphaned and can be deleted. Or am I wrong?

  • cache_invalidation_stream_by_instance
  • current_state_delta_stream
  • device_lists_changes_in_room
  • state_groups
  • state_groups_state
  • cache_invalidation_stream_by_instance

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Occasional Affects or can be seen by some users regularly or most users rarely roadmap S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

No branches or pull requests