This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Delete messages from `device_inbox` table when deleting device #10969
:( I forgot that device IDs are not globally unique, so I think we need to ensure that these queries treat `(user_id, device_id)` as the unique key. Unfortunately, I think that complicates the deletion clause quite a bit.
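To illustrate what matching on the pair means for the deletion clause, here is a minimal sketch using Python's built-in `sqlite3`. The two-table schema is a simplified assumption for illustration, not the real Synapse schema, and `delete_orphaned_inbox_rows` is a hypothetical helper name. It is not the query from this PR.

```python
import sqlite3


def delete_orphaned_inbox_rows(conn):
    """Delete inbox rows whose (user_id, device_id) pair has no matching device."""
    cur = conn.execute(
        """
        DELETE FROM device_inbox
        WHERE NOT EXISTS (
            SELECT 1 FROM devices
            WHERE devices.user_id = device_inbox.user_id
              AND devices.device_id = device_inbox.device_id
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Simplified, assumed schema: only the columns needed for the example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE devices (user_id TEXT, device_id TEXT);
    CREATE TABLE device_inbox (
        user_id TEXT, device_id TEXT, stream_id INTEGER, message_json TEXT
    );
    INSERT INTO devices VALUES ('@alice:example.org', 'DEV1');
    -- Same device_id, different users: only Alice's device still exists.
    INSERT INTO device_inbox VALUES ('@alice:example.org', 'DEV1', 1, '{}');
    INSERT INTO device_inbox VALUES ('@bob:example.org', 'DEV1', 2, '{}');
    """
)

deleted = delete_orphaned_inbox_rows(conn)
remaining = conn.execute("SELECT user_id, device_id FROM device_inbox").fetchall()
```

A clause that compared only `device_id` would keep Bob's row here, because Alice owns a device with the same ID.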
I thought about the best query and ran some tests:
- `left join` and `distinct`: result has 515 rows. Same cost if I use `group by` instead of `distinct`.
- `left join`: result has 51353 rows.
- `left join` in sub query and `distinct`: result has 6 rows.
- `not in`: result has 51353 rows.
- `not in` and `distinct`: result has 515 rows.
- `not in` in sub query with `distinct` and `limit`.
- `not in` in sub query with `limit` and `distinct`: result has 6 rows.
IMO the last one is the best solution. The disadvantage is that cleaning up the database can take a long time, but each query has a small cost.
@clokep Do you have a better suggestion or another opinion?
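The trade-off of the limited variant (many cheap rounds instead of one expensive query) can be sketched as a loop. This is a rough illustration with Python's `sqlite3` and a simplified, assumed schema; the helper names are hypothetical. It uses `NOT EXISTS` rather than `NOT IN` so that the match is on the `(user_id, device_id)` pair, which means it is not the exact query tested above.

```python
import sqlite3


def delete_orphan_batch(conn, limit):
    """Delete inbox rows for at most `limit` orphaned (user_id, device_id) pairs."""
    pairs = conn.execute(
        """
        SELECT DISTINCT user_id, device_id FROM device_inbox
        WHERE NOT EXISTS (
            SELECT 1 FROM devices
            WHERE devices.user_id = device_inbox.user_id
              AND devices.device_id = device_inbox.device_id
        )
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.executemany(
        "DELETE FROM device_inbox WHERE user_id = ? AND device_id = ?", pairs
    )
    conn.commit()
    return len(pairs)


def cleanup(conn, limit=100):
    """Run small batches until no orphaned pairs remain; return the batch count."""
    batches = 0
    while delete_orphan_batch(conn, limit):
        batches += 1
    return batches


# Simplified, assumed schema and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE devices (user_id TEXT, device_id TEXT);
    CREATE TABLE device_inbox (user_id TEXT, device_id TEXT, stream_id INTEGER);
    INSERT INTO devices VALUES ('@alice:example.org', 'DEV1');
    INSERT INTO device_inbox VALUES ('@alice:example.org', 'DEV1', 1);
    INSERT INTO device_inbox VALUES ('@bob:example.org', 'DEV1', 2);
    INSERT INTO device_inbox VALUES ('@bob:example.org', 'DEV1', 3);
    INSERT INTO device_inbox VALUES ('@carol:example.org', 'DEV2', 4);
    INSERT INTO device_inbox VALUES ('@dave:example.org', 'DEV3', 5);
    """
)

batches = cleanup(conn, limit=2)
remaining = conn.execute("SELECT user_id, device_id FROM device_inbox").fetchall()
```

With three orphaned pairs and a limit of 2, the loop needs two deleting rounds; a smaller limit means more rounds but less work per round.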
`not exists`: result has 51353 rows.
This is the way it works for me. As you can see, it uses the already existing index, which speeds up the process. Otherwise it can easily run for days or weeks and block the database for a long time while writing the result.
The trick is to get the highest `stream_id`, generate small chunks of `stream_id` ranges (1000 is a good value), and execute the query as a background update. Hope this helps. I tested it manually with a bash script to find good values, but was unable to write it in Python, so I am very happy to see someone continuing this. I dropped the background-job-related parts from my pull request.
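A rough Python version of that chunking idea might look like the sketch below. It uses the built-in `sqlite3` module and a simplified, assumed schema, and it does not use Synapse's background update machinery; `stream_id_chunks` and `cleanup_in_chunks` are hypothetical names. The delete matches on `(user_id, device_id)` rather than `device_id` alone.

```python
import sqlite3


def stream_id_chunks(min_id, max_id, chunk_size=1000):
    """Yield inclusive (start, end) stream_id ranges covering min_id..max_id."""
    start = min_id
    while start <= max_id:
        end = min(start + chunk_size - 1, max_id)
        yield start, end
        start = end + 1


def cleanup_in_chunks(conn, chunk_size=1000):
    """Delete orphaned inbox rows one stream_id range at a time."""
    min_id, max_id = conn.execute(
        "SELECT MIN(stream_id), MAX(stream_id) FROM device_inbox"
    ).fetchone()
    if max_id is None:  # empty table, nothing to do
        return 0
    deleted = 0
    for start, end in stream_id_chunks(min_id, max_id, chunk_size):
        cur = conn.execute(
            """
            DELETE FROM device_inbox
            WHERE stream_id BETWEEN ? AND ?
              AND NOT EXISTS (
                SELECT 1 FROM devices
                WHERE devices.user_id = device_inbox.user_id
                  AND devices.device_id = device_inbox.device_id
              )
            """,
            (start, end),
        )
        conn.commit()  # commit per chunk so each transaction stays short
        deleted += cur.rowcount
    return deleted


# Simplified, assumed schema and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE devices (user_id TEXT, device_id TEXT);
    CREATE TABLE device_inbox (user_id TEXT, device_id TEXT, stream_id INTEGER);
    CREATE INDEX device_inbox_stream_id ON device_inbox (stream_id);
    INSERT INTO devices VALUES ('@alice:example.org', 'DEV1');
    INSERT INTO device_inbox VALUES ('@alice:example.org', 'DEV1', 1);
    INSERT INTO device_inbox VALUES ('@bob:example.org', 'DEV1', 2);
    INSERT INTO device_inbox VALUES ('@bob:example.org', 'DEV1', 1500);
    INSERT INTO device_inbox VALUES ('@carol:example.org', 'DEV2', 2500);
    """
)

chunks = list(stream_id_chunks(1, 2500, 1000))
deleted = cleanup_in_chunks(conn, chunk_size=1000)
remaining = conn.execute("SELECT user_id, stream_id FROM device_inbox").fetchall()
```

Committing after each range keeps every transaction small, which is the point of running the cleanup in chunks instead of as one long delete.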
The `device_id` is not unique. Multiple users can have the same `device_id`. Only the tuple of `device_id` and `user_id` is unique.
OK, my posted delete query is wrong.
Cost of the delete
I have no `stream_id` 230000.