Double post in fediverse -> AtProto #1063

Closed
TomCasavant opened this issue May 20, 2024 · 19 comments

@TomCasavant

https://bsky.app/profile/tom.tomkahe.com.ap.brid.gy/post/3ksumarmd4j52 top level post
https://bsky.app/profile/tom.tomkahe.com.ap.brid.gy/post/3ksuuv77l5pn2 reply
https://bsky.app/profile/tom.tomkahe.com.ap.brid.gy/post/3ksuuv6ivb2p2 duplicate reply (https://tomkahe.com/@tom/112469674016938064)

I think this happened to me a few weeks ago as well, but I can't find it. In both cases it was just in the replies and not a top-level post.

@snarfed
Owner

snarfed commented May 20, 2024

So odd! Sorry for the trouble. I got another report of this today too, #947 (comment), but that one was seemingly caused by editing the Mastodon reply, which doesn't look like it happened here. No clue what's going on yet.

@TomCasavant
Author

No worries, just figured it's one of those weird bugs you'd want examples of to diagnose

@TomCasavant
Author

Oh that's interesting, same thing popped up when I replied to your post. https://snarfed.org/2024-05-20_53092

image

@snarfed
Owner

snarfed commented May 20, 2024

Yes! I saw that too. Those are webmentions, but maybe still related. Ideally BF (Bridgy Fed) wouldn't send two, but they have the same source URL, and webmentions should be idempotent, so the fact that two separate comments got created is likely a bug in the WordPress webmention plugin.

@snarfed
Owner

snarfed commented May 20, 2024

Aha. One part of the root cause here is that we're getting the same activity delivered to us multiple times to different hosts. Eg @TomCasavant's instance delivered the reply activity in #1063 (comment) twice, once to web.brid.gy and once to bsky.brid.gy.

#685 would help here, but I'd also like to make sure that processing is idempotent all the way through regardless. It is for sending webmentions, but evidently not for creating records in ATProto repos: we're duplicating the same post with different tids (rkeys). Hrm. Ideally we should map input objects to tids deterministically.
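
As a rough illustration of what a deterministic mapping could look like (a sketch only, not Bridgy Fed's code): the timestamp part of the TID is taken from the object's published time and the 10-bit clock id from a hash of its id, so the same input always yields the same rkey. `deterministic_tid` and its parameters are hypothetical names, the published time in the example is made up, and the layout assumed is the usual TID one (53-bit microsecond timestamp, 10-bit clock id, 13-character base32-sortable encoding).

```python
import hashlib
from datetime import datetime, timedelta, timezone

# Alphabet used by the base32-sortable encoding of TIDs.
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def deterministic_tid(object_id: str, published: datetime) -> str:
    """Hypothetical helper: derive a TID-shaped rkey from an object's id and time.

    `published` must be timezone-aware. The same (id, time) pair always
    produces the same string.
    """
    # Integer microseconds since the Unix epoch, avoiding float rounding.
    micros = (published - EPOCH) // timedelta(microseconds=1)
    # 10-bit "clock id" taken from a hash of the id instead of a random value.
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    clock_id = int.from_bytes(digest[:2], "big") & 0x3FF
    value = ((micros & ((1 << 53) - 1)) << 10) | clock_id
    # Encode as 13 base32-sortable characters, most significant first.
    chars = []
    for _ in range(13):
        chars.append(B32_SORTABLE[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


# Assumed published time, for illustration only.
published = datetime(2024, 5, 20, 12, 0, 0, tzinfo=timezone.utc)
ap_id = "https://tomkahe.com/@tom/112469674016938064"
tid_a = deterministic_tid(ap_id, published)
tid_b = deterministic_tid(ap_id, published)  # second delivery, same rkey
```

With a scheme like this, a second delivery of the same activity would target an rkey that already exists rather than minting a new one. Two different objects published in the same microsecond would still collide about 1 time in 1024, so a real implementation would need to handle that case.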

@snarfed
Owner

snarfed commented May 24, 2024

I made an infrastructure change yesterday that should hopefully prevent these duplicates. Please let me know if you see any new ones!

I still want to map AP ids to ATP tids deterministically, so I'll leave this open to track that.

snarfed added a commit that referenced this issue May 29, 2024
@snarfed
Owner

snarfed commented May 31, 2024

@TomCasavant have you seen this happen any more in the last week? I may skip making ids => tids deterministic; the new idempotence may be enough.

@TomCasavant
Author

I haven't seen it happen again

@snarfed
Owner

snarfed commented May 31, 2024

Thanks!

@snarfed closed this as completed May 31, 2024
@mackuba

mackuba commented May 31, 2024

This still seems to be happening in some form (not sure if it's the same issue)...

Example:
Post: https://bsky.app/profile/did:plc:zqnkaserlfmoyfizucbtmpw4/post/3ktokhwljg4f2

If you query for did:plc:zqnkaserlfmoyfizucbtmpw4 and 3ktokhwljg4f2 in https://atproto.tools, you get two post creates and no updates or deletes: https://atproto.tools/records?did=did:plc:zqnkaserlfmoyfizucbtmpw4&collection=app.bsky.feed.post&rkey=3ktokhwljg4f2

Screen Shot 2024-06-01 at 00 55 53

They're 189 seqs apart, so that's something like 3-4 seconds, and they have the exact same JSON content. Also the original post at https://indieweb.social/@jaredwhite/112528096281782994 doesn't seem to have been edited, so it looks more like some kind of race condition thing.

A query in my database for duplicate rkeys from the Bridgy PDS shows several such cases from the last 2 days:

repo                              rkey           n
--------------------------------  -------------  -
did:plc:3zmgk6ltrvrobp7uuoechqnl  3ktrhwl7mcca2  2
did:plc:6oopxocfjkevkh3ktunkct3f  3ktqvrttltpa2  2
did:plc:aot633jq4fzdbejqflolsq4h  3ktrh27lk64a2  2
did:plc:e7swajiiqv5deci34fn2pon5  3ktrlfbexb2a2  2
did:plc:f4wptjuakccjndanesgcpbyy  3ktsn3cmguza2  2
did:plc:fw5hgb3x45xeazgtmphhuyx7  3ktoh3swd5bf2  2
did:plc:gg77efagxhons7wwxbjfeqrx  3ktqrt7yq2ta2  2
did:plc:gg77efagxhons7wwxbjfeqrx  3ktr4ycvkmxa2  2
did:plc:h5cyfgzmibjc7c5n5ph5czqr  3ktp7ikolnbw2  2
did:plc:haymkinjurgmo37dl4wmbtpo  3ktsjcrb4sda2  2
did:plc:hd4cgld5pxpy7kpsilxu62u6  3ktovf3qrdhw2  2
did:plc:hncngtu4hleej54nqvqy3zve  3ktrjxugvvba2  2
did:plc:ju6k7dob4f74h5rdymughwz6  3ktsn3nv7dpa2  2
did:plc:kir7hkjqs6k4zmnix6r5hbw4  3ktontigs7cw2  2
did:plc:lglhkumc3c7mecjxfdlzwyux  3ktsx6b4dced2  2
did:plc:lxf6nbzgcphkzhbjzdhz24wa  3ktqhx6yl42w2  2
did:plc:twerjlabpqmvieyx3roydios  3ktpjlfggohw2  2
did:plc:zfnr7z743mhdmnljqgmytllw  3ktooak533gw2  2
did:plc:zqnkaserlfmoyfizucbtmpw4  3ktokhwljg4f2  2
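
The schema and query behind that table aren't shown here. A minimal sketch of how such duplicates could be found, assuming a hypothetical `commits` table with one row per firehose event (`seq`, `repo`, `rkey`, `action`); the seq values and the second repo are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits (seq INTEGER PRIMARY KEY, repo TEXT, rkey TEXT, action TEXT)"
)

# Two creates for the same repo/rkey, 189 seqs apart, plus one normal record.
rows = [
    (1000, "did:plc:zqnkaserlfmoyfizucbtmpw4", "3ktokhwljg4f2", "create"),
    (1189, "did:plc:zqnkaserlfmoyfizucbtmpw4", "3ktokhwljg4f2", "create"),
    (1200, "did:plc:example", "3ktexample222", "create"),
]
conn.executemany(
    "INSERT INTO commits (seq, repo, rkey, action) VALUES (?, ?, ?, ?)", rows
)

# Any (repo, rkey) pair with more than one create event is a duplicate.
dupes = conn.execute(
    """
    SELECT repo, rkey, COUNT(*) AS n
    FROM commits
    WHERE action = 'create'
    GROUP BY repo, rkey
    HAVING COUNT(*) > 1
    ORDER BY repo, rkey
    """
).fetchall()

for repo, rkey, n in dupes:
    print(repo, rkey, n)
```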

@snarfed
Owner

snarfed commented Jun 1, 2024

@mackuba interesting, thanks! Yeah that's a different issue. This issue was for user-visible duplicates, ie the same fediverse post resulting in two different Bluesky posts with different tids.

Multiple create events for the same tid/record obviously aren't ideal either, you're right, but they're definitely lower priority.

@snarfed
Owner

snarfed commented Jun 30, 2024

Ugh. Thank you! Reopening. Seems pretty clear there's a race condition here somewhere that I need to track down.
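
The actual root cause isn't described in this thread. As a generic illustration of the kind of race that produces duplicates, and one way to close it: if two deliveries of the same activity both check "does a record for this id exist?" before either one writes, both create one. Doing the check and the write as a single atomic step avoids that. `Repo` and `create_once` are hypothetical names and the in-memory dicts stand in for a real datastore.

```python
import threading


class Repo:
    """Toy stand-in for a repo that maps source ids to record keys."""

    def __init__(self):
        self.records = {}    # rkey -> record
        self.by_source = {}  # source object id -> rkey
        self._lock = threading.Lock()
        self._counter = 0

    def create_once(self, source_id, record):
        # The lookup and the write happen under one lock, so a concurrent
        # delivery of the same source id cannot slip in between them.
        with self._lock:
            if source_id in self.by_source:
                return self.by_source[source_id]
            self._counter += 1
            rkey = f"rkey-{self._counter}"
            self.records[rkey] = record
            self.by_source[source_id] = rkey
            return rkey


repo = Repo()
source_id = "https://indieweb.social/@jaredwhite/112528096281782994"
results = []


def deliver():
    results.append(repo.create_once(source_id, {"text": "hello"}))


# Simulate the same activity arriving several times at once.
threads = [threading.Thread(target=deliver) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In a real datastore the lock would typically be a transaction or a unique constraint on the source id rather than an in-process mutex, since deliveries can land on different hosts.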

@TomCasavant
Author

If it helps I've got a few more examples of this

https://bsky.app/profile/tom.tomkahe.com.ap.brid.gy/post/3kwhulyzmmrz2 - the reply to this is duplicated and the reply to that reply is duplicated

https://bsky.app/profile/tom.tomkahe.com.ap.brid.gy/post/3kw6dpxgmzs72 and the reply to this one is duplicated

It seems to be happening more often, or at least every time I reply to my own status.

@TomCasavant
Author

TomCasavant commented Jul 4, 2024

@snarfed changed the title from "Double post in replies fediverse -> AtProto" to "Double post in fediverse -> AtProto" Jul 5, 2024
@snarfed
Owner

snarfed commented Jul 5, 2024

It does seem to be happening a bit more often, agreed. 😕 We can track all duplicates here though, I doubt the root cause is different for replies vs top level posts.

@snarfed
Owner

snarfed commented Jul 11, 2024

I'm looking at this now. Awkward embarrassing bug, definitely needs to be fixed ASAP.

snarfed added a commit that referenced this issue Jul 11, 2024
for #1063, should fix it. coming up with the test for this was fun!
@snarfed
Owner

snarfed commented Jul 11, 2024

I deployed ^ 89e2372 a bit ago, it should hopefully fix this.

@snarfed
Owner

snarfed commented Jul 11, 2024

Tentatively closing. Please reopen if you see any more dupes!

@snarfed closed this as completed Jul 11, 2024