-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/backfill duplicate handling #72
Conversation
…gistration events
8f51d4c
to
58d0fba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested and looks good for the most part. I am running into one issue while testing though. With these changes checked out, I see that all events will trigger exactly one duplicate before being ignore by the data backfill script.
For example, Here I have created 3 users (test4, test5, and test6@example.com), and each have triggered an initial reader_registered event:
When I run the backfill script the first time, each one triggers an additional reader_registered event:
It isn't until the second time I run this command that duplicates are actually ignored:
Is this expected?
Not expected, you've uncovered a shortcoming of this solution. It assumes the only metadata missing will be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ca136ae
to
ee7be33
Compare
This is a bigger issue than I thought! The code is comparing timestamps. The event will be saved a few milliseconds after the actual registration. Thus when looking for an event at time of the registration, it won't be found. Fixed in ee7be33 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @adekbadek!
This is working now so gonna give a thumbs up.
I did have one question/nit which is can we simply disregard/drop timestamp as well if the only relevant information we need to consider are email and date? Definitely not a blocker though and something we can revisit if necessary.
Good suggestion! For registrations that makes sense, because there should be just one per email. The timestamp-range solution should also work for other events, though, and for these disregarding the timestamp altogether is not desirable. There might be many of them, with only the timestamp as differentiator (e.g. donations). |
🎉 This PR is included in version 1.3.0-alpha.3 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
# [1.3.0](v1.2.0...v1.3.0) (2024-02-28) ### Bug Fixes * backfill duplicate handling; woo links in wp-admin-bar ([#71](#71), [#72](#72)) ([bbce13b](bbce13b)) * cli commands ([#73](#73)) ([ff563ac](ff563ac)) * memberships sync ([#63](#63)) ([0a54f1d](0a54f1d)) * missing woocommerce-memberships handling ([76dbdf7](76dbdf7)) ### Features * add hub bookmarks to nodes ([#56](#56)) ([54700a0](54700a0)) * add option to manually sync users ([#53](#53)) ([3ec1b19](3ec1b19)) * display membership plans from the network ([#59](#59)) ([6fa01fd](6fa01fd)) * enhance network subscriptions and orders search ([#55](#55)) ([bcb0615](bcb0615)) * Node connection using a link ([#58](#58)) ([721f8b2](721f8b2)) * **wp-cli:** add command to backfill events ([#51](#51)) ([13d2498](13d2498)) * **wp-cli:** add process pending webhook requests command ([#67](#67)) ([7dbd8dc](7dbd8dc)) * **wp-cli:** sync all events from the Hub ([#65](#65)) ([dc595ca](dc595ca))
🎉 This PR is included in version 1.3.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
🎉 This PR is included in version 1.4.0-alpha.1 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
🎉 This PR is included in version 1.5.0-alpha.1 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
🎉 This PR is included in version 1.5.0-epic-data-integrity-improvements.1 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
🎉 This PR is included in version 1.5.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
All Submissions:
Changes proposed in this Pull Request:
Fixes two issues:
reader_registration
events, disregardcurrent_page_url
metadata.By accident this branch also contains #71, but it won't hurt to add that to the
alpha
release.How to test the changes in this Pull Request:
alpha
,reader_registered
event by registering as a new reader on the sitemetadata.current_page_url
key in thedata
propertyreader_registered
events (e.g.wp newspack-network data-backfill reader_registered --start=2024-02-26 --end=2024-12-31 --verbose
)reader_registered
event added, even though areader_registered
for this reader is already presentmetadata.current_page_url
keyOther information: