Use inserted_at to determine which forms/cases need updating #159
@orangejenny ES lag should only be an issue if one Kafka partition is ahead of another. The DET uses the modification time of the last seen document as the filter value for the next batch, not the 'time of run'.
Either way I think this is a good change if we are OK with the reindexing issue.
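For readers following along, the batching behaviour described above can be sketched roughly as follows. This is not the DET's actual code: `fetch_batch`, `export_all`, the in-memory `DOCS` list, and the `server_modified_on` field name are all assumptions made for illustration. The point is only that the next batch's filter value comes from the last document seen, not from the wall-clock time of the run.

```python
from datetime import datetime

# Hypothetical documents that are already visible in the index.
DOCS = [
    {"id": "a", "server_modified_on": datetime(2021, 1, 1, 10, 0)},
    {"id": "b", "server_modified_on": datetime(2021, 1, 1, 10, 1)},
    {"id": "c", "server_modified_on": datetime(2021, 1, 1, 10, 2)},
]


def fetch_batch(since, limit):
    """Return up to `limit` docs modified strictly after `since`, oldest first."""
    matching = [
        d for d in DOCS
        if since is None or d["server_modified_on"] > since
    ]
    matching.sort(key=lambda d: d["server_modified_on"])
    return matching[:limit]


def export_all(limit=2):
    """Page through all docs; return (seen ids, final checkpoint value)."""
    since = None
    seen = []
    while True:
        batch = fetch_batch(since, limit)
        if not batch:
            return seen, since
        seen.extend(d["id"] for d in batch)
        # The filter for the next batch is the modification time of the
        # last document seen, not datetime.now().
        since = batch[-1]["server_modified_on"]
```

With the sample data, `export_all()` visits `a`, `b`, `c` in order and ends with a checkpoint equal to `c`'s modification time.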
Also FYI:
@snopoke How often does that happen? Do you think that makes it less plausible that this change will address the reported issue?
Looking at Datadog it seems to happen often enough that it could certainly be an issue: https://app.datadoghq.com/notebook/278293/change-lag-by-kafka-partition-case-form-pillows
@snopoke Thank you! There's one specific example from the client, which is a case that the client's sql db says was last modified
I think I linked the wrong PR in my previous comment. Here's the correct link: #124
Build still failing
e345ad8 to 1a3261b
still working out some test failures
4f02fa0 to 8d5ea26
8d5ea26 to 6dbad16
6dbad16 to 14baad6
This seems like a great idea to me. I can't imagine any scenario where users would want visibility into this weird internal behavior, so hiding it seems like better overall UX.
@czue @orangejenny this is ready for review. Changes as follows:
Wow, this looks great.
Caveats: I only reviewed code from 0f76ab9, and some of the changes to the tests made my head hurt a bit and I didn't fully take the time to understand them (but mostly made sense/looked good).
No need to change this - but I was wondering if we can safely flip from legacy to indexed_on in a regular checkpointed update. I think you would still sort/filter by the old value, but then theoretically you could use the indexed_on of the last doc to start the next round. There's probably an edge case in there that doesn't make it perfect (two docs, one has higher modified on and the other has higher indexed on that happen to fall right at the barrier?). You could also use an earlier indexed_on, but then you run the risk of doing extra work the next time around... 🤷‍♂️
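The edge case mentioned above can be sketched with a toy example. Everything here is hypothetical: `legacy_run`, `indexed_run`, `handover`, the `modified_on` field name, and the integer timestamps are illustration only, not the tool's real code or data. One round is filtered and sorted by the old modified value, then the next round starts from the `indexed_on` of the last doc in that batch.

```python
# Hypothetical docs; times are plain integers to keep the example small.
DOCS = [
    {"id": "x", "modified_on": 1, "indexed_on": 5},
    {"id": "y", "modified_on": 2, "indexed_on": 4},  # last doc of the legacy batch
    {"id": "z", "modified_on": 3, "indexed_on": 3},  # just past the barrier
]


def legacy_run(docs, modified_since, limit):
    """One legacy round: filter and sort by the old modified value."""
    matching = [d for d in docs if d["modified_on"] > modified_since]
    matching.sort(key=lambda d: d["modified_on"])
    return matching[:limit]


def indexed_run(docs, indexed_since):
    """One new-style round: filter and sort by indexed_on."""
    matching = [d for d in docs if d["indexed_on"] > indexed_since]
    matching.sort(key=lambda d: d["indexed_on"])
    return matching


def handover(limit):
    """Run one legacy round, then switch to indexed_on for the next."""
    first = legacy_run(DOCS, 0, limit)
    # Start the next round from the indexed_on of the last legacy doc.
    start = first[-1]["indexed_on"]
    rest = indexed_run(DOCS, start)
    return [d["id"] for d in first], [d["id"] for d in rest]
```

With `limit=2`, the legacy round returns `x` and `y`, and the follow-up round returns only `x` again: `z` has a higher modified value but a lower `indexed_on` than the barrier doc, so it is skipped, while `x` is exported twice.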
commcare_export/commcare_minilinq.py
Outdated
SUPPORTED_RESOURCES = {
    'form', 'case', 'user', 'location', 'application', 'web-user'
}
class FormFilterSinceParams(object):
(assume this was just restoring the old class)
yes
tests/test_cli.py
Outdated
@@ -421,7 +429,7 @@ class MockCheckpointingClient(CommCareHqClient):
    to return mocked data.

    Note this client needs to be re-initialized after use."""
    def __init__(self, mock_data):
    def __init__(self, mock_data):
was this intentional?
(I think it got fixed in a later commit also)
will fix
tests/test_cli.py
Outdated
try:
    objects = mock_requests.pop(key)
except KeyError:
    print(mock_requests.keys())
nit: if this is intentional could consider adding a comment explaining why it's being printed out. print statements usually trigger a warning in my brain.
not intentional, I'll remove it. Thanks
        self._check_checkpoint(checkpoint_manager, '2012-04-24T05:13:01', 'doc 2')

    def test_cli_pagination_since(self, writer, all_db_checkpoint_manager):
        """Test that we use to the new pagination mode when using 'since'"""
this confused me for a moment since it seems odd that you'd ever have a checkpoint and then manually override it. what's the use case for that?
not sure if there is a use case but it's not disallowed so just wanted to test it. I also realized now that if you pass in --since or --until then we don't do checkpointing at all. I'm going to update the test to reflect that.
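A minimal sketch of the rule described above, assuming a plain argparse parser. Only the `--since` and `--until` flag names come from this thread; `build_parser` and `uses_checkpoints` are hypothetical helpers, not the tool's real functions.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing only the two flags discussed here."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--since")
    parser.add_argument("--until")
    return parser


def uses_checkpoints(argv):
    """Checkpointing applies only when neither date bound is given."""
    args = build_parser().parse_args(argv)
    return args.since is None and args.until is None
```

Under these assumptions, `uses_checkpoints([])` is true, while passing either `--since` or `--until` turns checkpointing off.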
Oops, sorry meant to approve.
Very nice! Thanks for pushing this past the finish line.
This is what I was suggesting here: #159 (comment) but it's not straightforward to get the ID of the last doc from tables since they may not have a suitable column to sort on and the ID may also not be just the doc ID (in the case of exporting form repeats or cases from forms). We'd also then need to look up that doc in the API to get the
thanks for follow ups!
🎊
The link in the log warning message to the Wiki does not exist, where did you want it to go? It currently points to: "https://wiki.commcarehq.org/display/commcarepublic/CommCare+Export+Tool+Release+Notes"
The wiki and pypi all seem to refer back to GitHub Releases for release notes:
Oops, thanks for the heads up! Confirming the last link is the correct one. #177
https://dimagi-dev.atlassian.net/browse/USH-277
A client is reporting that occasionally properties aren't syncing correctly. @calellowitz helpfully came up with a theory that this is due to ES lag:
The client is running the DET frequently, every 5 minutes.
This PR changes the date filtering to use inserted_at, the time the pillow inserted the item, instead of the server modified time. I don't think this will cause the exported last modified date to change, since it only updates the filters for getting forms/cases and then the pagination.
This will cause all cases/forms to be resynced if the case or xform mappings change, although those mappings are quite stable. I'd guess it'll also cause resyncing when ES is upgraded, which might be a performance problem for us and/or for clients, although planning for that seems better than living with the existing bug.
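To illustrate the theory, here is a small simulation of how a lagging insert could be skipped when filtering on the modified time, and how filtering on inserted_at avoids that. It is a sketch under stated assumptions, not the DET's implementation: `visible_docs`, `run_export`, `simulate`, the `server_modified_on` field name, and the timestamps are all made up for the example.

```python
from datetime import datetime


def t(minute):
    """Hypothetical timestamp helper: 10:<minute> on an arbitrary day."""
    return datetime(2021, 1, 1, 10, minute)


INDEX = [
    # Modified first, but its partition lags, so it reaches ES late.
    {"id": "slow", "server_modified_on": t(0), "inserted_at": t(6)},
    # Modified later, but inserted into ES promptly.
    {"id": "fast", "server_modified_on": t(1), "inserted_at": t(2)},
]


def visible_docs(index, now):
    """Docs the pillow has already inserted by `now`."""
    return [d for d in index if d["inserted_at"] <= now]


def run_export(index, now, checkpoint, field):
    """Export visible docs with `field` after the checkpoint."""
    batch = [
        d for d in visible_docs(index, now)
        if checkpoint is None or d[field] > checkpoint
    ]
    batch.sort(key=lambda d: d[field])
    if batch:
        # Next run starts from the last doc seen in this run.
        checkpoint = batch[-1][field]
    return [d["id"] for d in batch], checkpoint


def simulate(field):
    """Run the export at 10:05 and 10:10, filtering on `field`."""
    exported, checkpoint = [], None
    for now in (t(5), t(10)):
        ids, checkpoint = run_export(INDEX, now, checkpoint, field)
        exported.extend(ids)
    return exported
```

In this toy setup, `simulate("server_modified_on")` exports only `fast`: by the time `slow` becomes visible, the checkpoint has already moved past its modified time. `simulate("inserted_at")` exports both documents.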