-
-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trigger Geopoint ES Index on Geospatial Feature Flag Enable #35126
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks mostly in good shape, have a few suggestions regarding task tracking and a few names that I have shared above. Thanks for adding tests for the changes.
@@ -68,8 +68,7 @@ def template_context(self): | |||
{'id': p.id, 'name': p.name, 'geo_json': p.geo_json} | |||
for p in GeoPolygon.objects.filter(domain=self.domain).all() | |||
], | |||
'es_indexing_message': celery_task_tracker.get_message(), | |||
'is_error_es_index_message': celery_task_tracker.is_error(), | |||
'task_status': celery_task_tracker.get_status(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good from my side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
corehq/apps/geospatial/templates/geospatial/case_management_base.html
Outdated
Show resolved
Hide resolved
corehq/apps/geospatial/management/commands/index_geolocation_case_properties.py
Outdated
Show resolved
Hide resolved
corehq/apps/geospatial/es.py
Outdated
@@ -136,16 +136,20 @@ def mid(lower, upper): | |||
return ceil(lower + (upper - lower) / 2) | |||
|
|||
|
|||
def case_query_for_missing_geopoint_val(domain, geo_case_property, case_type=None, size=None, offset=0): | |||
def case_query_for_missing_geopoint_val( | |||
domain, geo_case_property, case_type=None, size=None, offset=0, should_sort=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change here.
nit: In order to give better control to the calling function, you could consider allowing to pass the field to sort by.
So, this would be sort_by=None
and the calling function would pass the field they want to sort by.
I think that simplifies this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored this a bit to use a sort_by
field instead (9089861). The task and management command themselves still don't have control over this and we will still default to sorting by the opened_on
field, however the util function will add this param depending on the doc count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the follow up @zandre-eng
I see there is some complexity, specifically in the management command that is trickling in the index util i.e some customization of query size and chunk size that we are not really using. May be we could stick to default values in the command itself or the main function doing the processing so the defaults don't need to be passed around everywhere.
Its non blocking though.
@mkangia My goal was to allow for customizability in the management command with regards to the chunk size and query limit, however I do agree that it is becoming tedious and unnecessary to pass these defaults everywhere. I've refactored this a bit so that defaults are simply used more throughout (93f259a). |
query = case_query_for_missing_geopoint_val( | ||
domain, geo_case_property, case_type, size=query_limit, offset=offset, should_sort=should_sort | ||
domain, geo_case_property, case_type, size=DEFAULT_QUERY_LIMIT, offset=offset, should_sort=should_sort |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that size
was removed above in this commit for this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct, I'm simply using DEFAULT_QUERY_LIMIT
for size in the query. This was the default value being passed around by both the task and the management command, so I thought it makes more sense to simply just use this default instead of passing a param to case_query_for_missing_geopoint_val
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant to point that size was not a permitted kwarg anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it was fixed in a later commit, d2b15ce
corehq/apps/geospatial/management/commands/index_geolocation_case_properties.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. The progress display to the user is neat.
(Please Ignore if already planned or done) Since there were refactors , would be wiise to get a sanity testing of both the task and managament commands (for records >10k) before merge.
(Please Ignore if already planned or done) Since there were refactors , would be wiise to get a sanity testing of both the task and managament commands (for records >10k) before merge. @ajeety4 Thanks for taking a pass. I have done a sanity check on staging using 20k records and everything is still working as expected. Just made a minor change to ensure that the progress calculation never goes over 100% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Product Description
When the Geospatial feature flag is enabled, the domain will receive the following alert on all Geospatial pages which show the progress of indexing the domain's location properties:
![Screenshot from 2024-09-16 11-36-30](https://private-user-images.githubusercontent.com/122617251/367706767-e7c25745-8886-428a-abf3-d6c39847bf44.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3NDc2NzYsIm5iZiI6MTczOTc0NzM3NiwicGF0aCI6Ii8xMjI2MTcyNTEvMzY3NzA2NzY3LWU3YzI1NzQ1LTg4ODYtNDI4YS1hYmYzLWQ2YzM5ODQ3YmY0NC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE2JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNlQyMzA5MzZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT00MzA5YTNhNzAxZGE4OTJjNDQyNTFiODA4NWU0ZDFlMjA5YjY1ZTkwZGZiNDlmOGFlMjk2MGM3NTI2MzBjMTkwJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.NdP9F-NOtZ_5akBoHdmHJxHu6Sla_HPur2LA-HJwa4U)
If, however, the domain's case count is over the limit then the following error alert will be displayed instead:
![Screenshot from 2024-09-16 11-35-32](https://private-user-images.githubusercontent.com/122617251/367712314-6fe4aa63-3735-4b3e-b300-505588d76c84.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3NDc2NzYsIm5iZiI6MTczOTc0NzM3NiwicGF0aCI6Ii8xMjI2MTcyNTEvMzY3NzEyMzE0LTZmZTRhYTYzLTM3MzUtNGIzZS1iMzAwLTUwNTU4OGQ3NmM4NC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE2JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNlQyMzA5MzZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iMjZlNGZhYTlkZmNhODg3YjczYTYwZjZiYjJlZGYzZGVkMzM4NzkzOGU5ZGQ3N2VjMWIwMDYwNGU0NzhiODI3JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.zYx7U2CQlIW8eBKzVHYpLuiQZxa-Blibc-UOxnJlDBM)
Technical Summary
Link to ticket here.
Link to tech spec here.
A new Celery task has been added which will be automatically started when the
GEOSPATIAL
feature flag is enabled for a domain. This new task essentially runs through the same logic as theindex_geolocation_case_properties
management command, where ES docs for a domain have their location case properties indexed to add a geopoint property. This is done so that these ES docs can be used within the context of the Geospatial feature.The
CeleryTaskTracker
model has been expanded to support the ability of storing a message in Redis, as well as having an error status. These are used to correctly show either the task progress or applicable error message to the user on the front-end.Feature Flag
GEOSPATIAL
Safety Assurance
Safety story
This change will only affect domains that enable the Geospatial feature flag. Additionally, an ES doc limit has been put in place so that the Celery task only starts indexing if the domain has cases fewer than the specified limit. The Celery task itself has also been set to run as a serial task, and so it is expected that no more than one of these tasks will be active at a time in the environment.
Automated test coverage
Automated tests exist for the new Celery task that has been added. Additionally, the tests for the
CeleryTaskHelper
class have been extended to account for the new changes there.QA Plan
QA has been completed. Ticket is available here.
Rollback instructions
Labels & Review