feat: Update growth stage and growth stage start date in FarmRegistry #407
base: main
☂️ Python Coverage
Overall Coverage
New Files: No new covered files...
Modified Files: No covered modified files...
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #407 +/- ##
==========================================
- Coverage 94.04% 94.04% -0.01%
==========================================
Files 259 259
Lines 15841 15910 +69
==========================================
+ Hits 14898 14962 +64
- Misses 943 948 +5
django_project/dcas/pipeline.py
    def update_farm_registry_growth_stage(self):
        """Efficiently update growth stage in FarmRegistry."""
        # Load Data
        grid_crop_df = self.data_query.read_grid_data_crop_meta_parquet(
@osundwajeff this should query the farm parquet files, as you did for error handling. It needs to accept the year, month, and day from DCASRequest.requested_at.
We also need to process the data in chunks to avoid loading all of it into memory.
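A minimal sketch of the two requests above, using only the standard library: deriving the year, month, and day from `DCASRequest.requested_at`, and iterating over rows in fixed-size chunks. The helper names `request_date_parts` and `iter_chunks`, the `growth_stage_id` column, and the sample rows are hypothetical; the real query would read the farm parquet files rather than an in-memory generator.

```python
from datetime import datetime
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def request_date_parts(requested_at: datetime) -> Tuple[int, int, int]:
    """Return (year, month, day) for selecting the farm parquet data."""
    return requested_at.year, requested_at.month, requested_at.day


def iter_chunks(rows: Iterable[dict], chunk_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most chunk_size rows, consuming rows lazily."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical rows standing in for the result of the farm parquet query.
sample_rows = ({"registry_id": i, "growth_stage_id": i % 3} for i in range(5))

year, month, day = request_date_parts(datetime(2025, 1, 15))
chunk_sizes = [len(chunk) for chunk in iter_chunks(sample_rows, chunk_size=2)]
```

Because `iter_chunks` pulls from an iterator, only one chunk is held in memory at a time, provided the underlying reader is itself lazy.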
django_project/dcas/pipeline.py
        # Find Existing FarmRegistry Records
        farm_ids = list(farm_mapping.values())  # Get farm IDs
        existing_farm_registry = {
            fr.farm_id: fr for fr in FarmRegistry.objects.filter(
I think you can use the 'registry_id' column from the parquet to filter by 'id__in'.
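One way this suggestion could look, sketched without Django so it runs standalone. `FakeRegistry`, `apply_growth_stage`, and the field names `growth_stage_id` and `growth_stage_start_date` are assumptions for illustration; the `fetch_by_ids` callable stands in for a lookup such as `FarmRegistry.objects.filter(id__in=ids)`.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List, Optional


@dataclass
class FakeRegistry:
    """Hypothetical stand-in for a FarmRegistry row."""
    id: int
    growth_stage_id: Optional[int] = None
    growth_stage_start_date: Optional[date] = None


def apply_growth_stage(
    chunk: List[dict],
    fetch_by_ids: Callable[[List[int]], Iterable[FakeRegistry]],
) -> List[FakeRegistry]:
    """Set growth stage fields on the records named in one chunk of rows."""
    # Key the chunk by registry_id so each record finds its row directly.
    updates = {row["registry_id"]: row for row in chunk}
    changed = []
    # One lookup per chunk by primary key, instead of mapping farm IDs first.
    for record in fetch_by_ids(list(updates)):
        row = updates[record.id]
        record.growth_stage_id = row["growth_stage_id"]
        record.growth_stage_start_date = row["growth_stage_start_date"]
        changed.append(record)
    return changed


# In-memory store standing in for the FarmRegistry table.
store = {i: FakeRegistry(i) for i in (1, 2, 3)}
chunk = [
    {"registry_id": 1, "growth_stage_id": 4,
     "growth_stage_start_date": date(2025, 1, 10)},
    {"registry_id": 3, "growth_stage_id": 5,
     "growth_stage_start_date": date(2025, 1, 12)},
]
changed = apply_growth_stage(
    chunk, lambda ids: [store[i] for i in ids if i in store]
)
```

In the Django version, the returned records could then presumably be saved in one statement per chunk with `bulk_update`, listing only the two changed fields.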
django_project/dcas/pipeline.py
@@ -470,6 +541,7 @@ def run(self):
        self.process_grid_crop_data()

        self.process_farm_registry_data()
        self.update_farm_registry_growth_stage()
Please call this method from a Celery task that takes the request_id of the DCASRequest as input.
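A rough sketch of the shape such a task might take. The point is that the task receives only the `request_id` and reloads the request itself, so nothing but a plain integer has to be serialized. `FakeRequest`, `FakePipeline`, `REQUESTS`, and `update_growth_stage_task` are hypothetical stand-ins; the Celery task decorator and the real `DCASRequest` lookup are left out so the example runs on its own.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a DCASRequest row."""
    id: int
    requested_at: datetime


class FakePipeline:
    """Hypothetical stand-in for the DCAS pipeline class."""

    def __init__(self, request: FakeRequest):
        self.request = request
        self.updated = False

    def update_farm_registry_growth_stage(self) -> None:
        # The real method would read the farm parquet data and update rows.
        self.updated = True


# In-memory lookup standing in for a database query by primary key.
REQUESTS = {7: FakeRequest(7, datetime(2025, 1, 15))}


def update_growth_stage_task(request_id: int) -> dict:
    """Task body; in the project it would be registered as a Celery task."""
    request = REQUESTS[request_id]  # reload the request inside the task
    pipeline = FakePipeline(request)
    pipeline.update_farm_registry_growth_stage()
    return {"request_id": request.id, "updated": pipeline.updated}


result = update_growth_stage_task(7)
```

With this split, `run()` would no longer call `update_farm_registry_growth_stage()` inline; the caller would queue the task with the request ID once the pipeline output is ready.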
Fix #370