Merge pull request #1139 from dandi/optimize-checksum-worker
Optimize loop in checksum task
mvandenburgh authored Jun 29, 2022
2 parents e830898 + a3cfaf4 commit d9aa126
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions dandiapi/api/tasks/__init__.py
@@ -42,9 +42,13 @@ def calculate_sha256(blob_id: int) -> None:
     asset_blob.save()

     # The newly calculated sha256 digest will be included in the metadata, so we need to revalidate
-    for asset in asset_blob.assets.all():
-        # validate_asset_metadata runs very quickly, no need to delay it
-        validate_asset_metadata(asset.id)
+    # Note, we use `.iterator` here and delay each validation as a new task in order to keep memory
+    # usage down.
+    for asset in asset_blob.assets.values('id').iterator():
+        # Note: while asset metadata is fairly lightweight compute-wise, memory-wise it can become
+        # an issue during serialization/deserialization of the JSON blob by pydantic. Therefore,
+        # we delay each validation to its own task.
+        validate_asset_metadata.delay(asset['id'])


 @shared_task(queue='write_manifest_files')
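
The pattern in this change (stream only the ids, then hand each validation off as its own task) can be sketched without Django or Celery. The snippet below is a minimal illustration, not the project's code: `StubTask` and `iter_asset_ids` are hypothetical stand-ins for a Celery task's `.delay()` and for `asset_blob.assets.values('id').iterator()`, and the "validation" is just a list append.

```python
from collections import deque
from typing import Callable, Iterator


class StubTask:
    """Hypothetical stand-in for a Celery task: `.delay()` only enqueues the call."""

    def __init__(self, func: Callable[[int], None]) -> None:
        self.func = func
        self.queue = deque()

    def delay(self, asset_id: int) -> None:
        # Only the small integer id is queued, never the asset or its metadata.
        self.queue.append(asset_id)

    def run_worker(self) -> None:
        # Simulates a worker handling one queued message at a time.
        while self.queue:
            self.func(self.queue.popleft())


def iter_asset_ids(count: int) -> Iterator[dict]:
    """Hypothetical stand-in for `asset_blob.assets.values('id').iterator()`."""
    for asset_id in range(1, count + 1):
        # One small dict per row, produced lazily instead of loading every asset.
        yield {'id': asset_id}


validated = []
validate_asset_metadata = StubTask(validated.append)

# The loop itself does no validation work; it only queues one task per id.
for asset in iter_asset_ids(3):
    validate_asset_metadata.delay(asset['id'])

validated_before = list(validated)             # nothing validated yet
pending = list(validate_asset_metadata.queue)  # ids waiting for a worker

validate_asset_metadata.run_worker()
```

In this sketch the calling loop never holds more than one id at a time, and each validation runs as a separate unit of work, which is the memory argument the new code comments make for the real task.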
