Slow ADI pipelines #2977

Open
jsbrittain opened this issue Mar 14, 2023 · 2 comments
@jsbrittain
Contributor

No description provided.

@jsbrittain jsbrittain self-assigned this Mar 14, 2023
@jsbrittain
Contributor Author

Also related to: #2551

@jsbrittain
Contributor Author

jsbrittain commented Mar 21, 2023

There are several approaches to speeding up ingestion, along with some confounding factors:

  • Ingest diffs instead of bulk uploads where possible (see Ingestion deltas for non-UUID sources, #2975).
  • Prune the database to remove cases left behind by partial uploads. Partial uploads have led to a runaway effect: cases are added but never set to list=True or removed (since both require a recent successful upload to be identified), which leads to further failed uploads. A quick check reveals that over 80% of cases in the DB are not currently accepted (many of these are likely to be duplicates from partial uploads).
  • Related to the previous point, extend timeouts on failing ingestions so that they can complete successfully and trigger self-pruning.
  • Some sources (apparently those without unique identifiers) fail on pruning with 'document failed validation'. This appears to have been happening for some time and relates to deleting cases marked list=False from the database.
  • Issue #2551 (Poor performance of ADI) discusses mongoose as problematic in the data service.
  • Monitor the situation on an ongoing basis; this relates to globaldothealth/covid19-ingestion-monitor#4.
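
To illustrate the first point, here is a minimal sketch of how a delta could be computed for a source whose rows carry no unique identifier. It is not the approach taken in #2975; the names `row_key` and `compute_delta` are hypothetical, and it assumes each row is a flat, JSON-serialisable dict. Rows are compared by content hash, and counts are tracked so that legitimately repeated rows are not collapsed into one.

```python
import hashlib
import json
from collections import Counter
from typing import Dict, List, Tuple


def row_key(row: Dict) -> str:
    """Content hash used as a stand-in identifier for rows that lack a UUID (hypothetical helper)."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compute_delta(previous: List[Dict], current: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (to_add, to_remove) such that applying both to `previous` yields `current` as a multiset."""
    prev_counts = Counter(row_key(r) for r in previous)
    curr_counts = Counter(row_key(r) for r in current)
    by_key = {row_key(r): r for r in previous + current}

    # Counter subtraction keeps only positive counts, so each side lists
    # exactly the surplus rows relative to the other snapshot.
    to_add = [by_key[k] for k, n in (curr_counts - prev_counts).items() for _ in range(n)]
    to_remove = [by_key[k] for k, n in (prev_counts - curr_counts).items() for _ in range(n)]
    return to_add, to_remove


# Example: one duplicate row disappears and one new row arrives.
previous = [
    {"date": "2023-03-01", "country": "GB"},
    {"date": "2023-03-01", "country": "GB"},
    {"date": "2023-03-02", "country": "FR"},
]
current = [
    {"date": "2023-03-01", "country": "GB"},
    {"date": "2023-03-02", "country": "FR"},
    {"date": "2023-03-03", "country": "DE"},
]
to_add, to_remove = compute_delta(previous, current)
```

Only the rows in `to_add` and `to_remove` would then need to be sent to the data service, rather than the full source file. Whether content hashing is stable enough for a given source (for example, if the source reformats fields between releases) would need checking per source.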
