Add more telemetry to the monthly ol_dump process #6617

Merged

cclauss
Contributor

@cclauss cclauss commented Jun 1, 2022

Add more debug information to the ol-dump process based on the lessons learned in the 2022-05-31 dumps.

  1. yymm=$(date +%Y-%m) --> yyyymm=${yyyymmdd:0:7} # Take the month from yyyymmdd rather than from today's date, so we get 2022-05 instead of 2022-06
  2. Log using a prefix such as 2022-06-01 15:25:06 [openlibrary.dump] in both bash and Python to simplify grepping the logs.
    • ol-home0% docker logs -f openlibrary_cron-jobs_1 2>&1 | grep openlibrary.dump
  3. More consistently log the arguments being passed.
  4. Log whether or not Sentry is properly enabled.
  5. Log the timing of major processing steps to quickly spot which jobs are not functioning correctly.
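
As a rough illustration of item 1, here is the same idea in Python (the PR itself makes this change in bash). It assumes `yyyymmdd` holds a date string in `YYYY-MM-DD` form, as the `0:7` slice implies. The helper name `dump_month` is hypothetical and is not part of the PR.

```python
from datetime import date


def dump_month(yyyymmdd: str) -> str:
    """Return the YYYY-MM prefix of a YYYY-MM-DD date string (hypothetical helper)."""
    # Parse first so that a malformed date raises ValueError instead of
    # silently producing a wrong prefix.
    date.fromisoformat(yyyymmdd)
    # Same slice as the bash expansion ${yyyymmdd:0:7}.
    return yyyymmdd[:7]


# A dump dated 2022-05-31 keeps the 2022-05 prefix even if the job
# runs (or finishes) on 2022-06-01.
print(dump_month("2022-05-31"))
```

Deriving the month from the date string, rather than calling the clock a second time, keeps the two values consistent when a run crosses a month boundary.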

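Items 2 and 5 could be approximated in Python along the following lines. This is a minimal sketch, not the code merged in this PR: the format string is an assumption chosen to reproduce the `2022-06-01 15:25:06 [openlibrary.dump]` prefix, and the `timed` helper and the step name are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

# Assumed layout: timestamp, logger name in brackets, then the message.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(name)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger("openlibrary.dump")


@contextmanager
def timed(step: str):
    """Log the start and the elapsed time of one processing step (hypothetical helper)."""
    start = time.perf_counter()
    logger.info("%s: started", step)
    try:
        yield
    finally:
        # Runs even if the step raises, so failed steps still report a duration.
        elapsed = time.perf_counter() - start
        logger.info("%s: finished in %.2f seconds", step, elapsed)


with timed("example step"):
    time.sleep(0.01)  # stand-in for real work
```

Because every line carries the logger name, the `grep openlibrary.dump` filter shown above would pick up these messages too.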
@cclauss cclauss added Priority: 1 Do this week, receiving emails, time sensitive, . [managed] Module: Data dumps labels Jun 1, 2022
@cclauss cclauss requested review from mekarpeles and cdrini June 1, 2022 13:36
@mekarpeles mekarpeles merged commit f522116 into internetarchive:master Jun 1, 2022
@mekarpeles mekarpeles self-assigned this Jun 1, 2022
@cclauss cclauss deleted the ol_dump_improvements_2022_06_01 branch June 1, 2022 21:44