Data Dumps not auto-generated for June 2024 #9521

Closed
neilt opened this issue Jul 3, 2024 · 5 comments · Fixed by #9538
Labels: Lead: @mekarpeles, Priority: 1, Type: Bug, Type: Post-Mortem

Comments

@neilt

neilt commented Jul 3, 2024

Problem

As of today, July 3, https://archive.org/details/ol_exports?sort=-publicdate does not show a June 2024 data dump.

In addition, https://openlibrary.org/data/ol_dump_latest.txt.gz still downloads ol_dump_2024-05-31.txt.gz.

The dumps are usually generated by the 1st or 2nd day of the following month.
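A staleness check along these lines could flag a missing dump automatically. This is only a sketch: it assumes dumps are named `ol_dump_YYYY-MM-DD.txt.gz` after the last day of the month they cover (as `ol_dump_2024-05-31.txt.gz` suggests), and the helper names and the two-day grace period are hypothetical, not part of the Open Library codebase.

```python
import re
from datetime import date, timedelta

# Assumed naming scheme, inferred from ol_dump_2024-05-31.txt.gz.
DUMP_NAME = re.compile(r"ol_dump_(\d{4})-(\d{2})-(\d{2})\.txt\.gz$")


def dump_date(filename: str) -> date:
    """Extract the date embedded in a dump filename."""
    match = DUMP_NAME.search(filename)
    if match is None:
        raise ValueError(f"not a dump filename: {filename!r}")
    year, month, day = map(int, match.groups())
    return date(year, month, day)


def expected_latest_dump(today: date, grace_days: int = 2) -> date:
    """Month-end date of the newest dump that should exist by `today`.

    The grace period reflects that dumps usually appear by the
    1st or 2nd day of the following month.
    """
    reference = today - timedelta(days=grace_days)
    # Last day of the month preceding the reference date's month.
    return reference.replace(day=1) - timedelta(days=1)


def is_stale(filename: str, today: date, grace_days: int = 2) -> bool:
    """True if the newest available dump is older than expected."""
    return dump_date(filename) < expected_latest_dump(today, grace_days)
```

With the values from this report, `is_stale("ol_dump_2024-05-31.txt.gz", date(2024, 7, 3))` evaluates to `True`, because by July 3 a dump dated 2024-06-30 would be expected.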

Reproducing the bug

  1. Go to ...
  2. Do ...
  • Expected behavior:
  • Actual behavior:

Context

  • Browser (Chrome, Safari, Firefox, etc): Safari
  • OS (Windows, Mac, etc): macOS
  • Logged in (Y/N): N
  • Environment (prod, dev, local): prod

Notes from this Issue's Lead

Proposal & constraints

Related files

Stakeholders


Instructions for Contributors

  • Please run these commands to ensure your repository is up to date before creating a new branch to work on this issue and each time after pushing code to Github, because the pre-commit bot may add commits to your PRs upstream.
neilt added the Needs: Lead, Needs: Triage, and Type: Bug labels Jul 3, 2024
@tfmorris
Contributor

tfmorris commented Jul 3, 2024

The dump program was recently changed (#9127), so this could be related.

github-actions bot added the Needs: Response label Jul 3, 2024
mekarpeles added the Type: Post-Mortem label Jul 5, 2024
@mekarpeles
Member

While working through our guide to diagnosing cron failures (https://github.com/internetarchive/olsystem/wiki/Crons#diagnosing-cron-failures), we discovered:

DEBUG    : stats.py    :  46 :  Postgres Database : coverstore
Exception ignored in atexit callback: <function AtexitIntegration.setup_once.<locals>._shutdown at 0x7fa304df2660>
Traceback (most recent call last):
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/integrations/atexit.py", line 61, in _shutdown
    client.close(callback=integration.callback)
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/client.py", line 580, in close
    self.flush(timeout=timeout, callback=callback)
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/client.py", line 604, in flush
    self.transport.flush(timeout=timeout, callback=callback)
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/transport.py", line 525, in flush
    self._worker.submit(lambda: self._flush_client_reports(force=True))
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/worker.py", line 117, in submit
    self._ensure_thread()
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/worker.py", line 42, in _ensure_thread
    self.start()
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/worker.py", line 70, in start
    self._thread.start()
  File "/home/openlibrary/.local/lib/python3.12/site-packages/sentry_sdk/integrations/threading.py", line 56, in sentry_start
    return old_start(self, *a, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/threading.py", line 992, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't create new thread at interpreter shutdown
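The final frames show a worker thread being started from an atexit callback, which CPython 3.12 refuses once interpreter shutdown has begun. The sketch below reproduces that failure mode in a child interpreter without Sentry. The names `flush_on_exit` and `run_child` are hypothetical stand-ins, and the sketch says nothing about whether this error is what actually stopped the dump.

```python
import subprocess
import sys
import textwrap

# Script run in a separate interpreter so its shutdown can be observed.
CHILD = textwrap.dedent("""
    import atexit
    import threading

    def flush_on_exit():
        # Stand-in for a library that lazily starts a worker thread
        # the first time it needs to flush, here at interpreter exit.
        worker = threading.Thread(target=lambda: None)
        worker.start()
        worker.join()

    atexit.register(flush_on_exit)
""")


def run_child():
    """Run CHILD in a fresh interpreter; return (exit code, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    code, err = run_child()
    print("exit code:", code)
    # On Python 3.12, stderr is expected to carry the same RuntimeError
    # as the traceback above; older versions may print nothing.
    print(err)
```

Because exceptions raised inside atexit callbacks are reported and then ignored, the child still exits with status 0, which matches the "Exception ignored in atexit callback" line in the log above.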

mekarpeles added the Priority: 1 and Lead: @mekarpeles labels and removed the Needs: Triage, Needs: Lead, and Needs: Response labels Jul 5, 2024
@cdrini
Collaborator

cdrini commented Jul 8, 2024

This was also added, which might be causing the error: #9369

@mekarpeles mekarpeles self-assigned this Jul 8, 2024
@mekarpeles
Member

@mekarpeles to manually run the data dumps script according to
https://github.com/internetarchive/olsystem/wiki/Crons#monthly-data-dumps

@mekarpeles
Member
