ol-www0: Add cron job pull-sitemaps-from-ol-home0 #7781

Merged
mekarpeles merged 3 commits into master from pull-sitemaps-from-ol-home0
Apr 18, 2023

Conversation

cclauss (Contributor) commented Apr 13, 2023

Closes #7580

Add a second cron job to the openlibrary_web_nginx Docker container which runs on the host ol-www0. This job runs at 8 pm on the first day of every month and pulls the newly created sitemaps from the host ol-home0.

  • The pull operation is performed by rsync, which must be installed in the openlibrary_web_nginx Docker container.
  • The modification to docker/ol-nginx-start.sh concatenates multiple cron files into one before feeding the result to crontab.
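The concatenation step described above can be sketched as follows. This is only an illustration, not the code in docker/ol-nginx-start.sh: the helper name combine_crontabs is hypothetical, and the warning for missing files is an added assumption.

```shell
# Hypothetical helper (not from the PR): print several cron files as one stream.
combine_crontabs() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # Each file should end with a newline, otherwise its last entry
            # would be glued to the first entry of the next file.
            cat "$f"
        else
            echo "warning: skipping missing crontab file: $f" >&2
        fi
    done
}

# Possible usage inside the container (installs the combined schedule):
# combine_crontabs /etc/cron.d/archive-webserver-logs /etc/cron.d/pull-sitemaps-from-ol-home0 | crontab -
```

Feeding a single stream to `crontab -` matters because each `crontab -` call replaces the whole installed table, so loading the files one at a time would keep only the last one.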

Technical

Testing

On ol-www0...
rsync --version
docker exec -it openlibrary_web_nginx_1 bash # --> docker container

rsync --version
cd /sitemaps
ls
rm -r previous_sitemaps
rm rsync.log
apt-get update && apt-get install rsync # https://internetarchive.slack.com/archives/GM13CHXBP/p1681428713708969
vi /olsystem/etc/cron.d/pull-sitemaps-from-ol-home0

Copy the existing line and change the schedule fields of the copy to X * * * *, where X is an upcoming minute of the current hour.
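One way to pick X without doing clock arithmetic by hand is sketched below. The helper name upcoming_minute and the default two-minute lead are assumptions made for illustration; they are not part of the PR.

```shell
# Hypothetical helper: print the minute that is N minutes ahead (default 2).
# $1: current minute as printed by `date +%M` (00-59), $2: minutes ahead.
upcoming_minute() {
    now=${1#0}        # drop one leading zero so 08 and 09 are not read as octal
    now=${now:-0}     # a bare "0" becomes empty above, so restore it
    ahead=${2:-2}
    echo $(( (now + ahead) % 60 ))
}

X=$(upcoming_minute "$(date +%M)")
echo "$X * * * *"     # schedule fields to paste in front of the copied command
```

The modulo keeps X valid near the top of the hour: at minute 59 the result is 1, and the temporary entry then fires at minute 1 of the next hour.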

CRONTAB_FILES="/etc/cron.d/archive-webserver-logs /etc/cron.d/pull-sitemaps-from-ol-home0"
cat $CRONTAB_FILES | crontab -
crontab -l

Ensure that all $CRONTAB_FILES content is present
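The check above can also be done mechanically rather than by eye. The following is a minimal sketch: the function name crontab_contains_all and the temporary file path in the usage comment are hypothetical.

```shell
# Hypothetical helper: succeed only if every non-empty line of the source
# files appears verbatim in the installed crontab.
# $1: file holding the output of `crontab -l`; remaining args: source files.
crontab_contains_all() {
    installed=$1
    shift
    status=0
    for f in "$@"; do
        while IFS= read -r line || [ -n "$line" ]; do
            [ -z "$line" ] && continue
            # -F: fixed string, -x: whole-line match, -q: quiet
            if ! grep -Fxq -e "$line" "$installed"; then
                echo "missing: $line" >&2
                status=1
            fi
        done < "$f"
    done
    return $status
}

# Possible usage inside the container:
# crontab -l > /tmp/installed.cron
# crontab_contains_all /tmp/installed.cron $CRONTAB_FILES && echo "all entries present"
```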

service cron start

Wait until minute X has passed.

ls

Look for previous_sitemaps and rsync.log and scan the log to ensure no errors.
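The log scan can be sketched with grep. The patterns below are an assumption based on how rsync usually prefixes its failure messages ("rsync error:" and "rsync: "); they are not taken from the PR, and a clean result is no substitute for reading the log. The function name rsync_log_has_errors is hypothetical.

```shell
# Hypothetical helper: exit status 0 if the log contains lines that look like
# rsync failures, non-zero otherwise.
rsync_log_has_errors() {
    grep -Eq 'rsync error|rsync: ' "$1"
}

# Possible usage from /sitemaps inside the container:
# if rsync_log_has_errors rsync.log; then
#     grep -En 'rsync error|rsync: ' rsync.log   # show the suspicious lines
# else
#     echo "no rsync errors found"
# fi
```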
Revert all changes made to /etc/cron.d/pull-sitemaps-from-ol-home0

cat $CRONTAB_FILES | crontab -
crontab -l

Ensure that all $CRONTAB_FILES content is present

service cron start

Screenshot

Stakeholders

olsystem cron jobs in /etc/cron.d

| filename | run on | in Docker container | purpose | last modified |
| --- | --- | --- | --- | --- |
| archive-webserver-logs | ol-covers0 | covers_nginx | archive-webserver-logs | 2 years |
| archive-webserver-logs | ol-www0 | web_nginx | archive-webserver-logs | 2 years |
| mrtg | | | | 12 years |
| openlibrary.allnodes | ol-www1, | bare metal | Copy sitemaps | 2 years |
| openlibrary.ol_home0 | ol-home0 | cron-jobs | Monthly data dumps | 3 months |
| pg-backups | ol-home | bare metal | Backup postgres | 9 years |
| pull-sitemaps-from-ol-home0 | ol-www0 | web_nginx | pull sitemaps monthly | now |

cclauss requested a review from cdrini on April 13, 2023 23:23
cclauss changed the title from "ol-www0: Add pull-sitemaps-from-ol-home0 cron job" to "ol-www0: Add cron job pull-sitemaps-from-ol-home0" on Apr 14, 2023
cclauss added the "Priority: 1", "Affects: Admin/Maintenance", and "Theme: Provisioning" labels on Apr 17, 2023
mekarpeles merged commit 80f4cc7 into master on Apr 18, 2023
mekarpeles deleted the pull-sitemaps-from-ol-home0 branch on April 18, 2023 15:44
Eds-Dbug pushed a commit to Eds-Dbug/openlibrary that referenced this pull request Apr 28, 2023
* ol-www0: Add pull-sitemaps-from-ol-home0 cron job
* Concatinate multiple files into one before feeding them to crontab
* apt-get install rsync

cclauss (Contributor, Author) commented May 3, 2023

Confirmed that ol-www0 cron pulled the 2023-Apr-30 sitemaps from ol-home0. 🚀

Labels
Affects: Admin/Maintenance (Issues relating to support scripts, bots, cron jobs and admin web pages. [managed])
Priority: 1 (Do this week, receiving emails, time sensitive. [managed])
Theme: Provisioning
Development

Successfully merging this pull request may close these issues.

Confirm sitemaps are being copied to www0