Add solr restarter service to autoheal solr #5989

Merged (4 commits) on Jan 10, 2022

@cdrini cdrini (Collaborator) commented Dec 18, 2021

Closes #5343

This service responds to errors on the production endpoint and restarts solr accordingly. Ideally this is code we shouldn't have to write, but I couldn't find a way to make this work with docker's healthcheck; restarting unhealthy containers seemed to be hard blocked for docker-compose users: https://forums.docker.com/t/unhealthy-container-does-not-restart/105822/3 . Since our healthcheck also involves processing some JS, I created a whole script for it :(
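
For readers who want a feel for what "healthy" could mean here, the following is a minimal sketch and not the script from this PR. It assumes the test URL returns a Solr-style JSON body with a `response.numFound` field, and that a check counts as healthy only when the request succeeds and at least one document is found. The function name `is_healthy` and the response shape are hypothetical.

```python
import json


def is_healthy(status_code: int, body: str) -> bool:
    """Decide whether one probe response looks healthy.

    Assumption (not from the PR): the body is Solr-style JSON with
    a "response" object that carries an integer "numFound".
    """
    # Any non-200 status (e.g. a 503) counts as a failed check.
    if status_code != 200:
        return False
    # An unparseable body counts as a failed check.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    response = payload.get("response")
    if not isinstance(response, dict):
        return False
    # A query that matches nothing is treated as unhealthy.
    num_found = response.get("numFound", 0)
    return isinstance(num_found, int) and num_found > 0
```

Under these assumptions, a gibberish query that matches no documents would be reported as unhealthy even though the HTTP request itself succeeds.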

Technical

Testing

  • Tested a near-identical version on prod solr for the past few weeks, and it's been working well!
  • Pull on ol-solr0 and try running
  • To test locally:
  1. Modify docker-compose.production.yml, and comment out env_file and SEND_SLACK_MESSAGE=true
  2. Run COMPOSE_FILE="docker-compose.yml;docker-compose.production.yml" docker-compose --profile ol-solr0 up -d
  3. Everything should start
  4. Check the logs of solr_restarter; they should say "healthy".
  5. Modify the TEST_URL to be q=asdflkjaslkefj (some gibberish)
  6. Run up again
  7. Observe that it restarts after 3 failed health checks
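
The restart-after-3-failures behavior in the last step can be sketched as below. This is an illustration under stated assumptions, not the code in this PR: it assumes the failures must be consecutive and that the counter resets after a restart. The names `run_checks` and `restart` are hypothetical.

```python
from typing import Callable, Iterable


def run_checks(
    results: Iterable[bool],
    restart: Callable[[], None],
    max_failures: int = 3,
) -> int:
    """Consume health-check results and restart after repeated failures.

    Assumptions (not from the PR): only consecutive failures count,
    and the failure counter resets once a restart has been triggered.
    Returns the number of restarts that were triggered.
    """
    failures = 0
    restarts = 0
    for healthy in results:
        if healthy:
            # A single healthy check clears the failure streak.
            failures = 0
            continue
        failures += 1
        if failures >= max_failures:
            restart()
            restarts += 1
            failures = 0
    return restarts
```

In a real service, `results` would come from polling the test URL on an interval and `restart` would be whatever restarts the solr container; both are left abstract here so the policy can be exercised on its own.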

Screenshot

Stakeholders

This service responds to errors on the production endpoint, and restarts solr accordingly.
@cdrini cdrini force-pushed the 5343/feature/solr-restarter branch from 5bf9edb to 8598dac Compare December 18, 2021 01:40
This is necessary now, since solr_restarter builds its image instead of pulling it down
@mekarpeles mekarpeles self-assigned this Dec 20, 2021
@mekarpeles mekarpeles added the Priority: 1 and Patch Deployed labels Dec 20, 2021
@mekarpeles mekarpeles merged commit 33b9b00 into internetarchive:master Jan 10, 2022
Labels
  • Patch Deployed: This PR has been deployed to production independently, outside of the regular deploy cycle.
  • Priority: 1: Do this week, receiving emails, time sensitive. [managed]
Development

Successfully merging this pull request may close these issues.

Occasional 503s/slowdowns; likely solr 8 related
2 participants