Rewrite Dockerfile to run imports and healthcheck and run it on ECS #757
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this issue (Feb 16, 2023):
We used to run this in an EC2 instance with cron, but it's probably better managed along with everything else in k8s. See also edgi-govdata-archiving/web-monitoring-processing#757.
The healthcheck is now deployed to Kubernetes in edgi-govdata-archiving/web-monitoring-ops@624716a. Will configure the import job similarly tomorrow.
Mr0grog added a commit that referenced this issue (Feb 16, 2023):
To support running this job on an actual scheduled job runner that doesn't have persistent storage (see #757), we need to be able to store the unplaybackable cache in S3. You can now use `s3://` paths in the `--unplaybackable` option: `wm import ia 'https://somewhere.com/' --unplaybackable 's3://bucket/unplaybackable.json'`
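As a rough illustration of how an option like this can accept both local and `s3://` paths, here is a minimal sketch that classifies a path before reading or writing the cache. The function name `split_storage_path` and its return shape are hypothetical and are not taken from web-monitoring-processing; the actual S3 read/write step (for example with an AWS client library) is left out.

```python
from urllib.parse import urlparse


def split_storage_path(path):
    """Classify a cache path as S3 or local (hypothetical helper).

    Returns ("s3", bucket, key) for "s3://bucket/key" paths and
    ("file", None, path) for everything else.
    """
    parsed = urlparse(path)
    if parsed.scheme == "s3":
        # An S3 location needs both a bucket (netloc) and an object key (path).
        if not parsed.netloc or not parsed.path.strip("/"):
            raise ValueError(f"S3 path needs a bucket and a key: {path!r}")
        return ("s3", parsed.netloc, parsed.path.lstrip("/"))
    # Anything without the s3 scheme is treated as a local file path.
    return ("file", None, path)


print(split_storage_path("s3://bucket/unplaybackable.json"))
print(split_storage_path("./unplaybackable.json"))
```

A caller could branch on the first element of the result to decide whether to open a local file or talk to S3.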
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this issue (Feb 16, 2023):
Instead of running the import job as a cron script on a random EC2 VM, run it as an actual CronJob in Kubernetes with everything else. This also cleans up the docs around jobs.

Work not visible here: created a new IAM account for jobs that can write to relevant S3 buckets, added ability to store cache files in S3 (edgi-govdata-archiving/web-monitoring-processing#849) since we have no persistent storage in Kubernetes.

Why do this now? See:
- edgi-govdata-archiving/web-monitoring#168
- edgi-govdata-archiving/web-monitoring-processing#757
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this issue (Feb 17, 2023):
Instead of running the import job as a cron script on a random EC2 VM, run it as an actual CronJob in Kubernetes with everything else. This also cleans up the docs around jobs.

Why do this now? See:
- edgi-govdata-archiving/web-monitoring#168
- edgi-govdata-archiving/web-monitoring-processing#757

Work not visible here:
- Created a new IAM account for jobs that can write to relevant S3 buckets.
- Added ability to store cache files in S3 (edgi-govdata-archiving/web-monitoring-processing#849) since we have no persistent storage in Kubernetes.
Well, Kubernetes configuration was kind of a mess, but it’s done. This is good to go. See edgi-govdata-archiving/web-monitoring-ops#44.
There’s a Dockerfile in this repo, but it’s currently set up to run the diffing server, which we split off into a separate project a while back. The Dockerfile here is now defunct.
We should update this Dockerfile to run our other scripts: `wm import ia` and `ia_healthcheck`. (The image should be set up so you can provide a command to run either one.) We currently run these as cron scripts on a manually managed server, but it would probably be better if they were packaged as a Docker image that can run on a schedule in ECS. That should mean less manual fiddling with the server (at least in theory).
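One way to let a single image run either script is a small entrypoint that validates the requested command and then hands control to it. The sketch below is only an illustration under assumptions: the names `resolve_command` and `ALLOWED_COMMANDS` are hypothetical, and it assumes both `wm` and `ia_healthcheck` are installed on the image's `PATH`. Docker's own command override may be all that is actually needed.

```python
import os
import sys

# Assumed executable names, based on the two scripts named in this issue.
ALLOWED_COMMANDS = {"wm", "ia_healthcheck"}


def resolve_command(argv):
    """Return the argument list to execute, or raise ValueError."""
    if not argv:
        expected = ", ".join(sorted(ALLOWED_COMMANDS))
        raise ValueError(f"No command given; expected one of: {expected}")
    if argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"Unknown command {argv[0]!r}")
    return list(argv)


def main():
    try:
        command = resolve_command(sys.argv[1:])
    except ValueError as error:
        # Passing a string to sys.exit prints it to stderr and exits with status 1.
        sys.exit(str(error))
    # Replace this process with the requested script so that its exit code
    # and signal handling belong to the script itself.
    os.execvp(command[0], command)


# A real entrypoint script would call main() here; it is left uncalled so
# the sketch can be imported without executing anything.
```

With an entrypoint like this, a scheduler would pass either `wm import ia ...` or `ia_healthcheck` as the container command.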