Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Update docker image to permit running as non-root and other tweaks for complement build #11431

Conversation

michaelkaye
Copy link
Contributor

I'm attempting to get the complement-perf image running on an openshift cluster, one part of which is that the containers should not run as root by default. This is kinda rolling all the way back to the source images in synapse.

The main change removes the requirement to run as root but when testing locally and under complement the containers continue to run fine.

To attempt to run the container as non-root, try adding -u 1000:0 to a docker run line; pick a random 1000 and it should continue to function.

I think the first commits (except for 67e20f2 ) to be mainly uncontroversial, but that last commit to maybe raise some eyebrows.

I've tested the containers by running the complement.sh script and that works fine; if there's other places I should test i'm not breaking things then pointers welcome, this is probably a subtle change permissions change that might break downstream users.

This also tidies up gitignore and updates an environment variable to remove a warning when running the scripts-dev/complement.sh.

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

These are a download of the complement server when COMPLEMENT_DIR env is not set
and are test-time only.
This was changed in matrix-org/complement#232 and the result should be
identical (50ms * 500 == 25s)
…GID=0

This permits the container to be started in systems like OpenShift which do not
permit UID=0 by default.

VOLUME in Dockerfile-worker is removed as it inherits from the one in Dockerfile
Avoids the containers being marked unhealthy as the HEALTHCHECK from
matrixdotorg/synapse is inherited by this container.
Without root, supervisord cannot assume other users, so remove any
changing of users.
@michaelkaye michaelkaye marked this pull request as ready for review November 26, 2021 10:11
@michaelkaye michaelkaye requested a review from a team as a code owner November 26, 2021 10:11
@michaelkaye
Copy link
Contributor Author

I don't believe the failed sytest is related to my changes, given other similar tests were successful

@babolivier
Copy link
Contributor

I don't believe the failed sytest is related to my changes, given other similar tests were successful

I've restarted the run, since I agree it doesn't look related, so it looks like a flaky test to me.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems generally sensible to me

Comment on lines +90 to +91
RUN chgrp -R 0 /conf /data && \
chmod -R g+rw /conf /data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly surprised this is needed. What is the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chmod -R g+w is the only absolutely required change; normally these are u+rw ; g+r

I duplicated the change from synapse-workers where there are cases of things being different; it seemed to me to be a good way to clearly state what was intended, rather than rely on the default not changing. It doesn't make much change to the build speed.

@@ -18,7 +18,7 @@ server {
{{ worker_locations }}

# Send all other traffic to the main process
location ~* ^(\\/_matrix|\\/_synapse) {
location ~* ^(\\/_matrix|\\/_synapse|\\/health) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we instead arrange for the healthcheck to go to the workers? It seems a bit arbitrary just to send it to the main process.

(that might also be useful for complement to know when all the workers are ready, cf #10065 (comment))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a better attempt at this change in #11429 but left this in as atm this does respond on the synapse image and i thought it should do something on the synapse-workers image.

Happy to revert it as it isn't a complete health check

@michaelkaye
Copy link
Contributor Author

Having worked a bit more on this I'm not 100% sure if we actually need all these changes; complement-style images aren't actually that good directly in a kubernetes style environment; the database lives in the container it will be removed whenever a failover happens - so I think i need to cherry pick the useful bits out.

@michaelkaye
Copy link
Contributor Author

Closing and moving useful pieces to #11718

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants