Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up stopping containers #417

Merged
merged 3 commits into from
Mar 24, 2023
Merged

Speed up stopping containers #417

merged 3 commits into from
Mar 24, 2023

Conversation

lognaturel
Copy link
Member

@lognaturel lognaturel commented Mar 22, 2023

Closes #416

What has been done to verify that this works as intended?

Did a docker compose build docker compose up -d docker compose stop after each commit to make sure each was minimal and effective. It's still on there if someone wants to do a sanity check.

Why is this the best possible solution? Were any other approaches considered?

The underlying issue is that signals weren't being proxied to the intended processes. More at https://stackoverflow.com/a/32261019/137744. This was addressed by running all the various processes with exec.

Additionally, the Enketo dockerfile needed to be updated to use the JSON-array syntax so it wouldn't be called with /bin/sh -c which is immune to SIGTERM

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

This is an ops change. Worst case would be things don't stop or come up. I think if it works on dev it's likely to work anywhere.

I'm not entirely sure about risk beyond things not stopping or coming up. I keep seeing things about the PID 1 zombie reaping problem but I can't tell if this changes anything with respect to that. This post suggests that it doesn't change anything. This tini post has more about zombie processes. Given my understanding of the processes we launch, I believe it's unlikely that any secondary processes likely to become zombies are being launched.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

Before submitting this PR, please make sure you have:

  • verified that any code or assets from external sources are properly credited in comments or that everything is internally sourced

@lognaturel lognaturel requested a review from sadiqkhoja March 22, 2023 17:01
@lognaturel lognaturel changed the title Speed stop Speed up stopping containers Mar 22, 2023
Copy link
Contributor

@sadiqkhoja sadiqkhoja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

postgres14 container is taking 10s for me. I think SIGTERM is not being passed to the postgres process

@lognaturel
Copy link
Member Author

Good catch about the db container. Let me look at that separately because it doesn't affect the Cloud setup and I'd like to make sure I understand that process specifically.

Are you feeling pretty confident that this is low-risk? Like I wrote in the risks section, I am not entirely sure that there aren't possible side effects I haven't considered.

@sadiqkhoja
Copy link
Contributor

I don't see any risk of zombie processes here as we are not spawning processing on the runtime.

@lognaturel
Copy link
Member Author

Have talked it through with @yanokwa and am back to feeling like it's low-risk. Moreover, I don't think we have a great way of identifying possible issues other than putting it out there on different systems.

@lognaturel
Copy link
Member Author

Oh great, thanks @sadiqkhoja.

@lognaturel lognaturel merged commit 4b818de into getodk:next Mar 24, 2023
@lognaturel lognaturel deleted the speed-stop branch March 24, 2023 21:40
@matthew-white matthew-white linked an issue Mar 25, 2023 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

enketo, service and nginx containers are slow to stop
2 participants