Close file descriptors for redo process #1834
Let's add a comment somewhere to explain why it's important to close the file descriptors:
- It's a security issue if we don't. The WAL redo process is a sandbox that's not supposed to be able to access anything in the parent process.
- The concrete problem with the lock file.
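
As a rough illustration of the two points above, closing inherited descriptors right before exec could look like the following. This is a minimal sketch, not the code in this PR: it uses a brute-force loop over a fixed range instead of the close_fds crate discussed in this thread, and `close_fd_range`, `close_inherited_fds`, and the `limit` parameter are hypothetical names. It assumes a Unix target where libc's `close` is linked by default.

```rust
use std::os::raw::c_int;
use std::os::unix::process::CommandExt;
use std::process::Command;

// Declared by hand so the sketch needs no external crate (assumption:
// the platform's libc is linked, as it is for std on Unix).
unsafe extern "C" {
    fn close(fd: c_int) -> c_int;
}

/// Close every descriptor in `first..limit`.
/// Errors such as EBADF (descriptor not open) are ignored on purpose.
fn close_fd_range(first: c_int, limit: c_int) {
    for fd in first..limit {
        unsafe {
            close(fd);
        }
    }
}

/// Hypothetical helper: make the child close descriptors `3..limit`
/// after fork and before exec, so only stdin/stdout/stderr survive.
fn close_inherited_fds(cmd: &mut Command, limit: c_int) -> &mut Command {
    unsafe {
        cmd.pre_exec(move || {
            // Only close(2) is called here; nothing allocates.
            close_fd_range(3, limit);
            Ok(())
        })
    }
}
```

With something like this in place, a descriptor such as the one for the lock file would not leak into the child even if it was opened without close-on-exec. A real caller would have to choose `limit` from the process's descriptor limit, which this sketch leaves open.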
Thank you!
I believe this solution is suboptimal. If the goal is to fix just #1814, it would suffice to ensure that somebody sets the One could argue that there may be more fds like that. Fortunately, std already uses
Yes #1814 (comment)

Short term: Would you rather deal with unreleased commits from daemonize, or this new close_fds crate? We might need this crate anyway for the pre_exec assertion that you mention, unless there's a simpler way to correctly iterate open fds in a multi-threaded program. Alternatively, we can do this check after exec, in the postgres code.

Long term: Moving away from daemonize also fixes #1840, among other things. It's a bigger project though. We shouldn't block merging this since it's a short-term improvement, but we can continue the daemonize discussion here: #1841
This PR adds a test for #1834 and fixes the error in https://app.circleci.com/pipelines/github/neondatabase/neon/7753/workflows/94d1b796-10a3-4989-b23c-4c1eb4a49cf5/jobs/79586, which happens because `pageserver.pid` is still held by the `initdb` command on restart. Because the test requires `lsof` to be installed in the docker image, this PR also updates the caches and the docker image specified in the CircleCI config file.
Fixes #1814
Tested manually using:
`lsof .zenith/pageserver.pid`
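
The manual check could be scripted roughly as below. This is only a sketch under assumptions, not the test added in this PR: `parse_lsof_pids` and `pid_file_holders` are hypothetical names, and it relies on `lsof -t`, which prints just the PIDs of the processes holding the file, one per line. An empty result means nothing holds the file (`lsof` exits non-zero in that case, which the sketch does not treat as an error).

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Parse the terse output of `lsof -t`: one PID per line.
/// Lines that are not plain numbers are skipped.
fn parse_lsof_pids(output: &str) -> Vec<u32> {
    output
        .lines()
        .filter_map(|line| line.trim().parse().ok())
        .collect()
}

/// Hypothetical helper: return the PIDs that currently have `path` open.
/// Fails only if `lsof` itself cannot be started (e.g. not installed).
fn pid_file_holders(path: &Path) -> io::Result<Vec<u32>> {
    let out = Command::new("lsof").arg("-t").arg(path).output()?;
    Ok(parse_lsof_pids(&String::from_utf8_lossy(&out.stdout)))
}
```

A test could then call `pid_file_holders(Path::new(".zenith/pageserver.pid"))` after a restart and assert that only the expected process appears in the list.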