Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[gfd] Fixing logic of atomically file writing #792

Merged
merged 1 commit into from
Jul 16, 2024

Conversation

belo4ya
Copy link
Contributor

@belo4ya belo4ya commented Jun 29, 2024

fix #791

The current implementation of the writeFileAtomically function contains a bug - the os.Chmod function is called after os.Rename, causing a moment when the target file (features.d/gfd) has incorrect permissions (0600 instead of 0644):

E0629 17:29:25.340196       1 local.go:266] source local failed reading file 'gfd': open /etc/kubernetes/node-feature-discovery/features.d/gfd: permission denied

To fix this issue, the order of the calls needs to be changed to: os.Chmod -> os.Rename

@belo4ya
Copy link
Contributor Author

belo4ya commented Jun 29, 2024

What do you think about using github.com/google/renameio instead of our own implementation of atomic file writing?

import "github.com/google/renameio"

func writeFileAtomically(path string, contents []byte, perm os.FileMode) error {
	return renameio.WriteFile(path, contents, perm)
}

github.com/google/renameio is stable, has no dependencies, is quite popular (used by 4.3k), and performs its task well.

@elezar
Copy link
Member

elezar commented Jul 3, 2024

@belo4ya thanks for the fix. I think we would be ok with using renameio. Feel free to update this PR to switch to that instead. Alternatively we can do so as a follow-up.

@belo4ya belo4ya force-pushed the fix/gfd-atomically-file-writing branch from 5857ab0 to e4fdcba Compare July 3, 2024 15:51
@belo4ya
Copy link
Contributor Author

belo4ya commented Jul 3, 2024

Done. I replaced the function writeFileAtomically with renameio.WriteFile.

The only thing is, the temporary file will be created in $TMPDIR instead of /etc/kubernetes/node-feature-discovery/features.d/gfd-tmp/ (I don't think this is a problem, but it's worth mentioning).

elezar
elezar previously approved these changes Jul 9, 2024
Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this fix @belo4ya.

@elezar elezar dismissed their stale review July 9, 2024 09:24

Re-review.

Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With regards to:

The only thing is, the temporary file will be created in $TMPDIR instead of /etc/kubernetes/node-feature-discovery/features.d/gfd-tmp/ (I don't think this is a problem, but it's worth mentioning).

Let's squash these two commits into a single commit and include this note in the commit message along with additional context on the fixed bug (something like your description would be great).

@belo4ya belo4ya force-pushed the fix/gfd-atomically-file-writing branch 3 times, most recently from 275a44d to a608f7f Compare July 9, 2024 23:27
@belo4ya
Copy link
Contributor Author

belo4ya commented Jul 10, 2024

@elezar, done.

fix NVIDIA#791 Corrected the incorrect order of operations in atomic writing of the feature file, which could lead to a "permission denied" error observed in the nfd-worker logs

note: now the temporary file is created in $TMPDIR instead of /etc/kubernetes/node-feature-discovery/features.d/gfd-tmp/

Signed-off-by: belo4ya <41exey.kov41ev@gmail.com>
@elezar elezar force-pushed the fix/gfd-atomically-file-writing branch from a608f7f to 442edd4 Compare July 16, 2024 12:00
@elezar elezar merged commit 1f3d112 into NVIDIA:main Jul 16, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[gfd] Incorrect implementation of atomic writing to a file when exporting features by gpu-feature-discovery
2 participants