Correct usage note on OpenOptions::append() #120781
I think this is now overcompensating, going too far in the other direction. It is still necessary to coalesce buffers into a single For example if multiple threads are logging to a single file then tearing may be tolerable in edge-cases but it's preferable that it doesn't happen. So it'll still be important for the logger to perform a single |
The problem is that there is absolutely no documented guarantee for this, nor for how large a write is "short". Maybe |
At least on Linux, the length of the write primarily matters for pipes. Most disk-backed filesystems will (afaik) even take a gigabyte-sized write buffer and try to make that append atomic. That might be good enough for some uses (logging) but not for others (database journal).
I don't think that'd help anyone. Instead we should describe the properties: that it's only atomic for a single write, and that whether writes get cut short will be platform- and file-dependent. We should also say that for some uses coalescing multiple writes into one can be "good enough", and that for more stringent requirements locking or careful study of platform-specific documentation would be necessary.
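As a rough illustration of the "good enough for logging" approach described above, here is a minimal sketch that coalesces the pieces of a record into one buffer, hands it to a single `write()` call on an append-mode file, and inspects the return value. The names `append_record` and `Appended` are hypothetical and exist only for this example; nothing here makes the append atomic, it only makes a short write visible to the caller.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Outcome of one append attempt (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Appended {
    /// The whole record was accepted by a single `write()` call.
    Whole,
    /// Only the first `n` bytes were accepted, so the record may be torn.
    Partial(usize),
}

/// Coalesce `parts` into one buffer and issue exactly one `write()` call
/// on a file opened in append mode.
pub fn append_record(path: &Path, parts: &[&str]) -> io::Result<Appended> {
    let mut file = OpenOptions::new().append(true).create(true).open(path)?;

    // Build the complete record first, so the OS sees it in one call.
    let record = parts.concat();

    // `write()` may accept fewer bytes than requested; report that
    // instead of silently looping the way `write_all()` would.
    let n = file.write(record.as_bytes())?;
    if n == record.len() {
        Ok(Appended::Whole)
    } else {
        Ok(Appended::Partial(n))
    }
}
```

A caller that cannot tolerate `Appended::Partial` needs locking or platform-specific guarantees; retrying the remainder would simply reintroduce the interleaving window.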
Instead of the "less useful than it appears" wording, how about something like this? "However, this does not necessarily guarantee that data appended by different processes or threads does not interleave. The amount of data accepted a single |
@m-ou-se Thanks for the suggestion. I've now toned down the paragraph by:
I can edit it further as needed. @the8472 please let me know if this looks better. I'm not sure if you insist that we explicitly state that interleaving is "good enough" for some cases - I'd expect the user to understand that anyway. The current wording tries to explain the properties impartially, hopefully fulfilling the spirit of your suggestion.
library/std/src/fs.rs
Outdated
/// successful `write()` is allowed to write only part of the given data, so even if
/// you're careful to provide the whole message in a single call to `write()`, there
/// is no guarantee that it will written out in full.
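To make the partial-write behaviour in the doc comment above concrete, the following sketch uses a made-up sink, `ShortWriter`, that accepts at most `limit` bytes per `write()` call and records the size of each call. The helper `calls_made_by_flush` is likewise hypothetical. It assumes `limit` is greater than zero and shows that a message buffered whole in a `BufWriter` can still reach the underlying writer as several separate writes when it is flushed.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that accepts at most `limit` bytes per `write()`
/// call, mimicking a descriptor that only takes short writes.
pub struct ShortWriter {
    limit: usize,
    /// Number of bytes accepted by each individual `write()` call.
    pub calls: Vec<usize>,
}

impl ShortWriter {
    pub fn new(limit: usize) -> Self {
        ShortWriter { limit, calls: Vec::new() }
    }
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Accept only a prefix of the data and say so in the return value.
        let n = buf.len().min(self.limit);
        self.calls.push(n);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Buffer `message` entirely, flush once, and return the sizes of the
/// `write()` calls the underlying sink actually received.
pub fn calls_made_by_flush(message: &[u8], limit: usize) -> io::Result<Vec<usize>> {
    // Capacity is larger than the message, so `write_all` only fills the buffer.
    let mut writer = BufWriter::with_capacity(message.len() + 1, ShortWriter::new(limit));
    writer.write_all(message)?;
    // The flush keeps calling `write()` until the buffer is drained.
    writer.flush()?;
    Ok(writer.get_ref().calls.clone())
}
```

With a ten-byte message and a limit of four, the sink sees three separate writes, and another writer appending to the same file could land between any two of them.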
I think we should still steer users towards coalescing their writes (as the previous version did), even if it's not sufficient for some use-cases.
I've now restored a mention of coalescing of writes.
☔ The latest upstream changes (presumably #123945) made this pull request unmergeable. Please resolve the merge conflicts.
Apologies for the delay. This looks good now. (Other than that one extra `.)
Can you rebase it and squash your commits?
@m-ou-se I'll do so shortly, thanks for the review.
Avoid implying that concatenating data before passing it to `write()` (with or without `BufWriter`) ensures atomicity.
I've now rebased and squashed the commits, and fixed a typo ("will written out" -> "will be written out").
@bors r+ rollup
…llaumeGomez
Rollup of 14 pull requests

Successful merges:

 - rust-lang#120781 (Correct usage note on OpenOptions::append())
 - rust-lang#121694 (sess: stabilize `-Zrelro-level` as `-Crelro-level`)
 - rust-lang#122521 (doc(bootstrap): add top-level doc-comment to utils/tarball.rs)
 - rust-lang#123491 (Fix ICE in `eval_body_using_ecx`)
 - rust-lang#123574 (rustdoc: rename `issue-\d+.rs` tests to have meaningful names (part 6))
 - rust-lang#123687 (Update ar_archive_writer to 0.2.0)
 - rust-lang#123721 (Various visionOS fixes)
 - rust-lang#123797 (Better graphviz output for SCCs and NLL constraints)
 - rust-lang#123990 (Make `suggest_deref_closure_return` more idiomatic/easier to understand)
 - rust-lang#123995 (Make `thir_tree` and `thir_flat` into hooks)
 - rust-lang#123998 (Opaque types have no namespace)
 - rust-lang#124001 (Fix docs for unstable_features lint.)
 - rust-lang#124006 (Move size assertions for `mir::syntax` types into the same file)
 - rust-lang#124011 (rustdoc: update the module-level docs of `rustdoc::clean`)

r? `@ghost`
`@rustbot` modify labels: rollup
Rollup merge of rust-lang#120781 - hniksic:master, r=m-ou-se

Correct usage note on OpenOptions::append()

This PR aims to correct the following usage note in `OpenOptions::append()`, which currently contains misleading information:

> One maybe obvious note when using append-mode: make sure that all data that belongs together is written to the file in one operation. This can be done by concatenating strings before passing them to [write()](https://doc.rust-lang.org/std/io/trait.Write.html#tymethod.write), or using a buffered writer (with a buffer of adequate size), and calling [flush()](https://doc.rust-lang.org/std/io/trait.Write.html#tymethod.flush) when the message is complete.

The above is misleading because, despite appearances, neither concatenating data before passing it to `write()`, nor delaying writes using `BufWriter`, ensures atomicity. `File::write()`, as well as the underlying `write(2)` system call, makes no guarantees that the data passed to it will be written out in full. It is allowed to write out only a part of the data, and has a return value that tells you how much it has written, at which point it has already returned and modified the file with partial data. Given this limitation, the only way to ensure atomicity of appends is through external locking.

Attempting to ensure atomicity by issuing data in a single `write()` is a footgun often stumbled upon by beginners, which shouldn't be advertised in the docs. The worst thing about the footgun is that it *appears* to work at first, only failing when the string becomes sufficiently large, or when some internal properties of the output file descriptor change (e.g. it is switched from regular file to a special file that talks to a socket or TTY), making it accept smaller writes. Additionally, the suggestion to use `BufWriter` skims over the issue of buffer sizes, as well as the fact that `BufWriter::flush()` contains a *loop* that can happily issue multiple writes. This loop is completely opaque to the caller, so you can't even assert atomicity after-the-fact.

The PR makes the following changes:

* removes the paragraph that suggests concatenating strings to pass them to `write()` for atomicity or using `BufWriter`
* adds a paragraph explaining why attempting to use `write()` to append atomically is not a good idea.
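The description above names external locking as the way to keep appended records from interleaving. A minimal sketch of that idea for threads inside one process follows; `SharedLog` and `log_from_threads` are hypothetical names for this example. It relies on a `Mutex` around the file handle, which only coordinates writers in the same process. Separate processes would need an OS-level file lock or some other shared mechanism, which is not shown here.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical append-only log whose writers are serialized by a mutex.
pub struct SharedLog {
    file: Mutex<File>,
}

impl SharedLog {
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().append(true).create(true).open(path)?;
        Ok(SharedLog { file: Mutex::new(file) })
    }

    /// Append one record while holding the lock. `write_all` may issue
    /// several `write()` calls, but no other thread using this `SharedLog`
    /// can write in between them.
    pub fn append(&self, record: &[u8]) -> io::Result<()> {
        let mut file = self.file.lock().expect("log mutex poisoned");
        file.write_all(record)
    }
}

/// Spawn `threads` writers that each append `per_thread` records.
pub fn log_from_threads(path: &Path, threads: usize, per_thread: usize) -> io::Result<()> {
    let log = Arc::new(SharedLog::open(path)?);
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let log = Arc::clone(&log);
            thread::spawn(move || -> io::Result<()> {
                for i in 0..per_thread {
                    log.append(format!("thread {} record {}\n", t, i).as_bytes())?;
                }
                Ok(())
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("writer thread panicked")?;
    }
    Ok(())
}
```

The lock is held for the whole `write_all`, so a short write inside it is retried before any other thread gets a turn. The trade-off is that every writer now waits on the slowest one.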