Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid allocations in Display for OsStr and Path #42613

Merged
merged 2 commits into from
Jun 17, 2017
Merged

Conversation

stepancheg
Copy link
Contributor

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nice! The main thing seems to be that we should add some targeted unit tests for the new iterator-- I presume, perhaps wrongly, that there are existing tests for the Display code in paths...

fn fmt(&self, formatter: &mut fmt::Formatter) -> Result<(), fmt::Error> {
self.inner.fmt(formatter)
fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
fmt::Debug::fmt(&self.inner, formatter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this just general cleanup, or was this required for some reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made several similar changes for these reasons:

  • it is easier to visually check that correct trait is called (Debug or Display)
  • to make sure code on Windows and Redox (e. g. os_str::Slice) works correctly. I do not have access to Windows machine, so it's easier to write code uniformly (single fmt import and explicit trait names) than wait for build system job completion
  • and after all to unify all implementations in stack os_str, OsStr, Path


/// Iterator over lossy UTF-8 string
#[unstable(feature = "str_internals", issue = "0")]
pub struct Utf8LossyIter<'a> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we write some more unit tests for this? it seems like quite a big chunk of code, and one for which it'd be easy to write some targeted unit tests...


if i > 0 {
unsafe { res.as_mut_vec().extend_from_slice(&v[..i]) };
let mut res = String::with_capacity(v.len());
Copy link
Contributor

@arthurprs arthurprs Jun 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know it'll be larger than v.len right? Maybe reserve some extra, like v.len() * 3 / 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. If yes, this should be done in separate PR. I've just kept this part as it was before my patch.

@stepancheg stepancheg force-pushed the lossy branch 2 times, most recently from 26644f6 to e242390 Compare June 13, 2017 12:28
@stepancheg
Copy link
Contributor Author

I've added tests (copied from String::from_utf8_lossy tests).

@stepancheg
Copy link
Contributor Author

Some tests broken, wait a bit...

@stepancheg stepancheg force-pushed the lossy branch 2 times, most recently from ac9935b to cbf82e5 Compare June 13, 2017 20:33
@stepancheg
Copy link
Contributor Author

stepancheg commented Jun 13, 2017

There's a tidy check error:

tidy error: /Users/yozh/devel/left/rust/src/libstd/ffi/os_str.rs:321: different `since` than before
tidy error: /Users/yozh/devel/left/rust/src/libstd/ffi/os_str.rs:685: different `since` than before

There were no Display for OsStr and OsString which are need to implement Display for Path and which I've added.

I've added

#[stable(feature = "rust1", since = "1.20.0")]

which is apparently incorrect.

What should I do?

@arielb1 arielb1 added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 13, 2017
@nikomatsakis
Copy link
Contributor

Going to pass the buck here. I'm not that familiar with the attributes that need to be set.

r? @alexcrichton

fmt::Debug::fmt(&**self, formatter)
}
}

#[stable(feature = "rust1", since = "1.20.0")]
impl fmt::Display for OsString {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This did not exist before and is a new stable feature being added in this PR, is that intentional? It was the libs team intention that this impl did not exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Display for OsString seems to be perfectly reasonable for me. You may need to "display" an OS string like you "display" the usual string.

Anywhay, if it should exist, how should I mute tidy?

If it shouldn't exist, how should I keep an implementation, but hide it from public API? #[doc(hidden)] and #[unstable(feature = "str_internals", issue = "0")]?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not it's reasonable it is intentionally excluded and should not be added in this PR. If you'd like to add it then that should be done through a separate PR or an RFC.

This shouldn't exist, which means it should be deleted (along with most other Display impls added here)

@@ -1512,23 +1512,30 @@ impl Display for bool {
}
}

#[doc(hidden)]
#[unstable(feature = "str_internals", issue = "0")]
pub fn escape_str(s: &str, f: &mut Formatter) -> Result {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this avoid being exposed? It seems like a reasonable amount that's not too bad to just copy directly into the new iterator.

@bors
Copy link
Contributor

bors commented Jun 15, 2017

☔ The latest upstream changes (presumably #42648) made this pull request unmergeable. Please resolve the merge conflicts.

@stepancheg
Copy link
Contributor Author

Updated the PR.

  • Removed public Display implementations (kept implementations for private types in os_str)
  • Copied implementation of Debug for str to Debug for Utf8Lossy instead of extraction of escape_str function

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Jun 16, 2017

📌 Commit ac96fd7 has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Jun 17, 2017

⌛ Testing commit ac96fd7 with merge 3438c0f...

bors added a commit that referenced this pull request Jun 17, 2017
Avoid allocations in Display for OsStr and Path

#38879
@bors
Copy link
Contributor

bors commented Jun 17, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 3438c0f to master...

@bors bors merged commit ac96fd7 into rust-lang:master Jun 17, 2017
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 14, 2017
Historically many `Display` and `Debug` implementations for `OsStr`-like
abstractions have gone through `String::from_utf8_lossy`, but this was updated
in rust-lang#42613 to use an internal `Utf8Lossy` abstraction instead. This had the
unfortunate side effect of causing a regression (rust-lang#43765) in code which relied on
these `fmt` trait implementations respecting the various formatting flags
specified.

This commit opportunistically adds back interpretation of formatting trait flags
in the "common case" where where `OsStr`-like "thing" is all valid utf-8 and can
delegate to the formatting implementation for `str`. This doesn't entirely solve
the regression as non-utf8 paths will format differently than they did before
still (in that they will not respect formatting flags), but this should solve
the regression for all "real world" use cases of paths and such. The door's also
still open for handling these flags in the future!

Closes rust-lang#43765
bors added a commit that referenced this pull request Aug 23, 2017
std: Respect formatting flags for str-like OsStr

Historically many `Display` and `Debug` implementations for `OsStr`-like
abstractions have gone through `String::from_utf8_lossy`, but this was updated
in #42613 to use an internal `Utf8Lossy` abstraction instead. This had the
unfortunate side effect of causing a regression (#43765) in code which relied on
these `fmt` trait implementations respecting the various formatting flags
specified.

This commit opportunistically adds back interpretation of formatting trait flags
in the "common case" where where `OsStr`-like "thing" is all valid utf-8 and can
delegate to the formatting implementation for `str`. This doesn't entirely solve
the regression as non-utf8 paths will format differently than they did before
still (in that they will not respect formatting flags), but this should solve
the regression for all "real world" use cases of paths and such. The door's also
still open for handling these flags in the future!

Closes #43765
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 23, 2017
Historically many `Display` and `Debug` implementations for `OsStr`-like
abstractions have gone through `String::from_utf8_lossy`, but this was updated
in rust-lang#42613 to use an internal `Utf8Lossy` abstraction instead. This had the
unfortunate side effect of causing a regression (rust-lang#43765) in code which relied on
these `fmt` trait implementations respecting the various formatting flags
specified.

This commit opportunistically adds back interpretation of formatting trait flags
in the "common case" where where `OsStr`-like "thing" is all valid utf-8 and can
delegate to the formatting implementation for `str`. This doesn't entirely solve
the regression as non-utf8 paths will format differently than they did before
still (in that they will not respect formatting flags), but this should solve
the regression for all "real world" use cases of paths and such. The door's also
still open for handling these flags in the future!

Closes rust-lang#43765
withoutboats pushed a commit to withoutboats/rust that referenced this pull request Apr 11, 2018
In [RFC rust-lang#474][rfc474], the issue of how to handle Displaying a
Path was left as an open question. The problem is that a Path
may contain non-UTF-8 data on most platforms. In the
implementation of the RFC, a `display` method was added, which
returns an adapter that implements `Display` by replacing
non-UTF8 data with a unicode replacement character.

Though I can't find a record of the discussion around this
issue, I believe there were two primary reasons not to just
implement this behavior as the `Display` impl of `Path`:

1. The adapter allocated in the non-UTF8 case, and Rust as a
   rule tries to avoid allocations that are not explicit in code.
2. The user may prefer an alternative solution than using the
   unicode replacement character for handling non-UTF8 data.

In my view, the choice to provide an adapter rather than
implement Display has had a high cost in terms of user experience:

* I almost never remember that I need an adapter, forcing me to
  go back and edit my code after compiling it and getting an error.
* It makes my code more noisy to have the display adapter; this
  detail is rarely important.
* It is extremely uncommon to actually do something other than call
  the display adapter when trying to display a path (I have never
  wanted anything else myself).
* For new users, it is Yet Another Compiler Error that they have to
  figure out how to solve, contributing to the sense that Rust nags
  nags and obstructs rather than assists & guides.

Therefore, I think time has shown that this has been a detriment to
user experience, rather than a helpful reminder. That leaves only
the first reason not to implement this: implicit allocations. That
problem was happily resolved in June 2017: rust-lang#42613 provided an
alternative implementation which efficiently avoids allocations.

Given that, I think it is time that we implement `Display` for both
`Path` and `PathBuf` and deprecate the `Path::display` method.

r? @alexcrichton
cc @rust-lang/libs

[rfc474]: https://github.com/rust-lang/rfcs/blob/master/text/0474-path-reform.md)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants