Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend format_args implicit arguments to allow field access #3626

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

joshtriplett
Copy link
Member

@joshtriplett joshtriplett commented May 6, 2024

This RFC extends the "implicit named arguments" mechanism to allow accessing
field names with var.field syntax: format!("{self.x} {var.another_field}").

Rendered

@joshtriplett joshtriplett added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 6, 2024
@ChrisDenton
Copy link
Member

Why wasn't this in the original RFC? Were there concerns or was it just a case of being cautiously minimalistic?

@kennytm
Copy link
Member

kennytm commented May 6, 2024

@ChrisDenton Yes. See https://rust-lang.github.io/rfcs/2795-format-args-implicit-identifiers.html#alternative-solution---interpolation and also the Future Possibilities section of 2795.

@joshtriplett joshtriplett changed the title RFC to extend format_args implicit arguments to allow field access Extend format_args implicit arguments to allow field access May 6, 2024
@fbstj
Copy link
Contributor

fbstj commented May 6, 2024

does there need to be a discussion about the reaction to .await (and any other future postfix keywords / macros / etc)?

@ChrisDenton
Copy link
Member

@ChrisDenton Yes. See https://rust-lang.github.io/rfcs/2795-format-args-implicit-identifiers.html#alternative-solution---interpolation and also the Future Possibilities section of 2795.

Thanks! Maybe I'm missing something but there doesn't appear to be any discussion on that in this RFC? I only see a terse reference to it in the disadvantages section?

Also there's no mention of the exploration of alternative syntax mentioned in rfc2795's Future Possibilities. Even if that's not being considered in this RFC, it would be nice to at least acknowledge that.

@programmerjake
Copy link
Member

I think allowing .await in format arguments is a bad idea because .await commonly has highly non-trivial side effects that are obscured by being in a format string, Deref isn't really a problem because it's already bad style for Deref to have significant side effects. Also, allowing .await just because it kinda looks like field access is imho an overgeneralization, it is nothing like field access.

@Jules-Bertholet
Copy link
Contributor

Jules-Bertholet commented May 6, 2024

One property that current format strings have is that nothing in the format string can typically have side effects (other than a strange Debug/Display impl). Allowing .await would break this property. Maybe that's fine, but it should at least be noted. (Technically a Deref impl could also have a side effect, but as with Debug/Display that's atypical.)

@clarfonthey
Copy link
Contributor

clarfonthey commented May 6, 2024

Also of the opinion that await should be disallowed, but that Deref is probably fine, although…

With Deref, there is the chance of interior mutability being exploited, and this can definitely result in some unexpected errors. For example, calling Deref on a Ref from a RefCell could induce a panic, and similarly there could be other cases where Deref does some meaningful calculation, even if they don't have to. It would be unprecedented, but perhaps we could require that the types implementing Deref have no interior mutability (Freeze) and require those specifically to be effectively pure. Of course, this would mean that Freeze is an API contract, which I know folks have been trying to avoid.

Or, if we decide against that, I would strongly recommend just making the evaluation order explicitly linear, with Deref being called each time as if you had a series of let statements. This feels the most reasonable and intuitive.

@programmerjake
Copy link
Member

With Deref, there is the chance of interior mutability being exploited, and this can definitely result in some unexpected errors.

this is exactly the same as accessing interior mutability in Display...you can run whatever code you like but people expect the implementations to have no effect other than writing the output or, for Deref, returning the reference

For example, calling Deref on a Ref from a RefCell could induce a panic

it can't panic actually, it is only a call to NonNull::as_ref which is just a pointer dereference.

, and similarly there could be other cases where Deref does some meaningful calculation, even if they don't have to. It would be unprecedented, but perhaps we could require that the types implementing Deref have no interior mutability (Freeze) and require those specifically to be effectively pure

there are ways around that (Box<Cell<T>>) and I don't think we should restrict which types you can use. a common type that has interior mutability and Deref is lazily-initialized statics (lazy_static/LazyCell/LazySync), which I expect to work in format_args if field accesses are permitted at all.

@kennytm
Copy link
Member

kennytm commented May 6, 2024

I just found out that format_args!("{var}") automatically turns all keywords into raw identifiers which leads to rust-lang/rust#115466. So should the 3rd or 4th or 5th assert_eq below work?

#[derive(Debug)]
struct Await {
    r#await: u32,
}

fn main() {
    let r#await = Await { r#await: 6 };
    assert_eq!(format!("{await:?}"), "Await { await: 6 }".to_string()); // <-- currently compiles!
//  assert_eq!(format!("{r#await:?}"), "Await { await: 6 }".to_string()); // <-- error, see #115466
    assert_eq!(format!("{await.await}"), "6".to_string()); // ?
    assert_eq!(format!("{await.r#await}"), "6".to_string()); // ??
    assert_eq!(format!("{r#await.r#await}"), "6".to_string()); // ???
}

@joshtriplett
Copy link
Member Author

I think allowing .await in format arguments is a bad idea because .await commonly has highly non-trivial side effects that are obscured by being in a format string, Deref isn't really a problem because it's already bad style for Deref to have significant side effects. Also, allowing .await just because it kinda looks like field access is imho an overgeneralization, it is nothing like field access.

The rationale for limiting expression types in format strings isn't to limit side effects, it's purely a matter of syntax, and what syntaxes create confusion within a format string. On that basis, allowing anything that uses . seems fine.

If we ever added arbitrary expressions in the future, we'd want some kind of delimiter (e.g. a required set of braces or parentheses). I don't know that we want to do that, but I think that'd be a minimum requirement if we did. The rationale for this RFC is that we can add expressions that use . without needing extra delimiters like those. That's based purely on syntax, not semantics.

@clarfonthey
Copy link
Contributor

The rationale for limiting expression types in format strings isn't to limit side effects, it's purely a matter of syntax, and what syntaxes create confusion within a format string. On that basis, allowing anything that uses . seems fine.

If we ever added arbitrary expressions in the future, we'd want some kind of delimiter (e.g. a required set of braces or parentheses). I don't know that we want to do that, but I think that'd be a minimum requirement if we did. The rationale for this RFC is that we can add expressions that use . without needing extra delimiters like those. That's based purely on syntax, not semantics.

The problem is that this encourages restricting features added to the language to a particular kind of syntax, or adding custom syntax to support certain operations. If await is accessible just as a field, why can't we add in property functions without parentheses like Python or Kotlin? Why are tuples allowed to treat number indices as fields but not arrays? etc.

If the dot access is simple enough, why aren't method calls with no arguments simple enough? etc.

It just feels like opening the door to specific kinds of syntax that "feel" okay adds a bad incentive to make syntax "simpler," at the potential cost of clarity.

@kennytm
Copy link
Member

kennytm commented May 7, 2024

Yeah as I've mentioned in #2795 perhaps it's better to go all-in supporting {(expr)}. You'll need to deal with side-effect due to custom Deref anyway.


BTW #2795's Future possibilities section also suggested using {(expr)} even if expr are only field access:

Future discussion on this topic may also focus on adding interpolation for just a subset of possible expressions, for example dotted.paths. We noted in debate for this RFC that particularly for formatting parameters the existing dollar syntax appears problematic for both parsing and reading, for example {self.x:self.width$.self.precision$}.

The conclusion we came to in the RFC discussion is that adding even just interpolations for dotted.paths will therefore want a new syntax, which we nominally chose as the {(expr)} syntax already suggested in the interpolation alternative section of this RFC.

Using this parentheses syntax, for example, we might one day accept {(self.x):(self.width).(self.precision)} to support dotted.paths and a few other simple expressions. The choice of whether to support an expanded subset, support interpolation of all expressions, or not to add any further complexity to this macro is deferred to the future.

Using {(expr)} can also resolve the raw-identifier issue in #3626 (comment) — you must write {(r#await.r#await)} because the content is parsed as an actual Rust expression, rather than cooked identifiers separated by dots.

@clarfonthey
Copy link
Contributor

clarfonthey commented May 8, 2024

I like (expr) a lot because it makes things substantially easier to read and understand when used to configure formatting parameters, rather than adding in extra $ which could be confused as having semantic meaning.

One concern I'd have with fully supporting (expr) is how rustfmt should handle these expressions nested inside formatting strings. Presumably, you'd just treat it like any other nested expression:

println!("Hello {(match self.pref(person) {
    Pref::Like => "friend",
    Pref::Neutral => "there",
    Pref::Dislike => "you piece of trash"
})}!");

And of course, there could be some confusion with {{expr}}, which would just print the actual code inside the braces, instead of the result. Of course, this would probably be mostly alleviated by syntax highlighting, since I'd expect IDEs to syntax-highlight the code inside the parentheses if it's valid.

Another way would also be to permit spaces around the formatting string itself within the braces, which feels like it'd be reasonable to have:

println!("{greeting} { (person.await) }!")

And we could also make this the default for rustfmt when any expression is used inside the braces. This feels ideal since I'm certain that people would like to be able to use {{ and }} to print actual code and wouldn't want to have to selectively deal with warnings depending on whether the code represents Rust syntax.

@programmerjake
Copy link
Member

println!("Hello {(match self.pref(person) {
    Pref::Like => "friend",
    Pref::Neutral => "there",
    Pref::Dislike => "you piece of trash"
})}!");

Unless Rust added something like JavaScript's template strings (which is probably the best way to get something like this to work), the quotes inside the format string need additional escaping.

@clarfonthey
Copy link
Contributor

println!("Hello {(match self.pref(person) {
    Pref::Like => "friend",
    Pref::Neutral => "there",
    Pref::Dislike => "you piece of trash"
})}!");

Unless Rust added something like JavaScript's template strings (which is probably the best way to get something like this to work), the quotes inside the format string need additional escaping.

Right, you'd probably want to use raw strings here instead, which sidesteps the issue. Although it feels like the parser should be able to recover here and suggest that.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented May 10, 2024 via email

@jdonszelmann
Copy link

A very common thing I do is formatting self.something in custom Display implementations. Simply for that, this would be amazing.

@joshtriplett
Copy link
Member Author

A very common thing I do is formatting self.something in custom Display implementations. Simply for that, this would be amazing.

That's exactly the use case that motivated me to write this in the first place. :)

argument capture:

```rust
println!("{self.value:self.width$.self.precision$}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?
Not sure how it follows from the rest of the RFC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

println!("{0:1$.2$}", self.value, self.width, self.precision);

follows trivially from the definition of argument in the fmt grammar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I have never seen the name$ syntax before, but apparently it's not even a new feature, works since 1.0.

@jkelleyrtp
Copy link

FWIW I use formatting frequently to print the length of lists or do some internal formatting, and it'd be nice to support simple method calls

print!("{list.len()}"
print!("{date.utc()}"

Not sure how that would interact with the RFC but I'd get way more mileage out of that than just field access. Raw exprs would be ideal of course.

This particular method suggestion starts to get hairy when you have arguments to those method calls since those are technically exprs. Not sure how to resolve it without having some weird inconsistency in what can and can't be formatted.

print!("{date.utc(arg)}"

@shepmaster
Copy link
Member

A very common thing I do is formatting self.something in custom Display implementations. Simply for that, this would be amazing.

FWIW, in stable Rust I destructure self. This frequently lets the format macro be smaller (and IMO more readable):

format!("{} jumps over {}", self.source, self.target);

// Current Rust
let Self { source, target } = self;
format!("{source} jumps over {target}");

// Alternate stable version
format!("{source} jumps over {target}", source = self.source, target = self.target);

// RFC proposed
format!("{self.source} jumps over {self.target}");

@shepmaster
Copy link
Member

it'd be nice to support simple method calls

I do not wish to prevent this RFC from moving forward, but I'd also like this as a next step. A common case for me is printing Paths / PathBufs in error values (shown here using SNAFU's syntax):

#[snafu(display("Could not delete the crate file {}", path.display()))]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.