Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typed Config Access #5552

Merged
merged 7 commits into from
May 30, 2018
Merged

Typed Config Access #5552

merged 7 commits into from
May 30, 2018

Conversation

ehuss
Copy link
Contributor

@ehuss ehuss commented May 19, 2018

This introduces a new API for accessing config values using serde to
automatically convert to a destination type. By itself this shouldn't
introduce any behavioral changes (except for some slight wording changes to
error messages). However, it unlocks the ability to use richer data types in
the future (such as profile, or source). Example:

let p: Option<TomlProfile> = config.get("profile.dev")?;

Supports environment variables when fetching structs or maps. Note that it can
support underscores in env var for struct field names, but not maps. So for
example, "opt_level" works, but not "serde_json" (example:
CARGO_PROFILE_DEV_OVERRIDES_serde_OPT_LEVEL). I don't have any ideas for a
workaround (though I feel this is an overuse of env vars).

It supports environment variables for lists. The value in the env var will get
appended to anything in the config. It uses TOML syntax, and currently only
supports strings. Example: CARGO_FOO=['a', 'b']. I did not modify
get_list to avoid changing behavior, but that can easily be changed.

This introduces a new API for accessing config values using serde to
automatically convert to a destination type.  By itself this shouldn't
introduce any behavioral changes (except for some slight wording changes to
error messages).  However, it unlocks the ability to use richer data types in
the future (such as `profile`, or `source`).  Example:

```rust
let p: Option<TomlProfile> = config.get("profile.dev")?;
```

Supports environment variables when fetching structs or maps.  Note that it can
support underscores in env var for struct field names, but not maps.  So for
example, "opt_level" works, but not "serde_json" (example:
`CARGO_PROFILE_DEV_OVERRIDES_serde_OPT_LEVEL`).  I don't have any ideas for a
workaround (though I feel this is an overuse of env vars).

It supports environment variables for lists.  The value in the env var will get
appended to anything in the config.  It uses TOML syntax, and currently only
supports strings.  Example:  `CARGO_FOO=['a', 'b']`.  I did *not* modify
`get_list` to avoid changing behavior, but that can easily be changed.
@rust-highfive
Copy link

r? @matklad

(rust_highfive has picked a reviewer for you, use r? to override)

@ehuss
Copy link
Contributor Author

ehuss commented May 19, 2018

Notes:

  • The API does not return Definition information. This is sometimes useful if you need to do additional validation, and want to provide helpful error messages. If we still want to support this for tables, I suggest changing ConfigValue to use Definition instead of PathBuf, and update get_table to fetch env variables. Otherwise I'd recommend just deprecating/removing get_table.
  • Let me know if get_list should be updated to read env vars (I was avoiding changing too much).
  • I sprinkled some TODO comments with questions I had. If you could look at some of them, I'd appreciate it.
  • I am no serde expert, so I'm uncertain if anything is done the best way.
  • I updated a few places to use the new API just to demonstrate it. I didn't update anything big like removing get_table.
  • I struggled a lot with the error handling, and some of it still seems awkward to me. Please let me know if you see better ways to handle it. (A big issue is that serde's error doesn't play well with failure's error.)
  • Would appreciate feedback for better wording on error messages.
  • I added num_traits to the build, I hope that's OK.
  • Future PRs that update source replacement with the new API should update config.md.
  • I think it would be fairly easy to support additional data types in lists, but I feel like that's something that can be added when needed.
  • I included ConfigRelativePath as a demo. It gets paths relative to where the .cargo/config file lives. I noticed paths treated like this in a few places, but it can be removed it if it doesn't seem useful.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks amazing, very awesome work! It's going to be so much nicer to deserialize rich types in Config from now on.

The API does not return Definition information

I think this is a safe restriction, any further validation I think can be done directly in the Deserialize implementation which would avoid the need for the "source-ness" to leak out of the config module.

I struggled a lot with the error handling, and some of it still seems awkward to me.

At a cursory glance everything looked pretty much in order except for the one location that a failure::Error was stringified, but did you see the error messages coming out as being not-so-great?

I included ConfigRelativePath as a demo. It gets paths relative to where the .cargo/config file lives. I noticed paths treated like this in a few places, but it can be removed it if it doesn't seem useful.

Nice! This seems similar to supporting spans in toml-rs :)

let env_key = key.to_env();
let def = Definition::Environment(env_key.clone());
if let Some(v) = config.env.get(&env_key) {
if !(v.starts_with("[") && v.ends_with("]")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be placed behind a feature gate to start off with?

where
V: de::DeserializeSeed<'de>,
{
// TODO: Is it safe to assume next_value_seed is always called
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so yeah, and we're only using one deserializer here so I don't think we need to attempt to be too strict about this

// CARGO_PROFILE_DEV_OVERRIDES_bar_OPT_LEVEL = 3
let rest = &env_key[env_pattern.len()..];
// rest = bar_OPT_LEVEL
let part = rest.splitn(2, "_").next().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something I'm not entirely sure we want to do just yet (but is totally fine behind a feature gate of course). Struct keys like replace_with won't work here I think, which at least hurts the usability of [source] tables :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A feature gate sounds like a good idea. Struct keys like replace_with should work fine (same as opt_level) because structs go through the new_struct code path which knows what the names are and can handle dashes/underscores properly. It's only things like HashMap that don't work properly (like source names).

visitor.visit_bool(v.parse().unwrap())
} else if let Ok(v) = v.parse::<i64>() {
visitor.visit_i64(v)
} else if v.starts_with("[") && v.ends_with("]") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to below I think we'll want to start out with this behind a feature gate

}
}

// TODO: AFAIK, you cannot override Fail::cause (due to specialization), so we
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by this, couldn't the failure::Error be stored directly in ConfigError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but it is not possible to implement Fail::cause() (because there is a blanket impl of Fail for std::Error you get a conflicting implementation error). I could maybe write a method that creates a failure::Error by chaining ConfigError with the inner Fail error, but it seems like it would be easy to accidentally forget to call that method in the right places (? or into would circumvent it causing the chain to be lost).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I think I see now what's going on. Perhaps the raw failure object could be stored for now though and the rendering could happen in the Display impl? Other than that though I think this is functionally fine in that there's not too many underlying errors coming out here anyway

None => Ok(2),
}
}

// TODO: why is this pub?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm not sure! Probably safe to make private

@@ -541,6 +630,7 @@ impl Config {
!self.frozen && !self.locked
}

// TODO: this was pub for RLS but may not be needed anymore?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not too hard I think this'd be good to stick around, I believe cargo-vendor also makes use of this

}
}

pub fn net_retry(&self) -> CargoResult<i64> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably be changed to directly return a CargoResult<u32> nowadays

)
fn get_integer<T>(&self, key: &ConfigKey) -> Result<Option<Value<T>>, ConfigError>
where
T: FromStr + num_traits::NumCast + num_traits::Bounded + num_traits::Zero + fmt::Display,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually a little surprised this is necessary, shouldn't Serde be doing all this under the hood? In that if you deserialize a -1 during a visit_u32, I think serde returns a canned error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, you're right. The error messages aren't quite as nice, but are good enough, and probably not worth the extra complexity.

@alexcrichton
Copy link
Member

r? @alexcrichton

@ehuss
Copy link
Contributor Author

ehuss commented May 23, 2018

At a cursory glance everything looked pretty much in order except for the one location that a failure::Error was stringified, but did you see the error messages coming out as being not-so-great?

It's not so much the text as all the different conversions that are sprinkled all over. Like is .map_err(|e| e.into()) the simplest way? There is a call format!("{}", e) because I couldn't figure out how to call to_string() since there isn't a specific FromStr::Err type. And of course the From impl for failure is a little odd.

@alexcrichton
Copy link
Member

One thing I'd ideally like to see is more config moved over to this system, but other than that this seems mostly ready to go to me?

Like is .map_err(|e| e.into()) the simplest way?

Ok(foo()?) is another option, but it's kinda half and half either way

@matklad
Copy link
Member

matklad commented May 24, 2018

An interesting next step here is to use a typed key pattern:

pub struct Key<T> {
    pub name: &'static str,
    pub marker: PhantomData<T>,
}

/// Docstring
pub const JOBS: Key<u32> = Key { name: "build.jobs", marker: PhantomData};


fn foo(config: &Config) {
     let jobs = config.get(JOBS); // Note: no turbo fish necessary
}

The main benefit of this is that config options are turned into const values, like JOBS, which make it possible to gather all various configs values in one place, and write docs for them. It also helps to avoid typos in key's names, and removes the requirement for a turbo-fish at a call site. Here's an example of this approach: exonum/exonum#417.

We might or might not want to use this pattern in Cargo :)

@ehuss
Copy link
Contributor Author

ehuss commented May 25, 2018

I updated the error type to wrap failure::Error. Alternatively it could wrap failure::Fail, I'm uncertain which would be better.

The typed key pattern looks interesting! Let me know if you want to do that now, or maybe extend it later. I imagine this functionality will need some tweaking once it starts getting used for more complex types, so I expect it will evolve a little over time.

@alexcrichton
Copy link
Member

The typed keys sounds like a cool idea! I'd be ok though leaving that to a follow-up PR

@@ -847,21 +847,21 @@ impl fmt::Display for ConfigKey {
/// Internal error for serde errors.
#[derive(Debug)]
pub struct ConfigError {
message: String,
error: failure::Error,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have pub use failure::Error as CargoError somewhere. I don't know if we want to use failure::Error or CargoError, but currently we use CargoError in the majority of the code base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda assumed that was there just for being compatible with the old CargoError which used to be its own type. @alexcrichton was it intended to continue to maintain that abstraction? The use of failure::Error is being used in about 4 other places, so it is leaking a little.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that was actually a holdover to make the "migrate to failure" patch smaller than it otherwise would be. Nowadays we should just switch to failure::Error everywhere I think, the transition to failure just isn't fully done yet

@ehuss
Copy link
Contributor Author

ehuss commented May 30, 2018

I think this is ready, I'm not sure if I missed anything. Are there any other changes you'd like? I was intending for updates to the users of the config to be done in follow-up PRs.

@alexcrichton
Copy link
Member

@bors: r+

Oh oops sorry! I was gonna wait for more bits and pieces to move over but if you'd like to land this that also sounds good to me!

@bors
Copy link
Contributor

bors commented May 30, 2018

📌 Commit b3db164 has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented May 30, 2018

⌛ Testing commit b3db164 with merge 5fc6ead...

bors added a commit that referenced this pull request May 30, 2018
Typed Config Access

This introduces a new API for accessing config values using serde to
automatically convert to a destination type.  By itself this shouldn't
introduce any behavioral changes (except for some slight wording changes to
error messages).  However, it unlocks the ability to use richer data types in
the future (such as `profile`, or `source`).  Example:

```rust
let p: Option<TomlProfile> = config.get("profile.dev")?;
```

Supports environment variables when fetching structs or maps.  Note that it can
support underscores in env var for struct field names, but not maps.  So for
example, "opt_level" works, but not "serde_json" (example:
`CARGO_PROFILE_DEV_OVERRIDES_serde_OPT_LEVEL`).  I don't have any ideas for a
workaround (though I feel this is an overuse of env vars).

It supports environment variables for lists.  The value in the env var will get
appended to anything in the config.  It uses TOML syntax, and currently only
supports strings.  Example:  `CARGO_FOO=['a', 'b']`.  I did *not* modify
`get_list` to avoid changing behavior, but that can easily be changed.
@bors
Copy link
Contributor

bors commented May 30, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 5fc6ead to master...

@bors bors merged commit b3db164 into rust-lang:master May 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants