Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Self override and multiple values don't interact well together #1374

Closed
Elarnon opened this issue Nov 8, 2018 · 7 comments · Fixed by #2297, #2350 or #2696
Closed

Self override and multiple values don't interact well together #1374

Elarnon opened this issue Nov 8, 2018 · 7 comments · Fixed by #2297, #2350 or #2696
Assignees
Labels
A-parsing Area: Parser's logic and needs it changed somehow. C-bug Category: Updating dependencies
Milestone

Comments

@Elarnon
Copy link

Elarnon commented Nov 8, 2018

Affected Version of clap

master and v3-master (did not try crates.io versions)

Bug or Feature Request Summary

Arguments overriding themselves with multiple values have... surprising behavior. Consider the sample code below:

Sample Code or Link to Sample Code

extern crate clap;
use clap::{App, Arg};

fn main() {
    let matches = App::new("MyApp")
        .arg(
            Arg::with_name("input")
                .takes_value(true)
                .long("input")
                .overrides_with("input")
                .min_values(0),
        ).get_matches();

    if let Some(vs) = matches.values_of("input") {
        println!("{:?}", vs.collect::<Vec<_>>());
    }
}

Expected Behavior Summary

One would expect this output, where each new occurrence of the --input argument overrides the preceding one:

$ ./target/debug/example --input a b c --input d
["d"]
$ ./target/debug/example --input a b --input c d
["c", "d"]

Actual Behavior Summary

However on current clap master 33f908d there is no difference between those two invocations:

$ ./target/debug/example --input a b c --input d
["c", "d"]
$ ./target/debug/example --input a b --input c d
["c", "d"]

Interestingly, Clap v3 seems to have changed something, since on v3-master eaa0700 I get this:

$ ./target/debug/example --input a b c --input d
["d"]
$ ./target/debug/example --input a b --input c d
["d"]

which is unfortunately not much better.

Consideration and thoughts

This seem to be caused by the self-override code being written with the use case of arguments with a single value in mind. This also causes other issues, for instance if I change min_values(0) to number_of_values(2) in the above code I get:

$ ./target/debug/example --input a b --input c d
error: Found argument 'd' which wasn't expected, or isn't valid in this context

on master and:

$ ./target/debug/example --input a b --input c d
error: The argument '--input <input> <input>' requires 2 values, but 1 was provided

on v3-master.

The offending code is here:

if let Some(ma) = self.get_mut(aa.name()) {
if ma.vals.len() > 1 {
// swap_remove(0) would be O(1) but does not preserve order, which
// we need
ma.vals.remove(0);
ma.occurs = 1;
} else if !aa.takes_value() && ma.occurs > 1 {
ma.occurs = 1;
}
}

Interestingly, it somehow gets called twice (before parsing the overriding argument, and after parsing it), which is why we need --input a b c --input d to trigger the bug. Doing only --input a b --input c would properly remove the a and b (if you are wondering: yes, --input --input a b only prints ["b"]).

The corresponding code in v3 is here, and it now pops off and only keeps the final argument value (hence the behavior change). It is also now called only once, after the overriding argument has been parsed:

clap/src/parse/parser.rs

Lines 1328 to 1338 in eaa0700

if let Some(ma) = matcher.get_mut(name) {
if ma.occurs < 2 {
continue;
}
ma.occurs = 1;
if !ma.vals.is_empty() {
// This avoids a clone
let mut v = vec![ma.vals.pop().expect(INTERNAL_ERROR_MSG)];
mem::swap(&mut v, &mut ma.vals);
}
}

I am not familiar with the code at all, but from my understanding properly fixing this would require a way of knowing how much arguments were parsed in the current occurrence, in order to only keep that amount. Unfortunately, there doesn't seem to be a way of getting that information currently.

One solution would be to add an occurrences vector to the MatchedArg which keeps track of the position in the vals vector when a new occurrence is encountered, and can be used to properly pop off all values but the ones associated with the last occurrence when an override occurs. This can be extended by also keeping track of the position in the indices vector, since currently self-overrides "leak" through the indices (i.e. the indices of the overridden occurrences are kept), and can be used as a basis to solve #1026. This may also be the opportunity to change the behavior of min_values and max_values in clap 3 (or add a setting) to be per-occurrence instead of sum-of-occurrences (at least for me, that would be more intuitive -- but this would probably require discussion beforehand).

I have a proof-of-concept implementation of the above which I can make into a proper PR if the proposed solution seems reasonable, or I can try and implement another suggestion :)

Elarnon added a commit to Elarnon/clap that referenced this issue Nov 8, 2018
Elarnon added a commit to Elarnon/clap that referenced this issue Nov 8, 2018
…f-override

When an argument gets self-overriden, at most a single value[1] gets
removed from the existing values list.  This is incorrect when the
argument has multiple values per occurrence, and makes for "funny" bugs
such as `--input --input a b` being parsed as `--input b` and
`--input a b c --input d` being parsed as `--input c d`.

This patch fixes the issue by keeping track of how many values were
already present when we started parsing each occurrence, and removing
all the values but the ones for the last occurrence when a self-override
occurs.

[1]: Actually at most two values due to clap handling the overrides
twice, which is a separate bug fixed in v3.
@Elarnon
Copy link
Author

Elarnon commented Nov 8, 2018

I ended up going ahead and creating #1376

Elarnon added a commit to Elarnon/clap that referenced this issue Dec 3, 2018
Elarnon added a commit to Elarnon/clap that referenced this issue Dec 3, 2018
…f-override

When an argument gets self-overriden, at most a single value gets kept
from the existing values list.  This is incorrect when the argument has
multiple values per occurrence, and makes for "funny" bugs such as
`--input a b --input c d` being parsed as `--input d`.

This patch fixes the issue by keeping track of how many values were
already present when we started parsing each occurrence, and removing
all the values but the ones for the last occurrence when a self-override
occurs.
@CreepySkeleton
Copy link
Contributor

Hey @Elarnon , kudos for writing this detailed step-by-step bug report! It's a real pleasure reading well-written explanations in simple fluent language that instantly lands in your brain without a need to parse "what that person meant". Bonus points for the attempt to fix it, I promise we'll get to it eventually.

@CreepySkeleton CreepySkeleton added C: args A-parsing Area: Parser's logic and needs it changed somehow. E-help-wanted Call for participation: Help is requested to fix this issue. C-bug Category: Updating dependencies and removed cc: CreepySkeleton labels Feb 6, 2020
@CreepySkeleton CreepySkeleton added this to the 3.0 milestone Feb 6, 2020
@pksunkara pksunkara added C: asserts and removed E-help-wanted Call for participation: Help is requested to fix this issue. C: asserts labels Apr 9, 2020
ldm0 added a commit to ldm0/clap that referenced this issue Jan 27, 2021
@ldm0 ldm0 mentioned this issue Jan 27, 2021
ldm0 added a commit to ldm0/clap that referenced this issue Jan 27, 2021
@ldm0
Copy link
Member

ldm0 commented Feb 16, 2021

Nope, this behaviour is not correct. As the override_with's documentation states, the override bahviour is disabled when one of the Multiple* enabled. While min_values introduces MultipleValues, the correct behaviour is both of them returning ["a", "b", "c" "d"].

Why currently it works? This is because currently the min_value's MultipleValue is only happens on positional arguments, which is definitely a hack(and why flag with min_value still works, it's because after self overriding, it ca still pass the multiple argument usage checking).

@pksunkara
Copy link
Member

@ldm0 I am recently looking at this part of the code. How do I achieve the following scenarios (assuming number_of_values(3)):

// All occurrences should build to 3 values and they should be additive
$ prog -o 1 2 -o 3
[1, 2, 3]
$ prog -o 1 -o 3
error
// All occurrences should have 3 values and they should override
$ prog -o 1 2 3 -o 4 5 6
[4, 5, 6]
$ prog -o 1 2
error
$ prog -o 1 2 3 -o 4
error

@pksunkara pksunkara reopened this Jun 16, 2021
@ldm0
Copy link
Member

ldm0 commented Aug 8, 2021

@ldm0 I am recently looking at this part of the code. How do I achieve the following scenarios (assuming number_of_values(3)):

// All occurrences should build to 3 values and they should be additive
$ prog -o 1 2 -o 3
[1, 2, 3]
$ prog -o 1 -o 3
error
// All occurrences should have 3 values and they should override
$ prog -o 1 2 3 -o 4 5 6
[4, 5, 6]
$ prog -o 1 2
error
$ prog -o 1 2 3 -o 4
error

Sorry for the long delay, I haven't seen this @ until now. The first one can be easily achieved by:

fn main() {
    let m = clap::App::new("hello")
        .arg(
            clap::Arg::new("foo")
                .long("foo")
                .required(true)
                .multiple_values(true)
                .multiple_occurrences(true)
                .number_of_values(3),
        )
        .get_matches();
    dbg!(m);
}

But we cannot get the second behaviour now. Check: https://docs.rs/clap/3.0.0-beta.2/clap/struct.Arg.html#method.overrides_with . It's disabled when multiple* is on. But I think we can achieve this by adding a new setting now(or making a breaking change to make it work when multiple* is on).

@ldm0 ldm0 self-assigned this Aug 8, 2021
@pksunkara
Copy link
Member

I was thinking the same which is why I opened the issue. Definitely need to be resolved for 3.0

@pksunkara pksunkara removed the W: 3.x label Aug 13, 2021
@ldm0
Copy link
Member

ldm0 commented Aug 14, 2021

I was thinking the same which is why I opened the issue. Definitely need to be resolved for 3.0

@pksunkara I think making a breaking change is better. Previously, self override doesn't working when multiple* is on seems like a technical limitation. Currently we have grouped values, I think the behaviour of self override can be: Working when multiple values is enabled, stop working when multiple occurrences is enabled. This is more intuitive I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-parsing Area: Parser's logic and needs it changed somehow. C-bug Category: Updating dependencies
Projects
None yet
4 participants