Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support JCommander-style parameter files #551

Closed
triceo opened this issue Nov 19, 2018 · 28 comments
Closed

Support JCommander-style parameter files #551

triceo opened this issue Nov 19, 2018 · 28 comments

Comments

@triceo
Copy link
Contributor

triceo commented Nov 19, 2018

It is possible that this is a feature, but I did not find it in the documentation. See my Github repo for a complete reproducer.

Steps to reproduce

Create an argument file argFile such as this:

-f "C:\Users\me\somefile.txt"

Run Picocli like this:

CommandLine.run(..., "@argFile");

Observe that the value of the String-typed Option has been changed to C:UsersmesomeFile.txt.

Additional information

  • When the value is not quoted, everything works fine. This suggests a problem with not escaping the Windows backslash character. However, the quotes are used in order not to have to escape whitespace characters and therefore this workaround creates a different problem - either I escape the backslashes or the whitespace.
  • When the value is kept quoted but passed directly on the command line (as opposed to the argument file), the problem no longer shows up. Unfortunately, the code will include the quotes in the resulting value. There is a second test included to showcase that problem.
@remkop
Copy link
Owner

remkop commented Nov 19, 2018

Thanks for the report! I’ll look into this as soon as I can.

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

I have reproduced the problem. This seems to be the behaviour of java.io.StreamTokenizer that is used internally: it treats quoted Strings as if they are String literals in Java source code, where \ characters need to be escaped with another \.

This is not great. Looking into solutions...

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

About the second issue you mention, that " quote values are preserved when specified on the command line: this is by design but is configurable.

You can switch this off with commandLine.setTrimQuotes(true).

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

Sorry I meant commandLine.setTrimQuotes(true).

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

Looking at the specification for @-files for the java command, it seems that backslashes in quoted strings always need to be escaped and quotes in @-files are always removed. This is consistent with what picocli does...

From the spec:

For example, it is possible to have a path with a space, such as c:\Program Files that can be specified as either "c:\\Program Files" or, to avoid an escape, c:\Program" "Files.

However, this documentation may be out-of-date. I will test with the Tokenizer class in the above sun/tools/javac/main/CommandLine.java link.

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

Hmm... I tested with the Tokenizer class used in Java 11 but it behaves the same as picocli: the 2nd argument in your example is parsed as C:UsersmesomeFile.txt.

I think it makes sense for picocli's @-file behaviour to be consistent with Java's @-file behaviour, what do you think?

@triceo
Copy link
Contributor Author

triceo commented Nov 20, 2018

@remkop It makes sense to be consistent.

That said, it is an incredible pain for the users who have to escape everything. Especially the beginner users of my app will be confused by this and it will have to be carefully explained.

(A bit of context: I migrated my codebase from JCommander to Picocli and JCommander handled this particular situation without problems.)

@remkop
Copy link
Owner

remkop commented Nov 20, 2018

Thanks for the background. Let me check how JCommander does it.

@remkop
Copy link
Owner

remkop commented Nov 21, 2018

Looks like JCommander requires every argument to be on a separate line in the @argument-file.
That simplifies parsing: it just ignores quotes altogether and arguments can have embedded whitespace (other than line breaks).

I can add a parser configuration to allow application authors to switch between the current spec and the JCommander behaviour. The default will be the current behaviour (both to avoid breaking existing applications and to be consistent with Java's @-file behaviour). I am thinking to add a method for this similar to CommandLine::setExpandAtFiles(boolean) and ParserSpec::expandAtFiles(boolean). (Also, should there a system property for end users to control this?)

The new behaviour will do the following:

  • lines starting with the # character are comments and are ignored
  • every line in the @-file (other than lines starting with #) is passed to the parser as a separate command line argument
  • quotes in the @-file have no special meaning and remain part of the command line argument passed to the parser
  • embedded whitespace does not need to be escaped
  • backslashes do not need to be escaped
  • command line arguments cannot contain newlines

What would be a good name for the configuration with this behaviour?

@triceo
Copy link
Contributor Author

triceo commented Nov 21, 2018

@remkop When it comes to naming, I keep thinking in terms of "simplified" or "per line". Neither of those say much about what's actually going to happen, though. Perhaps we could be speaking of JCommander emulation - but that still doesn't explicitly say what it does.

How about "raw mode"? No escaping means you get exactly what you write.

@triceo
Copy link
Contributor Author

triceo commented Nov 21, 2018

Literal mode?

@remkop
Copy link
Owner

remkop commented Nov 21, 2018

These would become methods on CommandLine (prefixed with set) and on ParserSpec (without prefix). I think the method should have the word "AtFile" in it. For reference, we currently have ParserSpec::atFileCommentChar(Character) and ParserSpec::expandAtFiles(boolean) and their CommandLine::setXXX equivalents.

So potential candidates so far:

  • useSimpliedAtFiles
  • useSimpliedAtFileSyntax
  • usePerLineAtFile
  • useRawModeAtFile (or atFilesAreRaw)
  • useLiteralModeAtFile (or atFilesAreLiteral)

@triceo
Copy link
Contributor Author

triceo commented Nov 21, 2018

I like either atFilesAreLiteral or useSimplifiedAtFiles.

@remkop
Copy link
Owner

remkop commented Nov 21, 2018

I like useSimplifiedAtFiles too.

Do you feel like helping with this? The change itself is fairly straightforward (see diff below), but there's a bit of work creating the getters/setters for useSimplifiedAtFiles on CommandLine and ParserSpec (similar to the way it's done for expandAtFiles), and we need to add tests. (For reference, see CommandLineTest testAtFileExpandedXxx and testSetAtFileCommentChar_Before/AfterSubcommandsAdded.)
Also, we should update the user manual.

Will you be able to provide a PR for this?

Index: src/main/java/picocli/CommandLine.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/main/java/picocli/CommandLine.java	(revision )
+++ src/main/java/picocli/CommandLine.java	(revision )
@@ -6641,17 +6641,26 @@
             LineNumberReader reader = null;
             try {
                 reader = new LineNumberReader(new FileReader(file));
-                StreamTokenizer tok = new StreamTokenizer(reader);
-                tok.resetSyntax();
-                tok.wordChars(' ', 255);
-                tok.whitespaceChars(0, ' ');
-                tok.quoteChar('"');
-                tok.quoteChar('\'');
-                if (commandSpec.parser().atFileCommentChar() != null) {
-                    tok.commentChar(commandSpec.parser().atFileCommentChar());
-                }
-                while (tok.nextToken() != StreamTokenizer.TT_EOF) {
-                    addOrExpand(tok.sval, result, visited);
+                if (commandSpec.parser.useSimplifiedAtFiles()) {
+                    String token;
+                    while ((token = reader.readLine()) != null) {
+                        if (!token.startsWith("#")) {
+                            addOrExpand(token, result, visited);
+                        }
+                    }
+                } else {
+                    StreamTokenizer tok = new StreamTokenizer(reader);
+                    tok.resetSyntax();
+                    tok.wordChars(' ', 255);
+                    tok.whitespaceChars(0, ' ');
+                    tok.quoteChar('"');
+                    tok.quoteChar('\'');
+                    if (commandSpec.parser().atFileCommentChar() != null) {
+                        tok.commentChar(commandSpec.parser().atFileCommentChar());
+                    }
+                    while (tok.nextToken() != StreamTokenizer.TT_EOF) {
+                        addOrExpand(tok.sval, result, visited);
+                    }
                 }
             } catch (Exception ex) {
                 throw new InitializationException("Could not read argument file @" + fileName, ex);

@triceo
Copy link
Contributor Author

triceo commented Nov 21, 2018

@remkop I'll take a look when I can find a little time.

@remkop
Copy link
Owner

remkop commented Nov 21, 2018

Would be great!

@remkop
Copy link
Owner

remkop commented Nov 26, 2018

I am considering doing a minimal implementation for the above change that can be controlled only with a system property -Dpicocli.useSimplifiedAtFiles.

Not introducing new API reduces the testing and documentation effort. I also imagine that end users would like some way to control this, rather than CLI application authors.

@triceo
Copy link
Contributor Author

triceo commented Nov 26, 2018

@remkop I am not sure there. Shouldn't the app developers be in control of how their app should behave and what kinds of inputs it will accept? In my opinion, leaving it to the user of the app opens the developer/distributor of the app to have to support something that they did not intend to.

@remkop
Copy link
Owner

remkop commented Nov 26, 2018

Generally I would agree, but in this case it is about whether the user specifies an argument file where each parameter is on a separate line, or an argument file where a line may have multiple parameters, and quotes will be stripped off.

This happens transparently to the application; the application just sees the resulting arguments after picocli did the pre-processing of the argument file.

It is, however, not transparent to the user. They need to know what they are doing. If they set -Dpicocli.useSimplifiedAtFiles they cannot provide an argument file with multiple args on a single line (or they will get unexpected results).

That said, if the application needs to take control, they can: the application can just explicitly call System.setProperty("useSimplifiedAtFiles", "true") (or "false") before parsing the command line arguments. The application would of course need to explain what kind of argument files it supports in its documentation, whereas otherwise they could point to the picocli documentation.

@triceo
Copy link
Contributor Author

triceo commented Nov 26, 2018

Ok, in my case, I will be enforcing it through the property in my application. Anyway - if you're implementing this, do you still need anything from me? I was going to take a look at the changes you suggested towards the end of this week.

@remkop
Copy link
Owner

remkop commented Nov 26, 2018

You may still be there faster than me... Very busy with day job recently.
Otherwise, do you think you can contribute some unit tests?

@triceo
Copy link
Contributor Author

triceo commented Nov 26, 2018

@remkop OK, let's make a deal - you introduce the code, I'll add the tests. :-)

@remkop
Copy link
Owner

remkop commented Nov 26, 2018

Sounds good! :-)

remkop added a commit that referenced this issue Nov 28, 2018
@remkop
Copy link
Owner

remkop commented Nov 28, 2018

This is now implemented on master: if system property picocli.useSimplifiedAtFiles is defined, either without a value or with a value of (case-insensitive) "true", then each line in the argument file is interpreted as a separate argument.

I will update the documentation.

Can you provide unit tests?

@triceo
Copy link
Contributor Author

triceo commented Nov 28, 2018

@remkop Sure, I'll take a look this weekend. Any idea for when we could have a release?

@remkop
Copy link
Owner

remkop commented Nov 28, 2018

I was thinking this weekend.

@remkop remkop changed the title Windows file path mangled when quoted Support JCommander-style parameter files Nov 29, 2018
@triceo
Copy link
Contributor Author

triceo commented Dec 1, 2018

I believe this is closed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants