refactor!: parsing, revisit short option groups, add support for combined short and value #75

shadowspawn · 2022-03-04T00:56:53Z

This incorporates ideas from #64, #68, and #69. Background on short options in #2.

Refactor parsing to use independent blocks of code, rather than nested cascading context. This makes it easier to reason about the behaviour.
Split out small pieces of logic to named routines to improve readability, and allow extra documentation and examples without cluttering the parsing. (Thanks to @aaronccasanova for inspiration.)
Existing tests untouched to make it clear that the tested functionality has not changed.
Be more explicit about short option group expansion, and ready to throw an error in strict mode for a string option in the middle of the argument. (See "Is strict actually a goal?" #11 and "feat: Add strict mode to parser" #74.)
Add support for a short option combined with its value (without intervening =). This is what Commander and the Open Group Utility Conventions do, but not what Yargs does. I don't want to block the PR on this and am happy to comment it out for further discussion if needed. (I have found some interesting variations in the wild.) [Edit: see also "Syntax for combined short option and value" #78]
Add support for multiple unit test files. Expand tests from 33 to 113, but many are for internal routines rather than testing the exposed API.
Added .editorconfig file, mainly for my own convenience!
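
To make the short option group point concrete, here is a minimal sketch of one plausible expansion strategy. It is not the code from this PR: `expandShortGroup`, `stringShorts`, and the `strict` flag are hypothetical names used only for illustration, and the choice to treat the rest of the argument as a value in non-strict mode is an assumption.

```javascript
'use strict';

// Hypothetical helper (not the PR's implementation).
// Expands a short option group such as '-abc' into separate short options.
// `stringShorts` is an assumed Set of short options that take a value.
function expandShortGroup(arg, stringShorts = new Set(), strict = false) {
  const expanded = [];
  for (let index = 1; index < arg.length; index++) {
    const short = arg[index];
    const isLast = index === arg.length - 1;
    if (!stringShorts.has(short) || isLast) {
      // Boolean option, or a value-taking option in the last position.
      expanded.push(`-${short}`);
      continue;
    }
    // A value-taking option in the middle of the group.
    if (strict) {
      throw new Error(`Option -${short} takes a value and must come last in ${arg}`);
    }
    // Best effort: treat the rest of the argument as the option's value.
    expanded.push(`-${arg.slice(index)}`);
    break;
  }
  return expanded;
}

console.log(expandShortGroup('-abc')); // [ '-a', '-b', '-c' ]
console.log(expandShortGroup('-abfFILE', new Set(['f']))); // [ '-a', '-b', '-fFILE' ]
```

Keeping the expansion in its own routine is what lets it be unit tested separately from the main parsing loop.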

ljharb

We need to add “exports”, if we want the utils not to be part of the public api.

shadowspawn · 2022-03-04T03:31:57Z

I was neglecting the exports from the shim point of view, but makes good sense from the package point of view. Done. Thanks.

shadowspawn · 2022-03-04T04:33:09Z

Added tests for duck-typing of misused options, which came out of the discussion in "Behaviour for zero config --foo=a ?" #24 and "Behaviour for withValue --foo followed by --bar ?" #25.

ljharb

Do note that adding "exports" is a breaking change, which pre-1.0 increments the second number :-)
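
As a rough illustration of why "exports" both hides internal files and counts as a breaking change, the sketch below builds a throwaway package in a temp directory and tries to require a file that is not listed in "exports". The package name `demo-pkg` and its contents are made up for the demo; the actual "exports" map used in this PR is not shown in the thread.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { createRequire } = require('node:module');

// Build a hypothetical package "demo-pkg" under a temporary node_modules.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));
const pkgDir = path.join(root, 'node_modules', 'demo-pkg');
fs.mkdirSync(pkgDir, { recursive: true });
fs.writeFileSync(path.join(pkgDir, 'package.json'), JSON.stringify({
  name: 'demo-pkg',
  version: '0.0.0',
  main: 'index.js',
  // Only the entry point and package.json are reachable from outside.
  exports: { '.': './index.js', './package.json': './package.json' },
}));
fs.writeFileSync(path.join(pkgDir, 'index.js'), 'module.exports = { api: true };\n');
fs.writeFileSync(path.join(pkgDir, 'utils.js'), 'module.exports = { internal: true };\n');

// Resolve modules as if from a consumer file next to node_modules.
const requireFromRoot = createRequire(path.join(root, 'consumer.js'));
const main = requireFromRoot('demo-pkg');

let subpathError;
try {
  requireFromRoot('demo-pkg/utils.js'); // worked before "exports" existed
} catch (err) {
  subpathError = err;
}

console.log(main.api); // true
console.log(subpathError.code); // ERR_PACKAGE_PATH_NOT_EXPORTED

fs.rmSync(root, { recursive: true, force: true });
```

Any consumer that previously deep-required a file such as `utils.js` stops working once "exports" is added, which is why the change warrants a minor bump before 1.0.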

aaronccasanova

Awesome work @shadowspawn! Lot to unpack here and didn't get through everything. Submitting this partial review to provide some initial feedback and introduce some potential discussion points/opportunities for clarification. Note: no comments are blocking, so feel free to resolve any of the callouts 👍

aaronccasanova · 2022-03-05T04:37:55Z

test/short-option-combined-with-value.js

+});
+
+test('when combine string short with value like short option then parsed as value', (t) => {
+  const passedArgs = ['-a-b'];


question (non-blocking): I'm not too familiar with this pattern (e.g. lack of ' ' or '=' delimiters between short option values). Is this common for short options/backed by any guidelines?

These are from the hard-core Open Group guidelines, which do not even mention long options. But I find them useful for understanding the origins and underpinnings of many command-line parsing details.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_01

(Emphasis on last sentence added by me.)

12.1 Utility Argument Syntax
...
2. Option-arguments are shown separated from their options by <blank> characters, except when the option-argument is enclosed in the '[' and ']' notation to indicate that it is optional. This reflects the situation in which an optional option-argument (if present) is included within the same argument string as the option; for a mandatory option-argument, it is the next argument. The Utility Syntax Guidelines in Utility Syntax Guidelines require that the option be a separate argument from its option-argument and that option-arguments not be optional, but there are some exceptions in POSIX.1-2017 to ensure continued operation of historical applications:

a. If the SYNOPSIS of a standard utility shows an option with a mandatory option-argument (as with [ -c option_argument] in the example), a conforming application shall use separate arguments for that option and its option-argument. However, a conforming implementation shall also permit applications to specify the option and option-argument in the same argument string without intervening <blank> characters.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html

The description has been written to make it clear that getopt(), like the getopts utility, deals with option-arguments whether separated from the option by <blank> characters or not. Note that the requirements on getopt() and getopts are more stringent than the Utility Syntax Guidelines.

https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap12.html

SYNOPSIS Shows: -a arg
Conforming application uses: -a arg
System supports: -a arg and -aarg
Non-conforming applications may use: -aarg
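
A minimal sketch of the "-aarg" form from the table above, assuming the parser already knows which short options take a value. `splitShortAndValue` and `stringShorts` are hypothetical names, not routines from this PR.

```javascript
'use strict';

// Hypothetical helper: split '-aarg' into the short option and its value.
// Returns null when the argument is not a short option combined with a value.
// `stringShorts` is an assumed Set of short options configured to take a value.
function splitShortAndValue(arg, stringShorts) {
  // Needs a single leading dash and at least one character after the option.
  if (arg.length <= 2 || arg[0] !== '-' || arg[1] === '-') return null;
  const short = arg[1];
  if (!stringShorts.has(short)) return null;
  return { short, value: arg.slice(2) };
}

const stringShorts = new Set(['a']);
console.log(splitShortAndValue('-aarg', stringShorts)); // { short: 'a', value: 'arg' }
console.log(splitShortAndValue('-a-b', stringShorts)); // { short: 'a', value: '-b' }
console.log(splitShortAndValue('-a', stringShorts)); // null
```

The '-a-b' case mirrors the test quoted earlier in this thread: once `a` is known to take a value, everything after it is the value, even if it looks like another option.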

Very helpful reference! And reading further answered another question I had regarding what characters are allowed in option arguments:

Guideline 7 allows any string to be an option-argument; an option-argument can begin with any character, can be - or --, and can be an empty string. For example, the commands pr -h -, pr -h --, pr -h -d, pr -h +2, and pr -h '' contain the option-arguments -, --, -d, +2, and an empty string, respectively. Conversely, the command pr -h -- -d treats -d as an option, not as an argument, because the -- is an option-argument here, not a delimiter.

Almost makes me wonder if we should always capture the nextArg if options[longOption].type === 'string'. Perhaps we shouldn't open that can of worms 😅 I'm quite satisfied with the current behavior.
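
The difference between "greedy" and non-greedy string options can be sketched as below. This is an illustration only: `takeValue` is a hypothetical function, and the rule that a lone '-' is still accepted as a value in non-greedy mode is an assumption.

```javascript
'use strict';

// Hypothetical sketch: decide what value a string option takes from the next argument.
// greedy:     any following argument is the value (POSIX Guideline 7 style).
// non-greedy: arguments that look like options are not consumed as values.
function takeValue(nextArg, greedy) {
  if (nextArg === undefined) return undefined; // nothing left to consume
  if (greedy) return nextArg;
  // Assumption: a lone '-' is a usable value, anything else starting with '-' is option-like.
  const looksLikeOption = nextArg.length > 1 && nextArg.startsWith('-');
  return looksLikeOption ? undefined : nextArg;
}

console.log(takeValue('-d', true)); // -d
console.log(takeValue('-d', false)); // undefined
console.log(takeValue('-', false)); // -
console.log(takeValue('title', false)); // title
```

With the greedy rule, `pr -h -d` stores -d as the header text; with the non-greedy rule, -h is left without a value and -d is parsed as its own option.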

Yes indeed, glad you mentioned it! I have actually been thinking about this and doing some research this week. I opened #76 partly to provide a context for revisiting whether options are "greedy", and was leaning towards opening a new issue about that. Now you have raised it, I definitely will!

Before I was clear on the implications I had opened #25 which covers this ground too, but much of the discussion there is about what to store in strict:false mode if the value is missing, and not deep coverage of greedy vs non-greedy.

I'm ok with shipping either way, but would like people to be clearer on where the behaviour stands in comparison with other implementations and why.

> would like people to be clearer on where the behavior stands in comparison with other implementations and why.

Yes! I think it would be super helpful to have a clear understanding/consensus on what standards and implementations we follow and/or influence the parseArgs API.

aaronccasanova · 2022-03-05T05:19:21Z

test/store-user-intent.js

+test('when use boolean long option used as string then result as if string', (t) => {
+  const passedArgs = ['--bool=OOPS'];
+  const stringOptions = { bool: { short: 'b', type: 'string' } };
+  const booleanOptions = { bool: { short: 'b', type: 'boolean' } };
+
+  const stringConfigResult = parseArgs({ args: passedArgs, options: stringOptions, strict: false });
+  const booleanConfigResult = parseArgs({ args: passedArgs, options: booleanOptions, strict: false });


thought (non-blocking): I'm surprised by the results of the boolean assertion. If I, the author, explicitly say --bool is type: 'boolean', I don't understand why we would capture values from incorrect usage by users.

// node user-intent.js --bool=OOPS --more-flags...
const result = parseArgs({ options: { bool: { type: 'boolean' } } })
for (const [longOption, optionValue] of Object.entries(result.options)) {
  if (typeof optionValue === 'string') {
    // process string options
  }
  if (typeof optionValue === 'boolean') {
    // process boolean options
  }
}

☝️ This is an incredibly contrived example, but if my intent as the author was to do one thing with strings and another with booleans it would break due to incorrect argument usage.

Note: This makes total sense to me if there was no configuration options, but if provided, I would think you're saying this is the argument contract my program accepts and relies on internally. If your program allows boolean-like options to accept values, you would simply set the type to string and account for that in your implementation. Thoughts?

100% if i configure an arg to be a boolean, it should throw if it's not one.

We have strict:false and strict:true.

In strict:true mode, the "wrong" usage will throw (well, hopefully, soon!). Note, these tests use strict:false in anticipation of that arriving.

In strict:false mode we are doing best effort parsing no matter the input. We don't want to throw away information which might be needed for the author to make use of. We hope they pay some attention...
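
For readers who want to try the two modes, here is a small sketch using `util.parseArgs` as it later shipped in Node.js (v18.3.0 and v16.17.0 onward). That is an assumption about the eventual built-in API rather than the package as it stood at the time of this PR, whose result shape may have differed.

```javascript
'use strict';
const { parseArgs } = require('node:util');

const args = ['--bool=OOPS'];
const options = { bool: { type: 'boolean', short: 'b' } };

// strict:false keeps what the user typed, even though the option is configured as boolean.
const { values } = parseArgs({ args, options, strict: false });
console.log(typeof values.bool, values.bool); // string OOPS

// strict:true rejects the same input.
let strictError;
try {
  parseArgs({ args, options, strict: true });
} catch (err) {
  strictError = err;
}
console.log(strictError.code); // ERR_PARSE_ARGS_INVALID_OPTION_VALUE
```

In other words, the non-strict result records user intent and leaves the decision about what to do with it to the author.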

I understood the test was for strict: false and that we would like to capture user intent without throwing away information. However, that seems more applicable to the zero-config behavior to me. If I configure a CLI to accept specific input, I should be confident the args parser will return the type I defined.

TS Playground
I took a shot at implementing some TypeScript definitions for this and it's not feeling very intuitive imo.

Notice bar and baz are configured as type: boolean, yet the results can be a string.

Again this is non-blocking, but think this could lead to less than ideal implementations that require checking if type: 'boolean' options are actually booleans. As well as convoluted TypeScript definitions that have different parse args results based on the strict mode config. Honestly, this isn't a major issue and really just poking at possible holes in the implementation as we get closer to the Node release.

Regarding TypeScript, you said:

> // Notice bar and baz are configured as type: boolean, yet the results can be a string.

The converse should apply to foo too, it is configured as type: string, yet the result can be true.

What behavior do you have in mind for mis-using configured types? (Throwing, despite being strict:false?)

Not throw, I was thinking we just return a boolean regardless if it's misconfigured.

// node misconfigured.js --foo=bar
const parsed = parseArgs({ strict: false, options: { foo: { type: 'boolean' } } })
parsed.values.foo // true

The input is already flawed, so might as well return the type I expect (as opposed to a completely different type string). Missing --foo would still be undefined. Thoughts?

And for interest, Minimist returns an empty string for a configured "string" option which is misused without a value.

Not against this either tbh, but don't have strong opinions. I'm more "concerned" about parseArgs returning different types than the author defined.

A high level question is why do you want to use strict:false mode at all?

I will likely always use strict:true and haven't put much thought into what use cases folks have for strict:false. Would be curious to hear your thoughts here!

> The converse should apply to foo too, it is configured as type: string, yet the result can be true.

Oh really!? Maybe I do prefer the Minimist approach. i.e. setting an empty string for a misconfigured type: string

It would be great if authors didn't have to check the typeof before using prototype methods for a defined option type: e.g.

// node misconfigured.js --foo
const parsed = parseArgs({ strict: false, options: { foo: { type: 'string' } } })
if (parsed.values.foo?.includes('substring')) // Crashes the program

> It would be great if authors didn't have to check the typeof before using prototype methods for a defined option type

Without a throw, coercing the type means the program does not crash, but does not ensure safe and intended behaviour. In that sense crashing is arguably better behaviour than silently doing the wrong thing. Two examples:

// explode is configured as a boolean (but user mis-used)
$ node detonate.js --explode=false

// no-explode is configured as a boolean (but user typo)
$ node detonate.js --no-expode foo bar

Great examples! Definitely demonstrates the severity and potential impact of misconfigured options. So I'm realizing strict:false is probably more suited for bring your own validation and making authors responsible for robustness. Whereas, strict:true allows authors to rely on parseArgs for option validation and throwing errors for misuse.
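
A sketch of what "bring your own validation" could look like on top of strict:false, again assuming the `util.parseArgs` API as later shipped in Node.js. `findProblems` is a hypothetical helper, and the detonate-style options mirror the examples above.

```javascript
'use strict';
const { parseArgs } = require('node:util');

// Hypothetical helper: compare parsed values against the configured option types.
function findProblems(values, options) {
  const problems = [];
  for (const [name, value] of Object.entries(values)) {
    const expected = options[name]?.type;
    if (expected === undefined) {
      problems.push(`--${name}: unknown option`);
    } else if (typeof value !== expected) {
      problems.push(`--${name}: expected ${expected}, got ${typeof value}`);
    }
  }
  return problems;
}

const options = {
  'explode': { type: 'boolean' },
  'no-explode': { type: 'boolean' },
};
// Misused boolean plus a typo in the option name.
const args = ['--explode=false', '--no-expode', 'foo', 'bar'];
const { values } = parseArgs({ args, options, strict: false, allowPositionals: true });

const problems = findProblems(values, options);
console.log(problems);
// [ '--explode: expected boolean, got string', '--no-expode: unknown option' ]
```

The author decides whether to warn, exit, or carry on, which is the trade-off strict:false leaves open.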

shadowspawn · 2022-03-05T07:22:46Z

Thanks @aaronccasanova, you have picked up several errata in the comments and some good suggestions and questions.

shadowspawn · 2022-03-06T00:57:07Z

Thanks for the reviews and comments @ljharb, @Eomm and @aaronccasanova. I'll wait for @bcoe to at least have a quick look before I consider merging.

Further comments welcome from any gentle readers.

shadowspawn · 2022-03-12T07:26:06Z

Waiting no longer. 😄

shadowspawn added 19 commits March 2, 2022 12:58

Support multiple unit test files

2734d65

Add isPossibleOptionValue

96f1447

Add .editorconfig so editor knows about lint settings

9dfd7b6

Add isLoneShortOption

638e07c

Add isLongOption

c8c9e0c

Add separate dash tests

04d4d95

Update signature for running new tests to arrow functions

512afe5

isShortOptionGroup

83f0e02

Merge branch 'main' into feature/refactor-parse

686cbe3

Update to new calling signature

bc45095

Add findLongOptionForShort

0a9c04c

Start updating main parsing loop, and rework some utils.

042d957

- remove now unused experimental code
- split isLongOption
- rename routines and files

Switch loop to shift

8f85ecd

Add isShortOptionAndValue

cb93bfa

Form expanded, clearer

4683053

Fixes

aeff889

Improve comments

0c399bc

New tests for short option group (and fixes)

db3e06e

Add tests for combining short and value

3a7ea3c

This was referenced Mar 4, 2022

feat: short with value in same arg #69

Closed

Refactor parsing for readability and future expansion, and tighten short option group parsing #68

Closed

ljharb requested changes Mar 4, 2022

shadowspawn and others added 5 commits March 4, 2022 16:21

Update package.json

bb22e5a

Co-authored-by: Jordan Harband <ljharb@gmail.com>

Update utils.js

eac5ecf

Co-authored-by: Jordan Harband <ljharb@gmail.com>

Update utils.js

419060c

Co-authored-by: Jordan Harband <ljharb@gmail.com>

Add import

11bd06f

Add exports to keep utils private

1c1d047

shadowspawn added 2 commits March 4, 2022 16:48

AAA: Arrange, Act, Assert

3fcdcb3

Add tests for failure duck typing

cf48248

shadowspawn requested a review from ljharb March 4, 2022 04:34

Add another dash example

d2a1bc4

shadowspawn mentioned this pull request Mar 4, 2022

feat: Add strict mode to parser #74

Merged

Make test for undefined more robust

22fd538

aaronccasanova reviewed Mar 4, 2022

Update utils.js

a409102

Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>

ljharb approved these changes Mar 4, 2022

shadowspawn and others added 3 commits March 5, 2022 14:11

Update utils.js

1bf4cfb

Co-authored-by: Jordan Harband <ljharb@gmail.com>

Update utils.js

228056b

Co-authored-by: Jordan Harband <ljharb@gmail.com>

Update package.json

bc6dae7

Co-authored-by: Jordan Harband <ljharb@gmail.com>

shadowspawn changed the title ~~Refactor parsing, revisit short option groups, add support for combined short and value~~ refactor!: parsing, revisit short option groups, add support for combined short and value Mar 5, 2022

aaronccasanova reviewed Mar 5, 2022

shadowspawn and others added 5 commits March 5, 2022 20:29

Update index.js

6153fd2

Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>

Comment improvements

eff783e

Update index.js

90f9864

Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>

Update test/dash.js

6013dc4

Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>

Expand test description per feedback

e4b4f28

Eomm approved these changes Mar 5, 2022

shadowspawn mentioned this pull request Mar 9, 2022

Make "type: string" options greedy? #77

Closed

shadowspawn merged commit a92600f into pkgjs:main Mar 12, 2022

github-actions bot mentioned this pull request Mar 12, 2022

chore(main): release 0.4.0 #73

Merged

shadowspawn deleted the feature/refactor-parse branch June 5, 2022 03:16
