Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarifying typechecking of optional values #271

Closed
mlin opened this issue Nov 26, 2018 · 5 comments
Closed

Clarifying typechecking of optional values #271

mlin opened this issue Nov 26, 2018 · 5 comments

Comments

@mlin
Copy link
Member

mlin commented Nov 26, 2018

Spec gray area:

If I have an optional declaration T? x, and an expression refers to x in some context where (non-optional) T is expected, when should this be statically rejected as a type mismatch error, vs. considered valid albeit liable to fail at runtime?

The spec right now says x "can only be used in calls or functions that accept optional values" but this isn't quite clear to the above.

My "first principles" instinct would be that T? x should statically fail to unify with (non-optional) T absent some explicit coercion, but:

  • It's expressly valid and desirable for interpolations in task commands, as described in "Prepending a String to an Optional Parameter." Also relevant Prepending to an optional hack #227
  • If one writes another declaration T y = x what should happen when?
  • I infer from some examples in the wild that Cromwell accepts the pattern like T y = if defined(x) then EXPR_WITH_x else SOME_DEFAULT (ex1, ex2).
    • In those examples, EXPR_WITH_x is x * 1000 and size(x) respectively.
    • Do we check the optional quantifier statically at all?
    • Alternatively, one could imagine special-case typechecking logic for the if defined(x) ... case.
  • If one specifies x to a required call input of type T what should happen when? (ex)

Appreciate if someone from Cromwell team could clarify how it approaches this (and of course comment on what would be ideal for the spec). Thanks!

@geoffjentry
Copy link
Member

This feels an awful lot like a discussion @yfarjoun raised in #263 - noting that here, if I didn't just overaggressively pattern match it'd be useful to unify the conversations.

Also tagging @cjllanwarne to take a first stab at this from the cromwellian perspective

@cjllanwarne
Copy link
Contributor

This does indeed come up in discussions a lot!

I've often found that what seemed like "special cases to make things easier" has instead lead to confusion and a harder to understand language. And there are a lot of special cases around optionals (as you mentioned) that I regularly find more confusing than helpful.

So my gut feeling is to be stricter on enforcing "you can't use optionals without unpacking them first" as long as we allow people to unpack things easily if they need to. Unfortunately the "current" way of unpacking a Int? x into Int is rather ugly: select_first([x]).

I'd be open to either a structural or expression based way of unpacking. I think @patmagee 's suggestion against my structural #187 was very good and far less intrusive - he suggested that we could have a simple select(x) or get(x) function to turn declarations likeInt? x into Int or else throw an error. I think this would free us up to be a lot more aggressive with type-checking of optionals.

Re "task commands", I think a recent PR (still awaiting implementation) was involved in tidying up those command sections to only allow simple expressions rather than their own entire complex syntax. I wonder if that addressed optionals.

Re the final example about optional inputs - I think that this is now tidied up in WDL 1.0. In WDL 1.0 the example would be able to indicate "the input has a default so might not be supplied, but the type of the value is not optional" like this:

workflow calulateDNAContamination {
  input {
    String docker_image = "..."
  }

@mlin
Copy link
Member Author

mlin commented Nov 28, 2018

@cjllanwarne Thanks!! @orodeh just showed me some experiments with womtool-36 that helped to reconcile what I was seeing in some public WDLs. Namely, suppose we have a workflow input Int? x, and attempt the following three constructs,

  1. Int y = x
  2. Int z = x+1
  3. call some_task { input: a_required_int = x }

We found that

  1. is statically rejected, whether the WDL is draft-2 or 1.0
  2. is accepted, for both draft-2 and 1.0
  3. is accepted for draft-2 but rejected for 1.0

Before trying this out, I hadn't mentally distinguished 1 and 2, while 3 explains the draft-2 TopMed workflow that further confused me. The if defined(x) then EXPR_WITH_x else SOME_DEFAULT examples I saw are not specific to that construct but actually fall under case 2.

Does this seem accurate as far as what the code does right now?

I too support a more sugary way to coerce away the optional as you discuss, although I couldn't resist playing with the special case for 'if defined', I'll be happy to roll that back :)

Going forward, I feel we should tighten up case 2 although I do recognize that "Prepending a String to an Optional Parameter" is a convenient and desirable feature.

@mlin
Copy link
Member Author

mlin commented Nov 28, 2018

I also just want to import here @yfarjoun's discussion question on #263 of how the [in]equality operators should behave when applied to one non-optional, and one optional operand. Either way could be fine IMHO, but it needs to be decided.

What about

Int? maybe_five = 5
Int five = 5
Boolean compare_five_to_maybe_five = maybe_five == five

Should this validate?
I think it should and that the answer should be True.

mlin added a commit to chanzuckerberg/miniwdl that referenced this issue Dec 4, 2018
Removes a few workarounds / special cases for static validation of optional & nonempty type compatibility. They are now checked more rigorously by default, so for example any use of `T?` where `T` is expected requires an explicit coercion such as with select_first(). But to accommodate older WDLs which depend on less strictness, supplying WDL.load(..., check_quant=False) relaxes these static validation checks.

The WDL specification is vague as to how strict this typechecking should be; this compromise should let us accommodate older WDLs without necessarily reproducing every detail of what Cromwell does and doesn't enforce for different WDL versions. cf. openwdl/wdl#271

Expands the OptionalCoercion linter to warn of constructs that validate only when check_quant=False.

Closes #50
@mlin
Copy link
Member Author

mlin commented Feb 19, 2019

I have written a straw-man spec diff for debate & discussion: #290

@patmagee patmagee closed this as completed Aug 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants