Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "None" to spec #263

Merged
merged 18 commits into from
Mar 7, 2019
Merged
58 changes: 55 additions & 3 deletions versions/development/SPEC.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
* [Map Literals](#map-literals)
* [Object Literals](#object-literals)
* [Pair Literals](#pair-literals)
* [Optional Literals](#optional-literals)
* [Versioning](#versioning)
* [Import Statements](#import-statements)
* [Task Definition](#task-definition)
Expand Down Expand Up @@ -258,7 +259,7 @@ $float = (([0-9]+)?\.([0-9]+)|[0-9]+\.|[0-9]+)([eE][-+]?[0-9]+)?

### Types

In WDL *all* types represent immutable values.
In WDL *all* types represent immutable values.
- Even types like `File` and `Directory` represent logical "snapshots" of the file or directory at the time when the value was created.
- It's impossible for a task to change an upstream value which has been provided as an input: even if it makes changes to its local copy the original value is unaffected.

Expand Down Expand Up @@ -300,7 +301,7 @@ For more information on type and how they are used to construct commands and def
#### Numeric Behavior

`Int` and `Float` are the numeric types.
`Int` can be used to hold a signed Integer in the range \[-2^63, 2^63).
`Int` can be used to hold a signed Integer in the range \[-2^63, 2^63).
`Float` is a finite 64-bit IEEE-754 floating point number.

#### Custom Types
Expand Down Expand Up @@ -689,6 +690,33 @@ Pair values can also be specified within the [workflow inputs JSON](https://gith
}
```

### Optional literals

Any non-optional value can be stored into an optional variable by assigning it to an optional variable.
This does not change the value, but it changes the type of the value:

```WDL
Int? maybe_five = 5
```

The `None` literal is the value that an optional has when it is not defined.

```WDL
Int? maybe_five_but_is_not = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what the non-scalawags think - is None intuitive here?

It is to me, but I'm used to Scala's Option, which is what this construct was modeled after.

I don't know that I have any suggestion which seems better (going back to "it is intuitive to me"), perhaps Empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In python, there is also a None. It also means something like 'empty' and calling a None variable, will give a not defined error. So for people who know python, it can be intuitive (it is for me, but I also know scala and I can not judge for other people).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think null is a better choice as that's how you write this concept in JSON, which is how WDL inputs are represented. That way the same concept looks the same in both paces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a deep loathing of null in general, but you raise a good point.

One thing worth thinking through is if there's a meaningful difference between "This space intentionally left blank" and "For unknown reasons there's nothing here" in WDL-land. When those are both viable states I think that null should always mean the latter, and only the latter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assigning null to something is a contradiction in itself. So, in my preference null is not available as a variable in the toolbox.

As for the "then it will look the same as in JSON" argument. You have a point there, but frankly, None is different from null. Using the same name for a different concept does not sound like a good idea to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing worth thinking through is if there's a meaningful difference between "This space intentionally left blank" and "For unknown reasons there's nothing here" in WDL-land. When those are both viable states I think that null should always mean the latter, and only the latter

Can you explain how these two different states could arise in WDL? They sound to me like a states in the mind of the WDL user, not the WDL interpreter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assigning null to something is a contradiction in itself. So, in my preference null is not available as a variable in the toolbox.

Remember that, in my proposal, null is the name WDL has for the concept of an unassigned value. It has no meaning with respect to the implementation of WDL, or pointers, or other programming language concepts.

As for the "then it will look the same as in JSON" argument. You have a point there, but frankly, None is different from null. Using the same name for a different concept does not sound like a good idea to me.

Are you saying that using null in JSON has a different effect than using (as you propose) None in WDL? If so, how is this difference detectable from inside WDL?

In my proposal, null is not different than None as they are just different spellings of the same concept. Maybe the point that you're trying to make is that null has baggage due to its meaning in other programming languages. I agree that it could be a hurdle for WDL users that have experience programming in other languages. Is this the argument that you're making here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pshapiro4broad

Can you explain how these two different states could arise in WDL?

I don't have an example, just thought it was worth throwing it out there in case it prompted someone to think of one

In my proposal, null is not different than None as they are just different spellings of the same concept.

I don't think they necessarily are, going back to the difference I cited above. For instance in R NA and NULL are two very difference concepts, but they'd both fold into null in many languages which only have the one concept.

Maybe the point that you're trying to make is that null has baggage due to its meaning in other programming languages.

I certainly fall under that category (I've protested against adding "the billion dollar mistake" to WDL in the past as well), but to go back earlier in the convo I also admit I'm a fairly biased witness here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geoffjentry @pshapiro4broad Most of the "Null vs None" stuff was already discussed in #209. Most people liked None more, so I settled for that.

# maybe_five_but_is_not is an undefined optional
Int? maybe_five_and_is = 5
# maybe_five_and_is is a defined optional
Int certainly_five = 5
# Certainly five is not an optional

Boolean test_defined = defined(maybe_five_but_is_not) # Evaluates to false
Boolean test_defined2 = defined(maybe_five_and_is) # Evaluates to true
Boolean test_is_none = maybe_five_but_is_not == None # Evaluates to true, same as !definedmaybe_five_but_is_not)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a left parenthesis after defined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mwalker174 - unfortunately once voting is active votes are on the PR as-is. I'd encourage either you or @rhpvorderman file a followup PR after this one is finished

Boolean test_not_none = maybe_five_but_is_not != None # Evaluates to false, same as defined(maybe_five_but_is_not )
rhpvorderman marked this conversation as resolved.
Show resolved Hide resolved
Boolean compare_int_to_none = certainly_five == None # This will cause an error, since None only makes sense when dealing with optionals.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a runtime error or a validation error?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this should be an error.

What about

Int? maybe_five = 5
Int five = 5
Boolean compare_five_to_maybe_five = maybe_five == five

Should this validate?
I think it should and that the answer should be True.

but this means that one should be able to compare an optional with a non-optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened #271 for general discussion of typechecking of optional values (i.e. when use of an optional value may be a static validation error, a runtime error in case actually null, or not an error at all). I think the ambiguities involved are broader than this more-focused PR for adding None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we move to tie off this PR, perhaps we should punt on this example for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Not rushing this in through the backdoor. I will remove this.

```


## Versioning

For portability purposes it is critical that WDL documents be versioned so an engine knows how to process it. From `draft-3` forward, the first non-comment statement of all WDL files must be a `version` statement, for example
Expand Down Expand Up @@ -795,7 +823,7 @@ task t {

#### Task Input Localization
`File` and `Directory` inputs must be treated specially since they require localization to within the execution directory:
- Files and directories are localized into the execution directory prior to the task execution commencing.
- Files and directories are localized into the execution directory prior to the task execution commencing.
- When localizing a `File`, the engine may choose to place the file wherever it likes so long as it accords to these rules:
- The original file name must be preserved even if the path to it has changed.
- Two input files with the same name must be located separately, to avoid name collision.
Expand Down Expand Up @@ -2232,6 +2260,7 @@ task test {
Array[File]+ b
Array[File]? c
#File+ d <-- can't do this, + only applies to Arrays
Array[File]+? e # An optional array that, if defined, must contain at least one element
}
command {
/bin/mycmd ${sep=" " a}
Expand Down Expand Up @@ -3229,10 +3258,33 @@ Array[String] env2_param = prefix("-f ", env2) # ["-f 1", "-f 2", "-f 3"]

Given an array of optional values, `select_first` will select the first defined value and return it. Note that this is a runtime check and requires that at least one defined value will exist: if no defined value is found when select_first is evaluated, the workflow will fail.

```WDL
version 1.0
workflow SelectFirst {
input {
Int? maybe_five = 5
Int? maybe_four_but_is_not = None
Int? maybe_three = 3
}
Int five = select_first([maybe_five, maybe_four_but_is_not, maybe_three]) # This evaluates to 5
Int five = select_first([maybe_four_but_is_not, maybe_five, maybe_three]) # This also evaluates to 5
}

## Array[X] select_all(Array[X?])

Given an array of optional values, `select_all` will select only those elements which are defined.

```WDL
version 1.0
workflow SelectFirst {
input {
Int? maybe_five = 5
Int? maybe_four_but_is_not = None
Int? maybe_three = 3
}
Array[Int] fivethree = select_all([maybe_five, maybe_four_but_is_not, maybe_three]) # This evaluates to [5, 3]
}

## Boolean defined(X?)

This function will return `false` if the argument is an unset optional value. It will return `true` in all other cases.
Expand Down