Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "None" to spec #263

Merged
merged 18 commits into from
Mar 7, 2019
Merged

Add "None" to spec #263

merged 18 commits into from
Mar 7, 2019

Conversation

rhpvorderman
Copy link
Contributor

After running into yet another case where None is definitely needed (or otherwise causing hideous workarounds) I decided to at it as an example in this pull request.

This aims to implement things that were discussed in #209

@orodeh
Copy link
Contributor

orodeh commented Nov 9, 2018

I agree, this has been missing for a while. None is, in my opinion, a better choice than Null.

👍

@antonkulaga
Copy link

I think that you should have started with Some/None support instead of null-s from the very begining

@cjllanwarne
Copy link
Contributor

cjllanwarne commented Nov 13, 2018

  1. I totally agree with the need for a None literal! Thanks for making this PR.
  2. Instead of only mentioning None all the way down on line 2280 (not your fault - the length of the spec is a known issue...), I think I would add this concept in a new Optional Literals section right after Pair Literals (see line 26 in the ToC) and talk about how any non-optional value can be automatically wrapped into an optional (eg Int? maybe_integer = 5), and how None can be used as a literal to specify no value.
  3. Side comment - for optional File outputs specifically (your example) - I think that eventually we'll want a way of expressing this using an engine function like File? logfile = optional_file("test.log") - this (a) eliminates the need for coordinating behavior through the command and the outputs using the boolean logfile, and (b) works in more general cases where we cannot predict using an if expression which Files will and won't be produced.
    • But for now, the example is great,
    • And None is useful in all sorts of situations so please don't take this comment as a reason not to add Nones!

@rhpvorderman
Copy link
Contributor Author

@cjllanwarne and @pshapiro4broad thank you for your feedback. I edited the pull request.

What does everyone think about retroactively adding this to the 1.0 spec? I think that we should do this. The concept is already there. But since there is no literal for it you cannot use it easily yet. The change will be backwards compatible.

@geoffjentry
Copy link
Member

There are no retroactive changes

@rhpvorderman
Copy link
Contributor Author

@geoffjentry . Okay, that's settled then. This pull request only affects the development spec.

@geoffjentry
Copy link
Member

Yeah we’ve bent the rules a few times for truly inconsequential changes like typos in prose as well as bugs in the grammar file but otherwise trying to not introduce retroactive changes. Thanks

@pshapiro4broad
Copy link
Contributor

pshapiro4broad commented Nov 20, 2018

Thanks for making the changes, it looks better to me. Do you want to include an example that makes it explicit that xxx != None is equivalent to defined(xxx)?

@rhpvorderman
Copy link
Contributor Author

Thanks @yfarjoun @pshapiro4broad and @cjllanwarne for your excellent comments. These should now be addressed.

test ~{true="--log" false="" log} inputFile outputPath
}
output {
File? logFile = if log then "test.log" else None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this actually needed? what happens now if you just have

  File? logFile = "test.log"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then logFile will always be test.log even if there is no test.log created.

Boolean test_defined2 = defined(maybe_five_and_is) # Evaluates to true
Boolean test_is_none = maybe_five_but_is_not == None # Evaluates to true, same as !definedmaybe_five_but_is_not)
Boolean test_not_none = maybe_five_but_is_not != None # Evaluates to false, same as defined(maybe_five_but_is_not )
Boolean compare_int_to_none = certainly_five == None # This will cause an error, since None only makes sense when dealing with optionals.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a runtime error or a validation error?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this should be an error.

What about

Int? maybe_five = 5
Int five = 5
Boolean compare_five_to_maybe_five = maybe_five == five

Should this validate?
I think it should and that the answer should be True.

but this means that one should be able to compare an optional with a non-optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened #271 for general discussion of typechecking of optional values (i.e. when use of an optional value may be a static validation error, a runtime error in case actually null, or not an error at all). I think the ambiguities involved are broader than this more-focused PR for adding None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we move to tie off this PR, perhaps we should punt on this example for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Not rushing this in through the backdoor. I will remove this.

The `None` literal is the value that an optional has when it is not defined.

```WDL
Int? maybe_five_but_is_not = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what the non-scalawags think - is None intuitive here?

It is to me, but I'm used to Scala's Option, which is what this construct was modeled after.

I don't know that I have any suggestion which seems better (going back to "it is intuitive to me"), perhaps Empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In python, there is also a None. It also means something like 'empty' and calling a None variable, will give a not defined error. So for people who know python, it can be intuitive (it is for me, but I also know scala and I can not judge for other people).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think null is a better choice as that's how you write this concept in JSON, which is how WDL inputs are represented. That way the same concept looks the same in both paces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a deep loathing of null in general, but you raise a good point.

One thing worth thinking through is if there's a meaningful difference between "This space intentionally left blank" and "For unknown reasons there's nothing here" in WDL-land. When those are both viable states I think that null should always mean the latter, and only the latter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assigning null to something is a contradiction in itself. So, in my preference null is not available as a variable in the toolbox.

As for the "then it will look the same as in JSON" argument. You have a point there, but frankly, None is different from null. Using the same name for a different concept does not sound like a good idea to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing worth thinking through is if there's a meaningful difference between "This space intentionally left blank" and "For unknown reasons there's nothing here" in WDL-land. When those are both viable states I think that null should always mean the latter, and only the latter

Can you explain how these two different states could arise in WDL? They sound to me like a states in the mind of the WDL user, not the WDL interpreter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assigning null to something is a contradiction in itself. So, in my preference null is not available as a variable in the toolbox.

Remember that, in my proposal, null is the name WDL has for the concept of an unassigned value. It has no meaning with respect to the implementation of WDL, or pointers, or other programming language concepts.

As for the "then it will look the same as in JSON" argument. You have a point there, but frankly, None is different from null. Using the same name for a different concept does not sound like a good idea to me.

Are you saying that using null in JSON has a different effect than using (as you propose) None in WDL? If so, how is this difference detectable from inside WDL?

In my proposal, null is not different than None as they are just different spellings of the same concept. Maybe the point that you're trying to make is that null has baggage due to its meaning in other programming languages. I agree that it could be a hurdle for WDL users that have experience programming in other languages. Is this the argument that you're making here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pshapiro4broad

Can you explain how these two different states could arise in WDL?

I don't have an example, just thought it was worth throwing it out there in case it prompted someone to think of one

In my proposal, null is not different than None as they are just different spellings of the same concept.

I don't think they necessarily are, going back to the difference I cited above. For instance in R NA and NULL are two very difference concepts, but they'd both fold into null in many languages which only have the one concept.

Maybe the point that you're trying to make is that null has baggage due to its meaning in other programming languages.

I certainly fall under that category (I've protested against adding "the billion dollar mistake" to WDL in the past as well), but to go back earlier in the convo I also admit I'm a fairly biased witness here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geoffjentry @pshapiro4broad Most of the "Null vs None" stuff was already discussed in #209. Most people liked None more, so I settled for that.

@rhpvorderman
Copy link
Contributor Author

What needs to be done to get this PR ready for voting? I would like to get this into the spec at some point. There's been a lot of discussion about the PR, but I suggest we get down to voting this thing. If it is insufficient, it is back to the drawing board for me. But then at least there is a decision being made.

@cjllanwarne
Copy link
Contributor

@rhpvorderman my only complaint with this PR as-is is the use of an optional file output in one of the examples (ie File? maybe_file = ).

I think I'd personally rather keep this PR as the title suggests (allowing None to be used as a literal).

I definitely think that optional File outputs should work - but I'd rather make that explicit in a separate PR which can go through refinement in case there's a nicer way to express it. I'm mostly concerned that we don't let File? delocalization implementation issues on cloud platforms slow down the implementation and adoption of a should-be-unrelated literal for empty optionals.

test ~{true="--log" false="" log} inputFile outputPath
}
output {
File? logFile = if log then "test.log" else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the line which I think could be trickier to guarantee. I think for this PR that this example should be removed or reworked to not use File?, and a new PR specifically guaranteeing support for File? outputs should be added (in case there are alternative ways or syntax options of expressing this intention in an easier-for-cloud-support way)

## Document

```
$document = ($import | $task | $workflow)+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you really have more than one workflow in a single document?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why are we making these additions in this PR? seems out of place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an accidental reversal of #270 happened during a rebase to me

Copy link
Contributor

@cjllanwarne cjllanwarne Dec 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or maybe this PR just need another rebase on master to pick up that missing PR?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely did not put that in 🙂. I will try if a rebase fixes it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhpvorderman Heads up that I don't think your rebasing did the trick as this line is still here.

@@ -23,6 +23,8 @@
* [Map Literals](#map-literals)
* [Object Literals](#object-literals)
* [Pair Literals](#pair-literals)
* [Optional Literals](#optional-literals)
* [Document](#document)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still needs to be deleted

If `log` is true, the command will evaluate to `test --log inputFile outputPath` and `logFile` will be `test.log`.
If `log` is false, the command will evaluate to `test inputFile outputPath` and `logFile` will be `None`.

## Document
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still needs to be deleted

@rhpvorderman
Copy link
Contributor Author

I personally don't think that saying "git rebases are hard" is a good reason to push ahead despite these concerns

[tangent] Squashing and rebasing are gruesome practices that destroy git history. It is nice in some circumstances, but when overused you get problems like I have had here. Having to clean up all sorts of stuff that I never wrote from my PR. This was a lot of tedious and unnecessary work. Maybe this was because my rebasing went wrong because I am not a git guru. But in general why resort to guru stuff when safer and easier ways are available? But let's have that discussion elsewhere. I just wanted to note that "git rebases are hard" is an oversimplification of my viewpoint in this matter. [/tangent]

With the exception of the File? output example, I'm very much in favor of the addition of a None literal. However I would not vote for this PR as it is right now

Thanks @cjllanwarne for providing a list of arguments why rushing in optional files through the backdoor of this PR is a bad idea. You convinced me. I have removed the optional file example.

@cjllanwarne
Copy link
Contributor

Thanks @rhpvorderman! And FWIW I agree with your meta comment that if something is making iteration on these PRs difficult for contributors, that's definitely worth thinking about to see if there's anything in the git flow to make that easier.

@geoffjentry my concerns are met, happy to vote on this PR now.

@rhpvorderman
Copy link
Contributor Author

@mlin I have removed the typechecking example from the PR.

@geoffjentry
Copy link
Member

Sorry for the delay on this, I've been out of office for the last few weeks. Opening for a vote now.

@orodeh
Copy link
Contributor

orodeh commented Jan 7, 2019

👍


Boolean test_defined = defined(maybe_five_but_is_not) # Evaluates to false
Boolean test_defined2 = defined(maybe_five_and_is) # Evaluates to true
Boolean test_is_none = maybe_five_but_is_not == None # Evaluates to true, same as !definedmaybe_five_but_is_not)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a left parenthesis after defined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mwalker174 - unfortunately once voting is active votes are on the PR as-is. I'd encourage either you or @rhpvorderman file a followup PR after this one is finished

@mwalker174
Copy link

👍

4 similar comments
@aednichols
Copy link
Contributor

👍

@DavyCats
Copy link
Contributor

DavyCats commented Jan 8, 2019

👍

@mlin
Copy link
Member

mlin commented Jan 8, 2019

👍

@cjllanwarne
Copy link
Contributor

👍

@cjllanwarne
Copy link
Contributor

@geoffjentry - This PR has been implemented. No implementation issues came up. This PR can now be merged into the SPEC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.