Add a parametric Nullable{T} type #8152

Merged
merged 1 commit into from
Sep 20, 2014

johnmyleswhite
Member

This adds a parametric Nullable{T} type that can represent a value of type T that may be missing. It's got a very minimal interface with the hope that this will encourage you to resolve the uncertainty of whether a value is missing as soon as possible.
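
As a rough illustration of how small that interface is, here is a minimal sketch in current Julia syntax. `SketchNullable` is a hypothetical stand-in rather than the type added by this pull request, and throwing `ArgumentError` on an empty value is an assumption made for the example.

```julia
# Hypothetical stand-in for the proposed type (not the code from this PR).
struct SketchNullable{T}
    hasvalue::Bool
    value::T
    # Empty instance: the `value` field is deliberately left unset.
    SketchNullable{T}() where {T} = new{T}(false)
    # Instance wrapping `v`, which is converted to T.
    SketchNullable{T}(v) where {T} = new{T}(true, v)
end

# Convenience constructor that infers T from the wrapped value.
SketchNullable(v::T) where {T} = SketchNullable{T}(v)

# The whole interface: ask whether a value is missing, then extract it.
isnull(x::SketchNullable) = !x.hasvalue
Base.get(x::SketchNullable) =
    isnull(x) ? throw(ArgumentError("no value present")) : x.value

present = SketchNullable(42)
absent = SketchNullable{Int}()

isnull(present)   # false
get(present)      # 42
isnull(absent)    # true
# get(absent) would throw an ArgumentError
```

The only way to reach the payload is through `get`, so a caller has to decide what to do about a missing value before using it.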

Work left to do:

I also did a little bit of whitespace removal along the way, which is hopefully forgivable.

@JeffBezanson
Member

👍

Is the error behavior of == part of the "encourage you to resolve the uncertainty" design?

I think we could get by without Null and NotNull. It will be simpler in the long run.

@johnmyleswhite
Member Author

Yeah, the error behavior was based on my thought that, since we should raise errors when comparing anything with a null value, it's easier to just not try comparing Nullable objects at all than to wait for a run-time error when your equality comparison hits its first null.

Agree that we can just use Nullable. Null and NotNull are an inheritance from a time when this code tried to imitate a standard Option type more closely.

@porterjamesj
Contributor

Once this is merged it's probably worth being very clear in the manual section precisely what the semantic differences are between Null{T} and Nothing. Having two types to represent absence is sure to be a point of confusion for many.

@ivarne
Member

ivarne commented Aug 27, 2014

We will have 3 concepts for nothing: Nothing/nothing, None/Void and NullableTypes. Looking forward to making http://docs.julialang.org/en/latest/manual/faq/#nothingness-and-missing-values even more complicated.

@porterjamesj
Contributor

Right, I forgot about None and Void.

@ivarne
Member

ivarne commented Aug 27, 2014

I wonder if get is the correct function to overload in this case. Currently get is part of the interface for collections, and Null{T} does not feel like a collection to me. Maybe a new function val or value?

@toivoh
Contributor

toivoh commented Aug 27, 2014

Well, it's a collection with zero or one element. Perhaps it should be iterable and maybe even indexable. Not sure what that implies with regard to get, though.
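
The zero-or-one-element view can be sketched with the iteration protocol of current Julia (which differs from the protocol in use when this thread was written). `SketchNullable` and `describe` are hypothetical names for illustration, not part of this pull request.

```julia
# Hypothetical stand-in for the proposed type (not the code from this PR).
struct SketchNullable{T}
    hasvalue::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(false)
    SketchNullable{T}(v) where {T} = new{T}(true, v)
end
SketchNullable(v::T) where {T} = SketchNullable{T}(v)

# Iteration: yield the value once if it is present, then stop.
Base.iterate(x::SketchNullable) = x.hasvalue ? (x.value, nothing) : nothing
Base.iterate(x::SketchNullable, ::Nothing) = nothing
Base.length(x::SketchNullable) = x.hasvalue ? 1 : 0
Base.eltype(::Type{SketchNullable{T}}) where {T} = T

# A loop body runs once for a present value and never for an empty one.
function describe(x)
    for v in x
        return "value: $v"
    end
    return "empty"
end

collect(SketchNullable(7))        # one-element vector containing 7
collect(SketchNullable{Int}())    # empty vector
describe(SketchNullable(7))       # "value: 7"
describe(SketchNullable{Int}())   # "empty"
```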

@rfourquet
Member

I find the name slightly misleading, as a nullable value is immutable and as such cannot be nulled after construction. I quickly re-read the thread on julia-users, and I'm not sure whether the question of supporting both "ontological/statistical missingness" has been decided. But if Nullable only supports statistical missingness, someone will come and request an Optional{T} (a 4th concept of nothing) for ontological missingness. I wouldn't implement Optional differently from Nullable, so why not support both with the same type? Moreover, @johnmyleswhite's view that "I’ve come to really like the interpretation of Option{T} [now Nullable{T}] as a 0-or-1 element container type" is the very concept behind C++'s (ontological) optional.

@rfourquet
Member

With the view of a container with 0-or-1 elements, I would personally find getindex(x::Nullable) = get(x) quite natural, similar to Jameson's Ref{T} type and to C++'s optional dereference operator. And it could allow bypassing the ugly unsafe_get via @inbounds.
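
A sketch of that zero-argument indexing idea, on a hypothetical `SketchNullable` stand-in written in current Julia syntax. The `ArgumentError` is an assumption, and the `@inbounds` / `unsafe_get` interaction mentioned above is not modeled here.

```julia
# Hypothetical stand-in for the proposed type (not the code from this PR).
struct SketchNullable{T}
    hasvalue::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(false)
    SketchNullable{T}(v) where {T} = new{T}(true, v)
end
SketchNullable(v::T) where {T} = SketchNullable{T}(v)

# `x[]` lowers to `getindex(x)`, so this gives a dereference-like syntax.
Base.getindex(x::SketchNullable) =
    x.hasvalue ? x.value : throw(ArgumentError("no value present"))

boxed = SketchNullable(3.5)
boxed[]   # 3.5
```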

@nalimilan
Member

Regarding ==: what's the suggested pattern to compare two Nullable? If that's get(x) == get(y), then I don't see why you wouldn't implement x == y as an equivalent, shorter syntax.

@JeffBezanson
Member

The main advantage of get is that you can specify a default value.
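
A sketch of the two-argument form, on a hypothetical `SketchNullable` stand-in written in current Julia syntax. Converting the default to `T` is an assumption made for the example, not something stated in this thread.

```julia
# Hypothetical stand-in for the proposed type (not the code from this PR).
struct SketchNullable{T}
    hasvalue::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(false)
    SketchNullable{T}(v) where {T} = new{T}(true, v)
end
SketchNullable(v::T) where {T} = SketchNullable{T}(v)

# Return the wrapped value if present, otherwise the caller's default
# (converted to T so both branches have the same type).
Base.get(x::SketchNullable{T}, default) where {T} =
    x.hasvalue ? x.value : convert(T, default)

configured = SketchNullable(443)
unset = SketchNullable{Int}()

get(configured, 8080)   # 443
get(unset, 8080)        # 8080
```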

I don't think an Option type or this Nullable type can or should specify exactly what value-missingness means. It's for any case where there could be a value, but there isn't one right now.

In C# Nullable is a value type, so mutability is not supposed to be implied here. But I think it would be ok to call this Option or Optional instead.

The name None has become very unfortunate since it is so confusingly different from python. Python's None is actually our nothing. Our None should be renamed not to sound like a generic null-ish value. We could use VoidType or EmptyType. Nothing should probably also be renamed NothingType or something like that, so the nothing/Nothing distinction is clearer. (A related change is that the Void alias for ccall should really refer to NothingType and not None.)

@StefanKarpinski
Member

I actually very much like our choices of names for None, Nothing and nothing and have found that people are mostly not confused by them since they are so semantically apt. Perhaps Python should change its naming instead ;-)

@kmsquire
Member

Regular expression matching would be another good target for this once it's merged.

@JeffBezanson
Member

Yes the names are apt, but I have seen people use Nothing instead of nothing several times and I can hardly blame them.
None is guilty of stealing a short, generic word for something that you almost never use, and should almost never use. Its aptness only gets it partially off the hook.

@johnmyleswhite
Member Author

Responses to various comments:

  • Should we allow ==? I'm not inclined to encourage people to compare Nullable objects. I'd rather that they test isnull and only compare values if actual values exist. == is kind of nuts no matter which perspective we take since all three possible implementations are weird: (1) == always raises an error, (2) == raises an error if at least one of the inputs is null, (3) == returns a Nullable{Bool} object when you do any comparison so that NULL == NULL => NULL. All seem bad. Universally raising errors seemed least bad.
  • There's not meant to be any direct commitment to either an epistemological or an ontological view of missingness here. To commit to an epistemological view, you need to implement three-valued logic, which I'd rather not do. Because this code mostly raises errors, it's closer to an ontological view. But I'd rather just offer a building block that lets you implement either.
  • I don't see a big need to implement functionality like getindex or iteration for Nullable since it won't increase the expressivity of the construct. But I could do it for symmetry if desired.
  • I personally think Nullable is a better name because it makes it easier to see that this construct is the basis for representing NULL. It is a little strange that you can't "null" a Nullable object, but I think it's at least defensibly strange.
  • I'm very much in favor of renaming None to EmptyType and Nothing to NothingType. The current names make those types seem much more useful than they are.
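
The first bullet can be sketched as follows: option (1) makes `==` raise unconditionally, and callers check `isnull` before comparing unwrapped values. `SketchNullable` and `equal_if_present` are hypothetical names, the code uses current Julia syntax, and the caller-supplied fallback is an assumption made for the example.

```julia
# Hypothetical stand-in for the proposed type (not the code from this PR).
struct SketchNullable{T}
    hasvalue::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(false)
    SketchNullable{T}(v) where {T} = new{T}(true, v)
end
SketchNullable(v::T) where {T} = SketchNullable{T}(v)

isnull(x::SketchNullable) = !x.hasvalue
Base.get(x::SketchNullable) =
    isnull(x) ? throw(ArgumentError("no value present")) : x.value

# Option (1): comparing two wrappers directly always raises.
Base.:(==)(a::SketchNullable, b::SketchNullable) =
    throw(ArgumentError("unwrap with get before comparing"))

# Caller-side pattern: test isnull first, compare only actual values.
# The caller decides what a comparison involving a missing value means.
function equal_if_present(a::SketchNullable, b::SketchNullable, fallback::Bool)
    (isnull(a) || isnull(b)) && return fallback
    return get(a) == get(b)
end

equal_if_present(SketchNullable(1), SketchNullable(1), false)       # true
equal_if_present(SketchNullable(1), SketchNullable(2), false)       # false
equal_if_present(SketchNullable(1), SketchNullable{Int}(), false)   # false (fallback)
```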

@StefanKarpinski
Member

I agree with the choice of Nullable as most intuitive to the most people. Option is a weirdly broad term that only suggests the right meaning to a very small set of people who will be using this. None doesn't actually need a name – we can write it as Union(). I don't care for renaming Nothing – yes, people misuse it, but that's going to happen.

@johnmyleswhite
Member Author

Updated:

  • Tests run now
  • Null and NotNull are both replaced with Nullable

@johnmyleswhite
Member Author

I think the fact that nothing and Nothing differ only in case is the source of misuse, though. Imagine if int were a value of type Int.

@JeffBezanson
Member

+1 to John's responses.

I kind of like the idea of using only Union(). That should help clarify the un-useful nature of the beast. Adding lots of names for things was fun back when there were only 200 of them, but now removing names is a much greater virtue.

@johnmyleswhite
Member Author

+1 to Union

@IainNZ
Member

IainNZ commented Aug 27, 2014

None => EmptyType is not bad though, or NoneType, or anything verbose, because you rarely if ever want to type it.

@IainNZ
Member

IainNZ commented Aug 27, 2014

If we change the meaning of x = [] where will users even see None?

@JeffBezanson
Member

Here's an interesting idea: rename Nothing to Void. In C, void is for things that return but don't return a value, which is what we use nothing for. A ccall with return type Void actually returns nothing in julia. There are various hacks in the system to patch around the fact that None is not actually the correct type to map C's void. I used None at first because I figured Nothing would be a lie --- the C code does not return a julia nothing value. But None is a worse lie. We should just fix this.

Void is much less problematic because it's close to the C usage, and doesn't collide with expectations from other dynamic languages.

@johnmyleswhite
Member Author

I like the mix of Void and Union() a lot.

@StefanKarpinski
Member

So nothing is the singleton instance of Void? And None is just Union()? Not bad. I worry that the difference between nothing and Void is going to be very confusing though. In C, void is the type with no instances, so it's pretty confusing that it would have an instance in Julia. Ptr{Void} would not mean what it means now.

@JeffBezanson
Member

That's a fair point --- in the case of Ptr{Void}, Ptr{None} is actually correct: dereferencing it is an error. We could instead hack in an error for dereferencing Ptr{Nothing}, but then of course the hack has just moved elsewhere.

Interestingly the following C code seems to be legal:

void f() {
    void *p = 0;
    return *p;
}

int main() {
    f();
    return 0;
}

This compiles, runs, and doesn't segfault. So giving nothing and not touching memory when deref'ing a Ptr{Void} is actually not so different from C.

@johnmyleswhite
Member Author

Updated with a draft of a manual section. I'm really bad with RST, so please make sure I haven't done anything very stupid. In particular, I'm worried about the interaction of doctest with a snippet of code that's supposed to throw errors.

@StefanKarpinski
Member

So, I'm a bit concerned that Nullable(T) where T is a type is ambiguous: did you want Nullable{T}() or Nullable{DataType}(T)?

@JeffBezanson
Member

I also think Nullable{T}() is perfectly analogous to Dict{T,S}(). They both basically make empty containers. They should match; if one is bad then the other is bad too, and we need a different convention for empty containers.
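
To make the ambiguity concrete, here is a sketch on a hypothetical `SketchNullable` stand-in (current Julia syntax, not this PR's code). When the empty case is only ever written with a type parameter, passing a type as an argument has exactly one meaning: wrap the type object itself.

```julia
# Hypothetical stand-in for the proposed type (not the code from this PR).
struct SketchNullable{T}
    hasvalue::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(false)
    SketchNullable{T}(v) where {T} = new{T}(true, v)
end
SketchNullable(v::T) where {T} = SketchNullable{T}(v)

# Type parameter in braces, no arguments: an empty wrapper for Int values.
empty_int = SketchNullable{Int}()

# A type passed as an argument is treated like any other value,
# so this wraps the type object `Int` itself.
wraps_type = SketchNullable(Int)
typeof(wraps_type)   # SketchNullable{DataType}

# The same convention as an empty, typed Dict.
empty_dict = Dict{String,Int}()
```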

@johnmyleswhite
Member Author

I think that's a good argument for @eschnett's proposed solution.

@quinnj
Member

quinnj commented Sep 19, 2014

@JeffBezanson, note my proposal was Null{T}(x::Nullable{T}) = isnull(x) ? x : Null(T) with Null, not Nullable, meaning you would always get a null Nullable back when using Null, whether called on a null Nullable or non-null Nullable.

If we go with the Null and NotNull constructors, I don't see why Null{T}(x::T) = Nullable{T}() couldn't be had (in addition to the special case for Nullable above).

julia> NullableTypes.Null{T}(x::T) = Nullable{T}()
Null (generic function with 2 methods)

julia> Null(1)
Null(Int64)

julia> Null(Int)
Null(Int64)

@JeffBezanson
Member

You seem determined to introduce some f(x) whose behavior is subtly different based on whether x is a type. I simply don't see the advantage of this. Even if one considers it acceptable, I don't see how one can argue it is the simplest and least confusing option.

The danger here is that Nullable is extremely generic, parametrically polymorphic to the max: it makes equal sense for absolutely any value.

The function oftype of course has a similar sketchiness: oftype(1,1.0) and oftype(Int,1.0) both work. I don't love that either, but we can just barely get away with it because converting 1.0 to typeof(Int) doesn't make sense. However Null(x) easily makes sense for all x, so there is not much reason to sometimes take the type of x and other times not.

@quinnj
Member

quinnj commented Sep 19, 2014

No, that makes sense. Particularly the arguments for simplicity and the power of Nullable. I'd vote then to go with the Nullable{T}() and Nullable(x::T) options. Looking forward to kicking the tires on this some more (for the ODBC and SQLite packages).

@kmsquire
Member

@JeffBezanson wrote: "I also think Nullable{T}() is perfectly analogous to Dict{T,S}(). They both basically make empty containers. They should match; if one is bad then the other is bad too, and we need a different convention for empty containers."

I had argued for this change before:
#4871 (comment)

I think it would be better to have a consistent convention for creating typed containers (Arrays, Dicts, Sets, and the various containers in DataStructures.jl). Currently, Dicts and Sets are special.

@JeffBezanson
Member

See also #3214. I dislike things like Container(T) more and more, since it's totally unclear which are type parameters and which are elements. The plan is for Array to remain the lone exception until #1470 is fixed.

@eschnett
Contributor

It's probably way too late in the discussion to bikeshed the name of the type... I don't like the name Nullable, as this implies that one can perform a certain action on the respective object. For example, Comparable would imply that == is defined, and Printable would indicate that the type can be output.

Nullable does not indicate such a property; an nval::Nullable{Int} is an immutable object, and there is no operation e.g. null(nval) that would modify nval. Also, the notion of null is tied to C and pointers, which is very different from the implementation here, which is more efficient.

Haskell calls this type Maybe (you have maybe an int, and maybe you have nothing) -- a cool name, but it takes a bit of getting used to. Boost calls it Optional (you may have an int, or you may not) -- this is probably a good name that everybody immediately understands.

I like Optional. To check whether an optional value is present, one could call a function ispresent (instead of isnull).

@kmsquire
Member

+1 for consistency and Nullable{T}() then!

@JeffBezanson
Member

@eschnett I agree with everything you've said in this thread, including that it is too late to bikeshed the name :)

I'm actually not particularly attached to Nullable, and would be ok with Maybe or Optional or perhaps Opt if you're into the whole brevity thing. But as I said above Nullable is a well-established term of art that does not imply mutability. Interestingly, your examples Comparable and Printable also do not involve mutation. Nullable does in fact imply certain non-mutating methods, like isnull and get. Your argument actually supports the position that -able is not tied to mutation; it does not support your stated position.

@nalimilan
Member

@eschnett I think one of the points in favor of the Nullable term is that in SQL missing values are called NULL, and dealing with missing data is one of the big interests of this new type. Nullable is also called that way in C# and Java, though Option and Maybe seem to be equally popular, according to Wikipedia.

@eschnett
Contributor

The term "Nullable" indicates that there is some kind of operation that the object supports, namely "nulling" it. This does not really indicate that (a) there is a function isnull, or (b) there may not be a value present. That's what I meant when I spoke about modifying -- the term "nulling" sounds as if something could be modified, and that's not the case. I didn't mean to imply that the suffix "-able" indicates mutability, as you agree.

I guess we come from different programming language backgrounds. When I compare https://en.wikipedia.org/wiki/Nullable_type and https://en.wikipedia.org/wiki/Option_type, then I'd place Julia firmly in the latter category...

@JeffBezanson
Member

I read the Nullable type article, and nowhere does it mention an operation of "nulling" a value. The Option type article says "Outside of functional programming, these are known as nullable types." That seems to mean different kinds of programming have different names for the same thing, not that there are different kinds of option types (e.g. mutable vs immutable).

We agree that "X is nullable" does not imply that X supports some mutating operation. Why then would you say that "nullable" implies "nulling", which is a mutating operation? Maybe "nulling" means "constructing a similar value that is null". So I think the term is reasonable, making some allowance for the limitations of human language.

@johnmyleswhite
Member Author

FWIW, I think the argument about names is not likely to prove fruitful.

First off, the English suffix "-ble" does not precommit the earlier morpheme to any specific interpretation: compare livable, visible, defensible, potable, etc. Some of these involve transitive verbs, but some do not.

Second off, our type isn't identical to an Option or Maybe, since it's not a tagged union, but a distinct parametric type. This is a somewhat minor point, but using a distinct name will help to keep the type theory folks from complaining about the use of terms that they perceive to have highly specialized meanings.

@JeffBezanson
Member

I agree the naming debate is not very fruitful, I was just starting to enjoy it :)

@JeffBezanson
Member

With the Nullable{T}() change I would like to merge this.

@johnmyleswhite
Member Author

Updated to use Nullable{T}(). Should be good to go now.

@johnmyleswhite
Member Author

And Travis gives us the green light.

@johnmyleswhite
Member Author

Bump.

JeffBezanson added a commit that referenced this pull request Sep 20, 2014
Add a parametric Nullable{T} type
@JeffBezanson JeffBezanson merged commit 7f47e6b into master Sep 20, 2014
@IainNZ
Member

IainNZ commented Sep 20, 2014

🍰
