Skip to content
This repository has been archived by the owner on May 4, 2019. It is now read-only.

Broadcasted not (.!) giving inconsistent return type when DataArray in expression #263

Open
joshbode opened this issue Jun 15, 2017 · 14 comments

Comments

@joshbode
Copy link
Contributor

joshbode commented Jun 15, 2017

The broadcasted not operator .! (new in 0.6) is returning results with unexpected type when used directly on expressions containing a DataArray, e.g.

isa(.!isna.(x), DataArray) == true

whereas, if the BitArray result is stored in an intermediate variable:

y = isna.(x)
isa(.!y,  BitArray) == true

Example:

0.6.0-rc3.0> using DataArrays
0.6.0-rc3.0> x = DataArray(1:3);

0.6.0-rc3.0> isna.(x)
3-element BitArray{1}:
 false
 false
 false
# looks correct

0.6.0-rc3.0> .!isna.(x)
3-element DataArrays.DataArray{Bool,1}:
 true
 true
 true
# return has become a DataArray{Bool}

0.6.0-rc3.0> y = isna.(x); .!y
3-element BitArray{1}:
 true
 true
 true
# return is a BitArray as expected

Possible explanation - the code expansion looks rather more complicated than expected:

0.6.0-rc3.0> expand(:(.!f.(x)))
:($(Expr(:thunk, CodeInfo(:(begin
        $(Expr(:thunk, CodeInfo(:(begin
        global ##19#20
        const ##19#20
        $(Expr(:composite_type, Symbol("##19#20"), :((Core.svec)()), :((Core.svec)()), :(Core.Function), :((Core.svec)()), false, 0))
        return
    end))))
        $(Expr(:method, false, :((Core.svec)((Core.svec)(##19#20, Core.Any), (Core.svec)())), CodeInfo(:(begin
        #temp#@_3 = f(#temp#@_2)
        return !#temp#@_3
    end)), false))
        #19 = $(Expr(:new, Symbol("##19#20")))
        SSAValue(0) = #19
        return (Base.broadcast)(SSAValue(0), x)
    end)))))

versus:

0.6.0-rc3.0> expand(:(f.(x)))
:((Base.broadcast)(f, x))

0.6.0-rc3.0> expand(:(.!y))
:((Base.broadcast)(!, y))

which possibly suggests it is an upstream issue, independent of DataArrays.

@joshbode
Copy link
Contributor Author

Two workarounds:

0.6.0-rc3.0> .!begin isna.(x) end
3-element BitArray{1}:
 true
 true
 true

0.6.0-rc3.0> .!(_ = isna.(x))
3-element BitArray{1}:
 true
 true
 true

@ararslan
Copy link
Member

The complex code expansion I believe is expected since this is a fusing dot operation. Note that you get something similar with expand(:(a .+ b .- c)), for example.

As for this bug (and the fact that those workarounds work!), that is bizarre...

@joshbode
Copy link
Contributor Author

joshbode commented Jun 15, 2017

I'm not sure the expansion is correct... it involves the expression-head :composite_type for example, which sounds like the dot might be being interpreted as field access perhaps? Don't know enough about the parser, but looks strange to me.

Also, for your example, the same trick works:

0.6.0-rc3.0> expand(:(a .+ begin b .- c end))
:($(Expr(:thunk, CodeInfo(:(begin
        # meta: location REPL[50] # REPL[50], line 1:
        # meta: pop location
        SSAValue(0) = (Base.broadcast)(-, b, c)
        return (Base.broadcast)(+, a, SSAValue(0))
    end)))))

Reported upstream, in any case - I don't think this is specifically a DataArrays issue.

@ararslan
Copy link
Member

Indeed. Thanks for the report. In the meantime I'll see if there's anything we can do here to work around the weird return type.

@joshbode
Copy link
Contributor Author

No worries. In the meantime, I guess !isna.(x) does work - though !(::BitArray) is now deprecated (which is how I got here in the first place!).

@ararslan
Copy link
Member

broadcast(!, isna.(x)) has the right return type (though oddly enough broadcast(!isna, x) does not), so you could do that for now if you want to avoid the deprecation warnings.

@joshbode
Copy link
Contributor Author

Good idea - thanks

@nalimilan
Copy link
Member

I think this is JuliaLang/julia#19313. The problem is that multiple operations using the dot syntax are fused into a single one by the parser, so broadcast is not called on isna, but on x -> !isna(x), and the special-casing to return a BitArray rather than a DataArray{Bool} isn't used.

@joshbode
Copy link
Contributor Author

Yep - looks like the same issue

@ararslan
Copy link
Member

Unfortunately with fused broadcast there doesn't appear to be an easy way to specialize on a combination of functions, so we'll have to find another way to make .!isna.(x) return the desired BitArray...

@nalimilan
Copy link
Member

I don't think there's a solution with the current system unfortunately. Except undeprecating isna(x)...

@joshbode
Copy link
Contributor Author

In the meantime, apart from not being able to entirely control the return vector type, I'm appreciating how awesome the broadcast-dot-syntax and loop fusion is :)

@ararslan
Copy link
Member

I don't think we should remove the implicitly vectorized isna deprecation, since then we're in conflict with how things are done in Base. Until we can figure something out, be it a change in Base for 0.7 or some way to work around it in 0.6, we can just recommend map(!, isna.(x)) (equivalent and shorter than broadcast(!, isna.(x))), I guess. :/

@nalimilan
Copy link
Member

I don't think undeprecating vectorized isna is a real problem. Thanks to the deprecation, all places which could use the new syntax have been changed now, and people are familiar with it. So we could silently reintroduce it for this problematic case, and deprecate it after 0.7 will have provided a way to fix this. broadcast(!, isna.(x))) is kind of hard to discover.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants