port to Julia 0.7 #182
- Remove Nullables and DataValues
- colnames now returns a tuple
- Remove NamedTuples dependency
- Get test_core.jl to pass
I think we should switch to I would also like to get rid of
Yes, that sounds right. Basically, do the same thing as
I would say that's the case. At least
src/table.jl
@@ -28,7 +27,7 @@ struct NextTable{C<:Columns} <: AbstractIndexedTable
     # Cache permutations by various subsets of columns
     perms::Vector{Perm}
     # store what percent of the data in each column is unique
-    cardinality::Vector{Nullable{Float64}}
+    cardinality::Vector{Any}
Should this be Union{Float64,Missing}?
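A minimal sketch of what the suggested element type would change, using hypothetical stand-in vectors rather than the actual `cardinality` field of `NextTable`. It assumes Julia 0.7 or later, where `missing` is built in:

```julia
# Hypothetical stand-ins for the `cardinality` field; not the package's actual code.
typed = Vector{Union{Float64,Missing}}(missing, 3)  # all entries start as missing
untyped = Vector{Any}(missing, 3)

typed[1] = 0.25
untyped[1] = 0.25

# The narrow element type rejects values that are neither Float64 nor missing,
# while Vector{Any} accepts anything.
rejected = try
    typed[2] = "not a number"
    false
catch err
    err isa MethodError
end
untyped[2] = "not a number"

# Drop the entries that have not been computed yet.
known = collect(skipmissing(typed))
```

With the narrow element type, a bad assignment fails at the point of the write, and `skipmissing` yields plain `Float64` values.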
TableTraits.jl (and the broader Queryverse.jl family of packages) will continue to use I'm kind of surprised that you aren't running into some of the same issues? This is essentially still the same set of problems that we discussed in JuliaData/Missings.jl#6 and https://discourse.julialang.org/t/missing-data-and-namedtuple-compatibility/8136. Have you benchmarked this branch? TableTraits.jl is ported to 0.7, do you want me to open a PR that reenables the TableTraits.jl integration?
.travis.yml
@@ -4,7 +4,7 @@ os:
   - linux
   - osx
 julia:
-  - 0.6
+  - nightly
Should add 0.7 and 1.0 as well.
Bump. It would be great to get this merged soon.
David's performance concern here is valid; it might be better to do everything but the Missing change first.
@davidanthoff I think @piever's iterator based approach with
It would certainly be helpful if some benchmarks were actually performed. @piever, do you happen to have any comparisons you did in doing the
Of course. I was just commenting on the point about not being able to use Iterators + missing.
Ah, okay, I hadn't schooled myself on that Discourse discussion, my bad.
@piever's code is great, but I wouldn't expect it to solve the performance problems. I might be wrong, so benchmarking and comparing would be a good idea.
At the time I mainly tested that it didn't lose performance compared to the previous inference based implementation in the type stable case: as OTOH, I think that JuliaData/Tables.jl#10 is managing to combine the good ideas from On a related note, I also seem to remember that some functions had yet to be ported to the new iteration framework (
-    v = [@NT(a = 1, b = 2), @NT(a = 1, b = 3)]
-    @test collect_columns(v) == Columns(@NT(a = Int[1, 1], b = Int[2, 3]))
+    v = [(a = 1, b = 2), (a = 1, b = 3)]
+    @test collect_columns(v) == Columns((a = Int[1, 1], b = Int[2, 3]))
It'd be nice to have a constructor for Columns using keyword args, so a level of parens can be dropped.
@test collect_columns(v) == Columns(a = Int[1, 1], b = Int[2, 3])
I think there is such a constructor; maybe this is just checking that passing a (named)tuple also works.
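The keyword-argument form discussed in the two comments above can be sketched as follows. `MyColumns` is a hypothetical stand-in defined only for illustration, not the IndexedTables `Columns` type, and the sketch assumes native named tuples (Julia 0.7 or later):

```julia
# `MyColumns` is a hypothetical stand-in, not the IndexedTables `Columns` type.
struct MyColumns{T<:NamedTuple}
    columns::T
end

# Keyword form: collect the keyword arguments into a named tuple and forward it
# to the default one-argument constructor.
MyColumns(; kwargs...) = MyColumns((; kwargs...))

from_tuple = MyColumns((a = Int[1, 1], b = Int[2, 3]))
from_kwargs = MyColumns(a = Int[1, 1], b = Int[2, 3])

# Compare the wrapped named tuples; `==` on the structs themselves would fall
# back to identity because no custom `==` is defined here.
same = from_tuple.columns == from_kwargs.columns
```

Both spellings end up with the same named tuple of columns, so a test can exercise either form.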
Yes, that is the scenario I would worry about. Doesn't seem like a corner case to me :) I think in general a good strategy would be to get this working on Julia 0.7 with the minimal set of changes, and then think about broader redesigns. No need to couple those two decisions and thereby delay everything by a lot.
Just pushed a commit that gets this further for 0.7/1.0, some remaining test failures involve:
Can anybody else pick things up from here? If not, I can try to take another crack or two over the next few days.
This is also affecting StatPlots, FWIU it is
The
Yes, I haven’t gotten the broadcasting stuff to work for DataValues.jl on 0.7 yet. Not sure that is used anywhere here, though, just a heads up.
BTW I did a simple map with tables of n missable/DataValue columns. It does scale exponentially with Nice work @quinnj! I'll try to fix up the tests tomorrow (they still use missing and there are a couple of failures related to reflection). You can continue to fix other things...
@quinnj did you happen to test the scaling of your
I wouldn’t expect unknown Schemas to be the problem, I think iterators that produce streams of values with heterogeneous types are the culprit (and those you easily get in a projection with
Got tests passing locally.
OK, this fails on 0.7 due to the conflicting export of
443f2e3 to 890e303
Now with more TableTraits.
@@ -431,23 +427,28 @@ end

 struct ApplyColwise{T}
     functions::T
-    names::Vector{Symbol}
+    names
Why this change?
Brought back the disabled tests. This should be pretty good now.
Merge?
Replace DataValues.jl with Union{Missing,T} and NamedTuples.jl with native named tuples. Get all but TableTraits.jl code to pass tests.
cc @davidanthoff I have commented out the tabletraits.jl code for now. Are you planning to make it possible to use it on 0.7 with Union{Missing, T}?
cc @quinnj
@piever maybe _is_subtype is now kind of redundant, since Union{Missing, T} can be tested with <: for the same property?
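For reference, a small sketch of how plain subtyping behaves on `Union{Missing, T}`. What `_is_subtype` checks in the package is not shown in this thread, so this only illustrates the built-in `<:` relation and makes no claim that it matches that helper:

```julia
# `S` is an illustrative alias, not a name from the package.
S = Union{Missing,Int}

checks = (
    Int <: S,                  # a member of the union is a subtype of it
    Missing <: S,              # so is Missing itself
    Float64 <: S,              # a type outside the union is not
    S <: Union{Missing,Real},  # unions compare member by member
)
```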