Split up the package? #310
Comments
What about it is cumbersome? Splitting off parts that don't depend on anything else might be OK, though I don't think the depwarn fix is a good reason for it, and I don't see how splitting will help there. The depwarn fix would then need to be done in multiple packages rather than one. |
That wasn't really an argument for splitting up so much as just a comment on the fact that it's big enough that fixing depwarns is annoying to do all at once. |
I also think this package is unnecessarily large. It can reasonably be changed to simply re-export the exports of smaller packages which become dependencies, to maintain the current convenience while allowing more modular development |
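To make the re-export idea concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: the inline `Heaps` and `Deques` modules stand in for separately registered packages, `heap_minimum` and `push_front` are placeholder functions rather than any real API, and `DataStructuresSketch` is a hypothetical name for the umbrella module. In a real package the same effect is commonly achieved with the `@reexport` macro from Reexport.jl.

```julia
# Hypothetical stand-ins for two split-out packages. In practice each would
# live in its own repository and be loaded as a regular dependency.
module Heaps
export heap_minimum
# Placeholder function, not the real heap API.
heap_minimum(xs) = minimum(xs)
end

module Deques
export push_front
# Placeholder function, not the real deque API.
push_front(xs, x) = vcat([x], xs)
end

# The umbrella module loads each piece and re-exports its names, so
# existing users keep a single entry point.
module DataStructuresSketch
using ..Heaps, ..Deques
export heap_minimum, push_front
end

using .DataStructuresSketch

heap_minimum([3, 1, 2])
push_front([2, 3], 1)
```

With this layout, a package that only needs heaps could depend on the small package directly, while code that loads the umbrella module keeps working unchanged.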
Exactly what I was thinking, @TotalVerb. |
Yes, please let's split it up. Breaking it into separate repos will make it easier to decruft it. For example, all the Sorted stuff (see breakdown in next post) was written by @StephenVavasis. There are basically 5 packages living in here, all with their own core author list. I spent an hour or so checking and breaking down a potential split of the package. |
This is just the files in /src/:
Heaps.jl: Anything to do with heaps.
SortedDataStructures.jl: It's basically data structures that have optimised performance because they
SetDataStructures.jl: They all act like sets (fast element-of checking).
AssociativeDataStructures.jl
LinearDataStructures.jl
These 5 packages are basically not interdependent. |
Good ideas. I'm not sold on the specific names but otherwise +1. |
Mostly true, but note that an |
Ah yes, I missed that one. @ararslan and I had a discussion on specific names the other day. |
A pro for splitting may be maintainability, but since these are not in Base, having all the main data structures in one place may be good for discoverability. Just one question on it: for using (e.g. precompiling), is the size slowing things down? I mean, does Julia need to compile more than you actually use of a package? If it does now, can it be deferred in a future version of Julia? |
Yes, and probably? |
OK, split according to the splits covered by #310 (comment). DataStructures.jl will remain as a MetaPackage that does all of them. So the actual concrete plan: The branch with the combined meta package is a bit harder. Then once that is all set up, everything can cruise along for a little while. Once we are happy all is well, we push all the split branches off to their own remotes and register them. We close all issues and PRs here. We then delete all the tests in DataStructures.jl, and maybe tag Also, sorting out the docs needs to fit somewhere in this plan. So that is the plan. (@vanbujm this was the complex git game I was talking to you about a few weeks back) |
No. I disagree. If it comes to it we can always move commits around later via the magic of git. |
The first split package Heaps.jl is up at https://github.com/JuliaCollections/Heaps.jl I am not porting anything deprecated across. It is now clear to me we will either need a DataStructuresBase.jl For my own reference the commands I need to remove things out of git history and compact it are:
|
FWIW, I'd like to use a |
@rofinn I can create that and add you to it |
Thanks |
@rofinn sorry that I didn't get on with and finish the split. Do you think Tries are good separate from other MembershipCollections? |
Would those implementations likely need to share code or define some common abstract parent type? IMHO, folks usually just want a specific data structure for their use case, so I'd only be in favour of that if it promoted some common API or avoided code duplication. |
Another motivating reason for splitting packages is to save time on benchmarking. |
No, Heaps.jl should just be recreated. However, I have been thinking about this a bit more and I am no longer sure it is worth splitting up. |
That sounds reasonable. Shall we close this issue? Regarding the benchmarking concern, we can do the following:
It would be really cool if PkgBenchmark could select the benchmarks to run based on code changes, but I don't believe that capability is on the roadmap. |
This package has become quite large and is something of a heavyweight dependency for packages that just need a single type. An example of that is StatsBase, which only uses DataStructures for the arrays as heaps stuff. Indeed, just going through and trying to fix the deprecation warnings in this package for more recent Julia versions has proved quite cumbersome. It seems to me that for the sake of maintainability, the easily separable pieces of this package could be split off into separate packages, say like Heaps.jl, Deques.jl, etc. That way packages can just pull in the functionality they need. Thoughts?