-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: add Histogram types #6601
RFC: add Histogram types #6601
Conversation
+1 for a Histogram type |
+1 for a Histogram type and to @jiahao's suggestion. |
Okay, there's now a single The question is now what to do with matrix arguments: if
|
One other thing to think about: which way should we round? (i.e. should the bins be upper- or lower-inclusive)? At the moment, we round down, i.e. with an edge vector
This is the same as the R default (which has the option for either), but the opposite of numpy (which doesn't allow a choice). The historical reason for this choice was that pre-
However now this shouldn't be a problem, either way should give reasonable answers. If we want to allow the option of either, a neat way to do this would be to add another type parameter for |
I'm not sure it matters much, but it seems to me it's more common to use intervals closed on the left and open on the right, i.e. |
I've added an optional However I think it makes sense to keep the current behaviour (
|
+1 for moving this into |
is = if h.interval == :right | ||
map((edge, x) -> searchsortedfirst(edge,x) - 1, h.edges, xs) | ||
else | ||
map(searchsortedlast, h.edges, xs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this is an argument for having searchsortedfirst
return 1 less by default? cf. #5664
@ViralBShah That would perhaps be the easiest in terms of upgrade path, as we could deprecate the existing functionality gradually without breaking too much. |
@simonbyrne As you like, though I'm not sure the comparison really has practical applications. But wouldn't it be more explicit to call the argument |
@nalimilan |
Not a big thing, but wouldn't |
@BobPortmann I had in mind that the same structure could be used for weighted observations, or could be normalised to one (which is why I didn't restrict the field to be |
So I take it that no one is opposed to moving this to StatsBase.jl? |
I would think so. |
+1 for putting histograms in StatsBase.jl |
Late to this party. Agree wholeheartedly with the move. |
Why isn't this closed? |
Because no one got to it. @simonbyrne want to do the honors, or is there something else here keeping this open? |
Bump. |
Sorry, been away. Yep, now in StatsBase. |
This adds explicit types for histograms. The main advantage is that we can define methods that operate on histograms, notably
plot
. This approach closely mirrors existing functionality, but there are some alternatives that may be worth thinking about:hist
to create both 1d and 2d versions: this could be done using tuples (similar to thekde
function in KernelDensity.jl).+
for this?