Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encode bound information as type parameter #102

Merged
merged 8 commits into from
Jun 12, 2020
Merged

Conversation

omus
Copy link
Collaborator

@omus omus commented Jun 10, 2020

When updating the Intervals.jl to support unbounded ranges I encoutered an issue where the interface we had made it difficult to define unbounded ranges. For example if we update Inclusivity to have unbounded fields then constructor would take 4 bools: left open/closed, right open/closed, left unbounded, right unbounded:

Interval(1, 2, Inclusivity(false, true, true, false))  # `(,2]`

(Aside: note we still store a value for the unbounded side as we want Interval to be a bits type. The value here ultimately is ignored as it is unbounded)

This interface isn't very intuitive and needs some revision. While looking into reducing storage size of intervals I discovered that we can reduce the size of an interval in memory by having the bound information encoded in the the type parameters. The clearest way to do this is to use symbols. Using the same example as above we could have:

Interval{:unbounded, :closed}(1, 2)

With the option to alternatively construct the same interval with in the future:

Interval(nothing, 2)

This PR currently implements minimal working changes that support encoding bound information as type parameters. At this point there is no support for unbounded ranges. If we like this interface I can update the remainder of the code. If we don't like this I could work on an alternative interface such as:

Interval(1, 2, :unbounded, :closed)

@omus
Copy link
Collaborator Author

omus commented Jun 10, 2020

Documentation is currently causing test failures

@iamed2
Copy link
Member

iamed2 commented Jun 10, 2020

I would suggest making them singleton struct types instead of symbols, since I think they're easier for the compiler to optimize out. Plus then you could have:

abstract type Endpoint end
abstract type Bounded <: Endpoint end
struct Unbounded <: Endpoint end
struct Open <: Bounded end
struct Closed <: Bounded end

and then we could define methods which only work on bounded intervals more cleanly. We could also have:

struct AnchoredInterval{P, T, L, R} where {L<:Bounded, R<:Bounded} <: AbstractInterval{T, L, R}

@iamed2
Copy link
Member

iamed2 commented Jun 10, 2020

I like this overall. It's nice to be able to say const ClosedInterval = Interval{T, Closed, Closed} or const Everything = Interval{T, Unbounded, Unbounded}.

Plus it seems unlikely that you'll want to store packed arrays with different inclusivities, which is the thing you need to worry about when adding type parameters.

@omus
Copy link
Collaborator Author

omus commented Jun 10, 2020

I would suggest making them singleton struct types instead of symbols, since I think they're easier for the compiler to optimize out. Plus then you could have:

abstract type Endpoint end
abstract type Bounded <: Endpoint end
struct Unbounded <: Endpoint end
struct Open <: Bounded end
struct Closed <: Bounded end

and then we could define methods which only work on bounded intervals more cleanly. We could also have:

struct AnchoredInterval{P, T, L, R} where {L<:Bounded, R<:Bounded} <: AbstractInterval{T, L, R}

I can give this a try again. In a previous incarnation I wanted to do this and also use symbols in constructors as a convenient syntax but I ran into some issues. I filed an issue on this: JuliaLang/julia#36213

@codecov
Copy link

codecov bot commented Jun 10, 2020

Codecov Report

Merging #102 into master will decrease coverage by 26.30%.
The diff coverage is 42.22%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #102       +/-   ##
===========================================
- Coverage   96.25%   69.94%   -26.31%     
===========================================
  Files           8        8               
  Lines         267      376      +109     
===========================================
+ Hits          257      263        +6     
- Misses         10      113      +103     
Impacted Files Coverage Δ
src/deprecated.jl 1.85% <0.00%> (-38.15%) ⬇️
src/description.jl 100.00% <ø> (ø)
src/Intervals.jl 100.00% <100.00%> (ø)
src/anchoredinterval.jl 98.21% <100.00%> (ø)
src/endpoint.jl 93.54% <100.00%> (+0.44%) ⬆️
src/inclusivity.jl 100.00% <100.00%> (ø)
src/interval.jl 97.19% <100.00%> (-0.06%) ⬇️
src/plotting.jl 94.11% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91e7474...f24a406. Read the comment docs.

@iamed2
Copy link
Member

iamed2 commented Jun 10, 2020

I commented on that issue. I don't think your use of type parameters there was sound.

@nickrobinson251
Copy link
Contributor

I would suggest making them singleton struct types

Is there any reason to have some singletons as structs? Can they all be abstract types?

@omus
Copy link
Collaborator Author

omus commented Jun 10, 2020

I would suggest making them singleton struct types

Is there any reason to have some singletons as structs? Can they all be abstract types?

Making them struct makes them leaf types. I don't expect we'd want to make subtypes of those bounds.

@iamed2
Copy link
Member

iamed2 commented Jun 10, 2020

It also means we can instantiate them, which can be easier for passing and dispatching on argument types. All traits in base Julia are like that.

@nickrobinson251
Copy link
Contributor

nickrobinson251 commented Jun 10, 2020

It seems a little wonky to me to be able to write Unbounded() but not Bounded() (and/or to mix dispatching on ::Type{T} with dispatching on ::T).

@omus
Copy link
Collaborator Author

omus commented Jun 10, 2020

I should note that Endpoint is already a type so we'll end up with this for the initial version which doesn't have Unbounded support:

abstract type Bound end
struct Open <: Bound end
struct Closed <: Bound end

When including Unbounded we could have:

abstract type Bound end
abstract type Bounded <: Bound end
struct Unbounded <: Bound
struct Open <: Bounded end
struct Closed <: Bounded end

I think I'd probably leave out Bounded to start with.

@omus
Copy link
Collaborator Author

omus commented Jun 10, 2020

I should also point out that IntervalSets.jl uses Interval{L,R,T} where L and R are symbols. I think Interval{T,L,R} is a better ordering as I expect Interval{T} will be used more often when it comes to dispatch. Additionally this order happens to be non-breaking for Intervals.jl but it is still good to discuss our ideal parametric ordering.

@omus omus mentioned this pull request Jun 10, 2020
@omus
Copy link
Collaborator Author

omus commented Jun 10, 2020

As the response to this proposed change has been positive I'll complete the remainder of the work:

  • Add deprecation warnings
  • Update code to no longer use deprecated functions

src/Intervals.jl Outdated Show resolved Hide resolved
@omus omus changed the title WIP: Encode bound information as type parameter Encode bound information as type parameter Jun 11, 2020
@omus
Copy link
Collaborator Author

omus commented Jun 11, 2020

PR should now be complete and can be reviewed at this time. I'll call out the couple of minor negative points to do with this change:

  1. Interval{Open, Closed}(1, 2) works as you'd expect but Interval{Open, Closed} won't get you the type you expect. Instead you need to write Interval{T,Open,Closed} where T
  2. The HE and HB constructors have no way of supplying an alternative bound types. I don't believe this an issue and HourEnding{L,R}(...) and HourBeginning{L, R}(...) works as you expect (Note: the HE and HB perform ceil/floor to the nearest hour)

end

function Base.convert(::Type{Interval{T}}, interval::AnchoredInterval{P, T}) where {P, T}
return Interval{T}(first(interval), last(interval), inclusivity(interval))
function Base.convert(::Type{Interval{T}}, interval::AnchoredInterval{P,T,L,R}) where {P,T,L,R}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a situation where this function is used over the one above? or more so, would there ever be a difference?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signatures of these two functions is identical with the updated AnchoredInterval type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need them both then?

Copy link
Collaborator Author

@omus omus Jun 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misinterpreted your original question. The actual answer to your initial question is that there is a difference between the two functions. One is called as convert(Interval, ...) which is just "convert this to a subtype of Interval" where convert(Interval{Int64}, ...) is "convert this to a subtype of Interval where the element type is Int64".

We could combine these so the signature is:

Base.convert(::Union{Type{Interval}, Type{Interval{T}}, ...)

I may make that change in another PR.

Copy link
Member

@fchorney fchorney Jun 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just me trying to wrap my head around multiple dispatch for certain things. I get that one is just strictly Interval and the other is Interval{T}, but doesn't the Interval{T} one stipulate that the T has to be the same type as the T in the AnchoredInterval{P,T,L,R}?, and therefore you wouldn't be able to do something like convert(Interval{Float64}) if the T in the AnchoredInterval was an Int64 per se? Perhaps I am misunderstanding, and both T's don't actually need to match?

src/anchoredinterval.jl Outdated Show resolved Hide resolved
src/description.jl Show resolved Hide resolved
@nickrobinson251
Copy link
Contributor

The HE and HB constructors have no way of supplying an alternative bound types

🤷 Different issue, but HE and HB are the least nice interface in this package already #105

@omus
Copy link
Collaborator Author

omus commented Jun 11, 2020

Just applied Fernando's change and corrected the doctests

Copy link
Member

@fchorney fchorney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@omus omus merged commit 78b125a into master Jun 12, 2020
@omus omus deleted the cv/parametric-bounds branch June 12, 2020 12:33
@omus omus mentioned this pull request May 26, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants