normalize unions made in subtyping #49276

Merged 2 commits on Apr 7, 2023

Sponsor Member

@vtjnash vtjnash commented Apr 6, 2023

We observed a case where simple_tmeet made a Union of egal things, which is undesirable. The result was also not sorted, as is normally done, and, theoretically, simplification with omit_bad_union to remove S could similarly result in a Union that should be further simplified to remove redundancies.

Union{Union{Val{T}, S} where T<:AbstractString, Union{Val{T}, Int64} where T<:AbstractString} where S

(In principle, that simplification might also be possible to do during the original jl_type_union call when flattening it.)
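As a rough illustration of the two properties discussed here (no egal duplicates, and a sorted result), the following is a minimal sketch on a toy model. It is not the code in src/jltypes.c: each union component is reduced to a hypothetical integer id, two components count as egal exactly when their ids match, and `normalize_components` is an invented name. The ordering used is plain integer order, which is only a stand-in for whatever order the real implementation uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in: each union component is an integer id, and two
   components are "egal" exactly when their ids are equal. */
static int cmp_component(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort the components into a canonical order, then drop egal duplicates
   in place. Returns the number of components that remain. */
size_t normalize_components(int *comps, size_t n)
{
    if (n == 0)
        return 0;
    qsort(comps, n, sizeof(int), cmp_component);
    size_t out = 1;
    for (size_t i = 1; i < n; i++) {
        /* After sorting, egal components are adjacent. */
        if (comps[i] != comps[out - 1])
            comps[out++] = comps[i];
    }
    return out;
}
```

In this model, the union from the example above would contain the same id twice for `Val{T} where T<:AbstractString` once `S` has been removed, and the pass would keep a single copy.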

This prevents the bad inference path seen in #48228 without fixing the underlying type system issue there (which can still be reproduced on master by starting inference at one of the later stack frames, where the complex unions are already present).

@vtjnash vtjnash added the types and dispatch Types, subtyping and method dispatch label Apr 6, 2023
@vtjnash vtjnash requested a review from N5N3 April 6, 2023 16:22
@vtjnash vtjnash marked this pull request as ready for review April 6, 2023 16:22
Member

gbaraldi commented Apr 6, 2023

What does that union simplify to then?

Sponsor Member Author

vtjnash commented Apr 6, 2023

Without simplification, there may be 2 copies of Val{T} where T<:AbstractString in the result

Member

gbaraldi commented Apr 6, 2023

Oh, so the normalization here moves the where out because it's the same for both parts of the union, got it.

Sponsor Member Author

vtjnash commented Apr 6, 2023

The normalization here replaces S with Union{}; that other simplification is something I am working on now.

src/jltypes.c Outdated
Comment on lines 544 to 545
(!has_free && !jl_has_free_typevars(temp[j]) &&
jl_subtype(temp[i], temp[j]))) {
Member

@N5N3 N5N3 Apr 7, 2023

Perhaps we should follow the rule below (see subtype path in simple_join)?
Although this path has been turned off

// issue #24521: don't merge Type{T} where typeof(T) varies
        !(jl_is_type_type(a) && jl_is_type_type(b) && jl_typeof(jl_tparam0(a)) != jl_typeof(jl_tparam0(b)))) {

Sponsor Member Author

Good observation. I moved that here. I wanted to be careful not to make too big of a change at once, in case some package depends significantly on it behaving one particular way. Maybe we should try to move more of that simple_tmeet function into simple_union, but I think a couple of things there are heuristic-based, so maybe not all of it should be combined.
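To make the guard under discussion concrete, here is a minimal sketch of a subsumption pass on a toy model. Everything in it is hypothetical: `toy_type`, `toy_subtype`, `can_absorb` and `drop_subsumed` are invented for illustration, a type is modelled as a bitmask of the concrete leaves it covers, and the flags merely stand in for `jl_has_free_typevars`, `jl_is_type_type` and `jl_typeof(jl_tparam0(x))`. It only mirrors the shape of the condition quoted in this thread, not the actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy type, not a jl_value_t. */
typedef struct {
    unsigned leaves;     /* bitmask of concrete leaf types covered */
    bool has_free_vars;  /* stands in for jl_has_free_typevars(x) */
    bool is_type_type;   /* stands in for jl_is_type_type(x) */
    int param_kind;      /* stands in for jl_typeof(jl_tparam0(x)) */
} toy_type;

/* In this toy model, a <: b when every leaf of a is also a leaf of b. */
static bool toy_subtype(const toy_type *a, const toy_type *b)
{
    return (a->leaves & ~b->leaves) == 0;
}

/* May `a` be dropped because `b` already covers it? */
static bool can_absorb(const toy_type *a, const toy_type *b)
{
    /* Only merge when neither side has free type variables. */
    if (a->has_free_vars || b->has_free_vars)
        return false;
    /* Issue #24521 rule: don't merge Type{T} where typeof(T) varies. */
    if (a->is_type_type && b->is_type_type && a->param_kind != b->param_kind)
        return false;
    return toy_subtype(a, b);
}

/* Remove every element covered by another surviving element, compacting
   `temp` in place. Returns the new element count. */
size_t drop_subsumed(toy_type *temp, size_t n)
{
    bool *dropped = calloc(n ? n : 1, sizeof(bool));
    if (dropped == NULL)
        return n; /* allocation failed: leave the input unchanged */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (i != j && !dropped[j] && can_absorb(&temp[i], &temp[j])) {
                dropped[i] = true;
                break;
            }
        }
    }
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (!dropped[i])
            temp[out++] = temp[i];
    }
    free(dropped);
    return out;
}
```

Skipping already-dropped elements in the inner loop keeps one survivor when two elements are mutual subtypes, rather than dropping both.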

Member

@N5N3 N5N3 Apr 7, 2023

> but I think a couple things are heuristic based there, so maybe not all of it should be combined.

I think a basic rule is keeping the pointer identity if possible?
#49277 seems to be a good template, and if we do that, I think we can move all type-specific branches into simple_union and let simple_join handle the non-type bits alone.

BTW, I didn't measure the performance impact.
But if we decide to turn this path on, then a good optimization to have is skipping all checks when temp[i] and temp[j] come from the same part.

Sponsor Member Author

I was thinking of this check mostly, since I don't know if this is accurate for a Union in general:

    if (jl_is_typevar(a) && obviously_egal(b, ((jl_tvar_t*)a)->lb))
         return a;

That may be a good thing to check for during jl_type_union too, to save repeated work in the case of repeatedly appending one element (e.g. Union{Union{Union{A, B}, C}, Union{D, E}}, since we only need to check the new elements against the old ones). A starting point for that might be to alloca a second buffer that records the index at which each union starts in the filled buffer. I think the trickiest cases to detect are ones like this: "obviously" it can return the first object, but since we only do a subtype check (not equality), we may conclude the result should be that of prepending A to the second Union, and fail to realize that this gives back the first Union:

Union{ Union{A,B,C}, Union{B,C} }

That suggests perhaps we should also do the same exhaustive sort of check afterwards too, as templated in #49277, to see if all the elements remaining are equivalent to an existing one. Once we merge this and #49277, I can try prototyping that additional check.

Member

@N5N3 N5N3 Apr 7, 2023

> I was thinking of this check mostly, since I don't know if this is accurate for a Union in general:

I think it's accurate for Union, at least for those without free vars?
But since Union doesn't normalize it for now, I think we can leave these 2 branches in simple_join.

Member

> That suggests perhaps we should also do the same exhaustive sort of check afterwards too, as templated in #49277

Hoisting some egal checks here seems good?
Then we just need to count how many of the remaining temp entries come from a or b.
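The counting idea in this thread can be sketched on a toy model as follows. This is an assumption-laden illustration, not the approach taken in #49277 or in jltypes.c: `toy_union`, `contains`, `toy_join` and `TOY_MAX` are hypothetical, components are integer ids that are egal exactly when equal, and each input is assumed to be free of duplicates already. Only cross-origin pairs are compared, and the function returns one of its inputs unchanged whenever the other side contributes nothing new, which preserves pointer identity for cases like `Union{ Union{A,B,C}, Union{B,C} }`.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 16

/* Hypothetical toy union: a duplicate-free list of component ids. */
typedef struct {
    size_t n;
    int elems[TOY_MAX];
} toy_union;

static bool contains(const toy_union *u, int id)
{
    for (size_t i = 0; i < u->n; i++) {
        if (u->elems[i] == id)
            return true;
    }
    return false;
}

/* Join a and b. Returns a or b itself when it already covers the other
   side; otherwise fills `out` and returns it. Returns NULL if the result
   would not fit in TOY_MAX components. */
const toy_union *toy_join(const toy_union *a, const toy_union *b,
                          toy_union *out)
{
    size_t new_from_a = 0, new_from_b = 0;
    /* Count the elements each side contributes that the other lacks.
       Elements from the same origin are never compared to each other. */
    for (size_t i = 0; i < b->n; i++) {
        if (!contains(a, b->elems[i]))
            new_from_b++;
    }
    for (size_t i = 0; i < a->n; i++) {
        if (!contains(b, a->elems[i]))
            new_from_a++;
    }
    if (new_from_b == 0)
        return a; /* b adds nothing: keep a's identity */
    if (new_from_a == 0)
        return b; /* a adds nothing: keep b's identity */
    if (a->n + new_from_b > TOY_MAX)
        return NULL;
    *out = *a;
    for (size_t i = 0; i < b->n; i++) {
        if (!contains(a, b->elems[i]))
            out->elems[out->n++] = b->elems[i];
    }
    return out;
}
```

With ids 1, 2, 3 standing for A, B, C, joining `{1,2,3}` with `{2,3}` in either argument order hands back the three-element input itself instead of building a fresh, equivalent union.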

@vtjnash vtjnash force-pushed the jn/subtype-simplify-tmeet-union branch from 7e49d0a to 2e66297 Compare April 7, 2023 13:53
@vtjnash vtjnash merged commit 02704d9 into master Apr 7, 2023
@vtjnash vtjnash deleted the jn/subtype-simplify-tmeet-union branch April 7, 2023 18:30
Xnartharax pushed a commit to Xnartharax/julia that referenced this pull request Apr 19, 2023
We observed a case where simple_tmeet made a Union of egal things, which
is undesirable. There also was no sorting of the result, as is normally
done, and theoretically, simplification with an omit_bad_union to remove
`S` could similarly result in a Union that should be further simplified to
remove redundancies.
```
Union{Union{Val{T}, S} where T<:AbstractString, Union{Val{T}, Int64} where T<:AbstractString} where S
```
(In principle, that simplification might also be possible to do during
the original jl_type_union call when flattening it: see JuliaLang#49279)