Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

delete_method: remove method from any pre-existing ambiguities #28916

Merged
merged 1 commit into from
Sep 10, 2018

Conversation

timholy
Copy link
Member

@timholy timholy commented Aug 27, 2018

Fixes #28899. Kinda amazing we never noticed this til now.

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ambiguity doesn't disappear, it should simply be no longer applicable during lookup

@timholy
Copy link
Member Author

timholy commented Aug 27, 2018

So you mean the fix should be here?

julia/src/gf.c

Lines 2006 to 2016 in 3ab56f1

JL_DLLEXPORT int jl_has_call_ambiguities(jl_value_t *types, jl_method_t *m)
{
if (m->ambig == jl_nothing)
return 0;
for (size_t i = 0; i < jl_array_len(m->ambig); i++) {
jl_method_t *mambig = (jl_method_t*)jl_array_ptr_ref(m->ambig, i);
if (!jl_has_empty_intersection(mambig->sig, types))
return 1;
}
return 0;
}

Since we don't store the max_world in the method itself, wouldn't you have to walk the method table's typemapentry and find instances of mambig? Doesn't that introduce an O(MN) step, where M is the number of methods previously determined to be possibly-ambiguous (i.e., length(m.ambig)) and N is the number of methods in the typemapentry? Or perhaps we should just get rid of the ambig field altogether and recompute everything by type intersection every time?

What's the downside of just removing it from m.ambig? (Maybe someone will reset the max_world and "undelete" the method?) Stated differently, what's the purpose of caching this in m.ambig at all?

@timholy
Copy link
Member Author

timholy commented Aug 28, 2018

Bump @vtjnash. I'm willing to implement it but I need to know more about why this isn't the right solution. AFAIK the ambig field is just a list of methods that should be checked by future dispatch events, and if the method has been deleted I am failing to understand why it would need to be checked.

@vtjnash
Copy link
Member

vtjnash commented Aug 28, 2018

Since we don't store the max_world in the method itself, wouldn't you have to walk the method table's typemapentry and find instances of mambig?

We can just store it there—we already have copies of many of the other fields (sig, min-world, name). Currently, it seems like the purpose of Method is essentially just to contain the information about how to find it in the method table.

checked by future dispatch events

It's only supposed to check them for dispatch events in the applicable world (e.g. if someone is using a generated function, any later change in method definitions shouldn't be visible to them)

@timholy
Copy link
Member Author

timholy commented Aug 28, 2018

OK, but redesigning Method seems a bit beyond the scope of this PR since it affects the whole compiler chain.

So, is reasonable to fix this by implementing the O(NM) algorithm?

@vtjnash
Copy link
Member

vtjnash commented Aug 28, 2018

Redesign, hm? Just add the field. That won't change anything about Method.

@timholy
Copy link
Member Author

timholy commented Aug 28, 2018

I'm just envisioning changes sprinkled throughout src/ (e.g., dump.c) but perhaps I'm being overly pessimistic. I'll give it a whirl.

@vtjnash
Copy link
Member

vtjnash commented Aug 28, 2018

Mostly just need to revert a529199#diff-ab8c4a99930a43168794a7bd3b8bfa91

println(linfo.def)
println(typeof(linfo.def))
println(typeof(m.source))
println(m.source)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a make debug, this debugging output produces lots of lines like (nb I've manually shortened the arrays)

print(Core.CoreSTDOUT, Type{Core.MethodInstance})
Core.print(...)
Method
Array{UInt8, 1}
Array{UInt8, (66,)}[0x00, 0x03, 0x00, 0x00, 0x00, ...]
show(Core.CoreSTDOUT, Type{Core.MethodInstance})
Core.show(...)
Method
Array{UInt8, 1}
Array{UInt8, (119,)}[0x00, 0x03, 0x00, 0x00, ...]

which all seem fine.

Then we get to the fatal error

validate_code_in_debug_mode(Core.MethodInstance, Core.CodeInfo, String)
Core.Compiler.validate_code_in_debug_mode(...)
Method
Core.MethodInstance
Array{UInt8, (452,)}[0x00, 0x07, 0x00, 0x00, ...]
error during bootstrap:
LoadError(at "compiler/compiler.jl" line 3: LoadError(at "compiler/bootstrap.jl" line 8: MethodError(f=typeof(Core.Compiler.copy_code_info)(), args=(Array{UInt8, (452,)}[0x00, 0x07, 0x00, 0x00, ...],), world=0x0000000000000e5d)))

Notice that m.source is a MethodInstance, which shouldn't happen and is inconsistent with the actual value that is printed. Interestingly, you can explain this if this is an off-by-one bug in the field offset and you're picking up unspecialized instead. Since I added a field to method_t, this seems plausible. Know of any places where field offsets are likely to be hard-coded?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it:

jl_svecset(jl_method_type->types, 10, jl_method_instance_type);

Just happened to stumble across it while searching.

@timholy timholy force-pushed the teh/fix_28899 branch 2 times, most recently from aae182b to f0b56a8 Compare August 30, 2018 09:58
@timholy
Copy link
Member Author

timholy commented Aug 30, 2018

OK, ready for another round of review.

@timholy
Copy link
Member Author

timholy commented Aug 31, 2018

...and now with bonus features (aka, passing tests even with FORCE_ASSERTIONS=1 😉)

Is it possible to run this against the package ecosystem and see whether adding that field to Method breaks anything? Or does that happen only when someone tries to do the backport?

@quinnj
Copy link
Member

quinnj commented Aug 31, 2018

I think w/ @Keno 's new PkgEval it should be possible, though I'm not sure how much we worry about the number packages that actually pass tests currently. I think we have a critical mass that a PkgEval run would given decent signal.

@ararslan
Copy link
Member

ararslan commented Sep 2, 2018

FWIW NewPkgEval takes less time to run than any of our current CI builds.

@timholy
Copy link
Member Author

timholy commented Sep 2, 2018

How does one run it? If there's a bot it's one of the many discussions I've missed.

@ararslan
Copy link
Member

ararslan commented Sep 2, 2018

How does one run it?

The "one" must be @Keno and he must be brought a sacrificial offering.

@KristofferC KristofferC mentioned this pull request Sep 3, 2018
@timholy
Copy link
Member Author

timholy commented Sep 3, 2018

he must be brought a sacrificial offering

OK, that's easy: I'll find a rare but serious bug that is triggered by a combination of LLVM, our inlining engine, and 32bit that nevertheless can't be fixed without rewriting both Julia's compiler and all of LLVM. That should keep him happy for at least 3 days.

@timholy
Copy link
Member Author

timholy commented Sep 10, 2018

@vtjnash, any chance you can take a second look at this? Revise has started working around this bug whenever it can, but we should still fix the core problem. I think this is ready to merge, but I won't do so without your approval.

@vtjnash vtjnash merged commit 513db98 into master Sep 10, 2018
@vtjnash vtjnash deleted the teh/fix_28899 branch September 10, 2018 18:38
KristofferC pushed a commit that referenced this pull request Sep 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants