-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dispatch speedups #21760
dispatch speedups #21760
Conversation
Would these benchmarks be suitable to add to Nanosoldier? |
Yes, I don't see why not. Let's add them. |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Ping @vtjnash --- what do you think? |
This feels unlikely to be valid, but I'll have to work a bit harder to understand it to see if I can create a good counterexample. |
It actually ended up being really simple. First I restored using either 0 or 1 as the offset based on whether the function type has parameters. Hopefully that's uncontroversial. Next, I observed that the |
This is true, but actually irrelevant. The question is whether qualifying for the split guarantees it is more specific than anything in the sorted list. The |
Yes, I did try to make this work more generally at any offset, but couldn't get it working since I didn't want to add another field to typemaps. I'm not sure we want to add another field, especially if this is backported to 0.6 which would be nice. |
Bump, would be nice to get this in 0.6 if it is determined to be sound. |
@vtjnash bump. Maybe we can put this on master and see how it goes? |
I'm still fairly certain that aaaf9f4 violates specificity ordering. The other two commits look ok.
Since you're using one field to mean two different things, the question isn't whether we add another field to disambiguate those, it's whether it would be valid to remove the |
I'll separate the two safer commits and then explore this. |
Ok, I came up with a different approach that's simpler and more general. I changed the meaning of the
Those cases should be correctly ordered, most specific first. If this checks out, the field should perhaps be renamed |
74a9a44
to
4146abc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does seem tantalizingly close to being usable. But as you noted in the code comment, it doesn't work because of Vararg. Is it going to be possible to resolve that?
Also, can you split out the other commits into a new PR. As much as I love re-reviewing code, it does get a bit hard to keep track of which parts of the diff are relevant to the new proposed optimization.
src/typemap.c
Outdated
assert(offs != lastleaf); | ||
if (tparams->unsorted) { | ||
// if we couldn't split on this offset but can split on a later one, skip this slot. | ||
// Only do this for sorted maps, since there are cases where an apparently non-leaf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsorted
src/typemap.c
Outdated
jl_value_t *ttypes = jl_unwrap_unionall((jl_value_t*)types); | ||
int offs, l = jl_field_count(ttypes); | ||
|
||
for(offs = l-1; offs >= 0; offs--) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formatting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's wrong with it?
src/gf.c
Outdated
@@ -126,7 +126,7 @@ const struct jl_typemap_info method_defs = { | |||
0, &jl_method_type | |||
}; | |||
const struct jl_typemap_info lambda_cache = { | |||
0, &jl_method_instance_type | |||
1, &jl_method_instance_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, when reviewing earlier I mistook which cache this was and thought it was the tfunc / specializations cache. The lambda (method table) cache requires sorting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. but it surely doesn't require sorting in the same sense --- this doesn't cause any test failures. We might need another flag telling the typemap that only simple signatures will be inserted (i.e. the stuff we put in method caches, basically leaf types and Any
s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The typemap figures that out on its own already and by-passes sorting when not applicable. I'm not too surprised that our tests don't manage to hit this case – we have a few other sorting bugs that I suspect we would usually hit first, limiting our ability to demonstrate the sorted-ness of this cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe such a flag would enable the optimization I'm attempting here. It's only safe to split on the second argument of (::AbstractThing, ::Int)
if we know (::AbstractThing{N}, ::Vararg{Int,N})
is not going to exist.
Just look at the newest commit ( I think the solution is to use this trick only for unsorted maps (EDIT: or method caches, which use a highly restricted set of types). Method caches are far, far more performance-critical than sorted method lists, and I highly doubt they will ever be able to handle anything as complex as |
This restores the ability to start splitting typemaps at either argument 0 or 1, depending on whether the function type has parameters.
@vtjnash : I removed the change to the |
Uses the `any` cache to skip all non-leaf slots when a later slot is splittable.
return 0; | ||
return jl_typemap_visitor(cache.node->any, fptr, closure); | ||
return jl_typemap_node_visitor(cache.node->linear, fptr, closure); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't change the visitation order unless you know that jl_typemap_info->simplekeys value is set. This method defines the sort order for MethodTable.
return 0; | ||
return jl_typemap_intersection_visitor(map.node->any, offs+1, closure); | ||
return jl_typemap_intersection_node_visitor(map.node->linear, closure); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
@@ -448,7 +448,7 @@ typedef struct _jl_typemap_level_t { | |||
struct jl_ordereddict_t arg1; | |||
struct jl_ordereddict_t targ; | |||
jl_typemap_entry_t *linear; // union jl_typemap_t (but no more levels) | |||
union jl_typemap_t any; // type at offs is Any | |||
union jl_typemap_t any; // type at offs is skipped; will be split later |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're changing the meaning of this field, can you reorder this list to show the sort order and rename it something else.
I think I would still like to keep the any
field also, as a possible means of splitting Method tables for incremental deserialization. Scanning these tables is currently almost all of the cost there. There's fairly few instances of this type (probably a couple thousand tops), and they're already usually expected to be gigantic (several hundred to several thousand bytes).
@@ -1056,7 +1068,7 @@ jl_typemap_entry_t *jl_typemap_insert(union jl_typemap_t *cache, jl_value_t *par | |||
newrec->isleafsig = newrec->issimplesig = 0; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well add an executable assertion on the meaning of simplekeys
here:
assert((!simplekeys || newrec->issimplesig) && "bad insert")
Bump? I agree, this speedup would be nice :) |
Unless this is blocking something, I think we should wait until post 1.0 to merge this. |
Test script: https://gist.github.com/JeffBezanson/a5f4abd6f093795f7d8c41501fb94d8b
release-0.6:
This PR:
The first result in the list is probably just luck from changing the order of the table. Speeding up
convert
dispatch is not really solved here; I managed to come up with a hack that only works for constructors so far.The fix for #21370 sacrificed 0-argument constructors (#21730) and functions whose types have parameters. This hopefully fixes that, while keeping the fix for #21370.
@nanosoldier
runbenchmarks(ALL, vs=":master")