Proposal for high-performance codegen-less Reflection factory APIs #23716
Comments
Something I forgot to mention above - there's also a consideration for making the existing Reflection APIs faster. However, this requires further thought, and there is likely only so much we can do because each invocation of the APIs would need to perform both setup and invocation. We're also not able to change the observable behavior of the existing Reflection APIs. |
Sounds promising! I need to think more about how this fits all of our uses in Orleans (Serialization, RPC, Activation). This current proposal wouldn't be able to replace our existing codegen for creating proxy objects (where an interface whose methods return Will the field setters work on EDIT: currently the CLR seems to be relatively loose about how it enforces |
How about
class Ref
{
int[] _items;
public ref int First => ref _items[0];
}
I want APIs like the following:
public delegate ref TResult RefFunc<TArg, TResult>(TArg arg);
public static RefFunc<object, TProperty> CreateRefPropertyGetter<TProperty>(PropertyInfo propertyInfo); |
Unless there's a glaring technical reason why it can't be done, shouldn't there be a mass of generic overloads for
public static Func<TInstance, T1, T2, TReturn> CreateMethodInvoker<TInstance, T1, T2, TReturn>(MethodInfo methodInfo);
Useful for situations where you know a given type has a method that conforms to a signature, but it doesn't have an interface you can use to access it directly. At least until Shapes are ready :) |
The proposal so far shows an API that looks convenient to use. I like it, especially as it would remove human error from the equation. Two questions:
|
Rather than a "mass of generics", why not just:
|
I'm assuming (and it would be good to be explicit) that other old-style behaviors we don't want to support are:
From the comment on Also, I'd find it more convenient for these to be extension methods over the |
I suspect |
The Event and Property cases just boil down to retrieving the correct accessor method and calling |
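As a rough illustration of "retrieve the accessor method, then bind a delegate to it", here is a minimal sketch that uses the existing `Delegate.CreateDelegate` API rather than the proposed factory. The `Person` type and the `AccessorDelegates` helper are hypothetical names invented for this example, and the sketch only covers properties declared on reference types.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
public sealed class Person
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public static class AccessorDelegates
{
    // Binds the property's get accessor to an open-instance delegate,
    // so each call is a plain delegate invocation with no boxing of TValue.
    public static Func<TTarget, TValue> CreateGetter<TTarget, TValue>(PropertyInfo property)
        where TTarget : class
    {
        MethodInfo getter = property.GetGetMethod(nonPublic: true)
            ?? throw new ArgumentException("Property has no get accessor.", nameof(property));
        return (Func<TTarget, TValue>)Delegate.CreateDelegate(typeof(Func<TTarget, TValue>), getter);
    }

    // Same idea for the set accessor.
    public static Action<TTarget, TValue> CreateSetter<TTarget, TValue>(PropertyInfo property)
        where TTarget : class
    {
        MethodInfo setter = property.GetSetMethod(nonPublic: true)
            ?? throw new ArgumentException("Property has no set accessor.", nameof(property));
        return (Action<TTarget, TValue>)Delegate.CreateDelegate(typeof(Action<TTarget, TValue>), setter);
    }
}
```

A caller would create the delegate once, for example `AccessorDelegates.CreateGetter<Person, int>(typeof(Person).GetProperty("Age"))`, and reuse it for every instance.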
This is an area where I play a lot. I've been down this road many times, and have switched "engines" many times - it is very time consuming to do so. For me, frankly the "real" answer here is to get better compile-time codegen tooling - so that our libraries hook into the build chain painlessly and emit appropriate code then, without consumers needing to jump through magic hoops and arcane incantations. In the absence of that... well, I can kinda see some benefit for greenfield scenarios, but except for the full and proper compile-time emit, personally I wouldn't feel overly compelled to try to change engine another time on an existing library. If this is a suggestion for a new MS / corefx API: frankly I'd much rather that time was spent giving us compile-time codegen. Same target scenario, better (IMO) result. Just my tuppence. |
Additionally: unless I'm mistaken, everything here is already possible via "expression trees" - which IIRC spoof |
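For readers unfamiliar with the expression-tree route mentioned here, the following is a minimal sketch of a typed field getter built with `System.Linq.Expressions`. The `Sample` type and `ExpressionAccessors` helper are hypothetical names for this example. The `preferInterpretation` flag is the existing `Compile(bool)` overload, which asks the runtime to interpret the tree instead of compiling it to IL; how fast that interpreted path is on a given runtime is not something this sketch claims.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical type used only for this illustration.
public sealed class Sample
{
    public int Count = 7;
}

public static class ExpressionAccessors
{
    // Builds target => target.<field> as an expression tree and turns it into a delegate.
    // Only instance fields are handled in this sketch.
    public static Func<TTarget, TField> CreateFieldGetter<TTarget, TField>(
        FieldInfo field, bool preferInterpretation = false)
    {
        ParameterExpression target = Expression.Parameter(typeof(TTarget), "target");
        MemberExpression access = Expression.Field(target, field);
        Expression<Func<TTarget, TField>> lambda =
            Expression.Lambda<Func<TTarget, TField>>(access, target);
        return lambda.Compile(preferInterpretation);
    }
}
```

Passing `preferInterpretation: true` is one way to exercise the non-compiled path on a runtime that would otherwise emit IL.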
additional additional: The API exposed is too basic and simplistic. It isn't sufficient to just provide delegates that implement property accessors. That's enough for casual usage, but so is regular unoptimized reflection :) Given your stated audience, the typical scenario is much closer to "emit a complex single method body that accesses 12 properties on 4 inputs, performs a series of complex operations on all those things (including several custom loops), then does 3 further operations with the results - and includes exception handling". The API proposed above doesn't even begin to touch on that. And then comes the killer word: edit: oh, and support |
I've built three serializers: ZeroFormatter (original format), MessagePack for C# (binary), and Utf8Json (JSON). My company is creating mobile games for iOS/Android with Unity, so we have to support both .NET (Core) and Unity (AOT/IL2CPP).
runtime codegen
For serializer optimization, the proposed API is not sufficient.
// proposal design
// cost of outer accessor loop, cost of call delegate and can not avoid boxing.
foreach (var getterAccessor in accessors)
{
writer.Write(getterAccessor.Invoke(value));
}
// Current Utf8Json design, call member directly.
writer.WriteInt32(value.foo);
writer.WriteString(value.bar);
pre-compiled codegen
In my area (games), performance deterioration is not allowed. By the way, in my case, runtime codegen performs better than pre-compiled codegen. |
@migueldeicaza The idea is that for codegen-disallowed scenarios we’d run down a different code path that’s still significantly faster than what’s otherwise available using standard Reflection APIs. Consider creating an object using a simple parameterless ctor. In a codegen-disallowed world, we could implement this via a calli into the allocator followed by a calli into the constructor. This is basically what the newobj instruction gets JITted into anyway, so it would have similar performance to a newobj, but with the slight additional overhead of an indirection or two. |
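The allocate-then-construct split described above can be approximated from managed code today. This is only a sketch of the two-step shape, not the calli-based implementation the comment has in mind: it uses `RuntimeHelpers.GetUninitializedObject` for the allocation and ordinary reflection to run the constructor on the already-allocated instance, so it does not have the performance characteristics being discussed. `Widget` and `TwoStepActivator` are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical type used only for this illustration.
public sealed class Widget
{
    public int Id { get; }
    public Widget() { Id = 42; }
}

public static class TwoStepActivator
{
    public static object Create(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes)
            ?? throw new MissingMethodException(type.FullName, ".ctor");

        // Step 1: allocate zeroed memory for the object without running any constructor.
        object instance = RuntimeHelpers.GetUninitializedObject(type);

        // Step 2: run the parameterless constructor against that existing instance.
        ctor.Invoke(instance, null);
        return instance;
    }
}
```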
@neuecc "but pre-compiled can not." well, there's only two to choose from... it probably wouldn't hurt badly to emit both; just a consideration from someone who feels the same pain points |
@neuecc Thanks for the insight into your scenario! I want to point out that one assumption you had is incorrect; these APIs do not require values to be boxed if you really want to avoid that. There are overloads that take and return non-object. You’d still incur the cost of the delegate indirection once per member instead of once per type, however. This shouldn’t show up as too bad a thing in profiler runs considering things like String-to-UTF8 conversion are far heavier than a simple virtual dispatch. |
@GrabYourPitchforks yes, but my sample assume
foreach (var writeAction in writeActions)
{
// writeAction that creates by ExpressionTree(? how to create?) uses Func<TTarget, TField> accessor
writeAction(writer, value);
} |
// TContainer = type that contains the properties / fields to serialize
static class SerializationFactories<TContainer> {
public static Action<Writer, TContainer> CreateSerializerForField(FieldInfo fieldInfo) {
if (fieldInfo.FieldType == typeof(int)) {
return CreateIntSerializer(fieldInfo);
} else if (fieldInfo.FieldType == typeof(string)) {
return CreateStringSerializer(fieldInfo);
} else {
return CreateObjectSerializer(fieldInfo);
}
}
private static Action<Writer, TContainer> CreateIntSerializer(FieldInfo fieldInfo) {
Utf8String fieldNameAsUtf8 = ...;
Func<TContainer, int> getter = ReflectionServices.CreateFieldGetter<TContainer, int>(fieldInfo);
return (writer, @this) => writer.WriteInt(fieldNameAsUtf8, getter(@this));
}
private static Action<Writer, TContainer> CreateLongSerializer(FieldInfo fieldInfo) { /* ... */ }
private static Action<Writer, TContainer> CreateStringSerializer(FieldInfo fieldInfo) { /* ... */ }
private static Action<Writer, TContainer> CreateObjectSerializer(FieldInfo fieldInfo) {
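// The core method is private and static, so binding flags are required to find it.
return (Action<Writer, TContainer>)typeof(SerializationFactories<TContainer>).GetMethod("CreateObjectSerializerCore", BindingFlags.NonPublic | BindingFlags.Static).MakeGenericMethod(fieldInfo.FieldType).Invoke(null, new object[] { fieldInfo });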
}
private static Action<Writer, TContainer> CreateObjectSerializerCore<TField>(FieldInfo fieldInfo) {
Utf8String fieldNameAsUtf8 = ...;
Func<TContainer, TField> getter = ReflectionServices.CreateFieldGetter<TContainer, TField>(fieldInfo);
return (writer, @this) => writer.WriteObject<TField>(fieldNameAsUtf8, getter(@this));
}
} |
@GrabYourPitchforks This only works if the set of types you care about is finite and fixed. |
Which is the case in the majority of the scenarios addressed by this proposal. Remember: this isn't trying to replace Reflection. (If you're trying to improve Reflection all-up, just make changes directly to the Reflection APIs and ignore this proposal.) The goal of this proposal is to make certain scenarios (namely, serialization and a small handful of others) easier for library authors to write in a high-performance manner that works both in codegen-allowed and in codegen-disallowed environments. For serialization, there is only a fixed, finite set of primitive types supported by any given protocol. Consider integers, strings, possibly binary and |
I see this proposal as a start of discussion how to improve Reflection all-up - we should attempt to cover as many holes in the existing reflection APIs as possible. |
@ufcpp |
"The idea is that for codegen-disallowed scenarios we’d run down a
different code path that’s still significantly faster than what’s otherwise
available using standard Reflection APIs."
Why not just implement this improved code path in the existing "compile
unavailable" expression tree path? This gives you an established rich API
that already covers everything cited in the example API, and would improve
the performance of a wide range of existing code including expression trees
emitted directly from the compiler via IQueryable-T. "Expression trees are
now much faster even on runtimes that don't allow compilation" would be a
great release note - much better than "a very few niche folks might make
use of a new and barely tested API".
|
If we had to choose between this and better support for compile time codegen (see Marc's linked thread), I would choose better compile time codegen support. |
Expression trees have a couple of issues when used for things like serializers that I think are worth improving.
My hope is that this API will provide performance reasonably close to compiled Expressions that's also consistently pretty good across runtimes. |
@GrabYourPitchforks I accidentally have done the second part of the proposal (about properties) while I was doing dotnet/corefx#36506. |
Moving to Future - not high enough priority at this time for the 5.0 schedule. |
namespace System.Reflection {
public delegate ref TField FieldAccessor<TTarget, TField>(ref TTarget target);
public static class ReflectionServices
{
public static FieldAccessor<TTarget, TField> CreateFieldAccessor<TTarget, TField>(FieldInfo fieldInfo);
}
}
That's what I need! Currently it is impossible to read a struct field through reflection without boxing (even if I use Is this still being considered? |
@timcassell You can always use Reflection.Emit to make a field accessor delegate. It's basically just ldarg, ldfld, ret. |
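A minimal sketch of the Reflection.Emit approach suggested here, assuming a runtime where `DynamicMethod` is available (so not AOT-only environments). It reads a struct field through a by-ref parameter, which avoids boxing the struct. `Point`, `StructFieldGetter`, and `EmitAccessors` are hypothetical names for this example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical struct used only for this illustration.
public struct Point
{
    public int X;
    public int Y;
}

// The target is passed by reference, so the struct is never copied into a box.
public delegate TField StructFieldGetter<TTarget, TField>(ref TTarget target);

public static class EmitAccessors
{
    public static StructFieldGetter<TTarget, TField> CreateGetter<TTarget, TField>(FieldInfo field)
        where TTarget : struct
    {
        var method = new DynamicMethod(
            "Get_" + field.Name,
            typeof(TField),
            new[] { typeof(TTarget).MakeByRefType() },
            typeof(EmitAccessors).Module,
            skipVisibility: true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // load the managed pointer to the struct
        il.Emit(OpCodes.Ldfld, field); // read the field through that pointer
        il.Emit(OpCodes.Ret);

        return (StructFieldGetter<TTarget, TField>)method.CreateDelegate(
            typeof(StructFieldGetter<TTarget, TField>));
    }
}
```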
I do believe that is not available on AOT runtimes? |
If you're going AOT, honestly: "generators" would be a better option here. Do the magic during build instead. |
I actually want it to read from a compiler-generated state machine for an optimization (yes I know that's an implementation detail and subject to change), so generators wouldn't work for me, either. |
@timcassell It's still being considered, it just got deprioritized compared to other work for this release. |
This issue has a lot of history and still-relevant side discussions, but I think we should close this since we've been actively changing reflection internals to make it faster, and soon will be adding new APIs to continue this. The new APIs are not based on "factory" patterns although in general they are "codegen-less" unless you count internally using IL emit as codegen. The goal for 8.0 is that existing users of IL Emit, including internal implementations of System.Text.Json and DependencyInjection, can remove that code and use the built-in reflection APIs for fast constructors, invoke, and property/field access. For 7.0, we made the existing object-based Invoke() APIs 3-5x faster when IL Emit is available. We did not change Fields, however. NativeAOT also applied similar optimizations by improving thunks and applying the same internal invocation pattern based on byref-parameter spans. For 8.0 we plan on addressing fields and will add new APIs using byref-parameters through |
/cc @jkotas
Background
There are certain scenarios today - largely involving activation, serialization, and DI - where library authors perform codegen in order to perform operations on arbitrary types. The primary reason for this is performance. The standard Reflection APIs are too slow to be used in the code paths targeted by these library authors, and though codegen has a large upfront cost it performs considerably better when amortized over the lifetime of the application.
This approach generally works well, but the .NET Framework is considering scenarios where it must operate in environments which do not allow codegen. This renders ineffective the existing performance improvement techniques used by these library authors.
We are uniquely positioned to provide a set of APIs which can cover the majority of scenarios traditionally involving reflection-based codegen. The general idea is that library authors can rely on the APIs we provide to work correctly both in codegen-enabled and in codegen-disallowed environments. Alternatively, the library authors can detect at runtime whether codegen is enabled, and if so they can use their existing highly-optimized codegen logic, falling back to the new API surface if codegen is disallowed.
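The "detect at runtime whether codegen is enabled" alternative can be sketched with the `RuntimeFeature.IsDynamicCodeCompiled` property, which exists in current .NET versions but postdates this proposal. In this sketch the fallback is plain `PropertyInfo.GetValue`, standing in for whatever the proposed API surface would provide; `Order` and `GetterFactory` are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical type used only for this illustration.
public sealed class Order
{
    public decimal Total { get; set; }
}

public static class GetterFactory
{
    public static Func<object, object> Create(PropertyInfo property)
    {
        if (RuntimeFeature.IsDynamicCodeCompiled)
        {
            // Codegen path: compile (object target) => (object)((T)target).Property.
            ParameterExpression target = Expression.Parameter(typeof(object), "target");
            Expression body = Expression.Convert(
                Expression.Property(Expression.Convert(target, property.DeclaringType), property),
                typeof(object));
            return Expression.Lambda<Func<object, object>>(body, target).Compile();
        }

        // Codegen-disallowed path: fall back to standard reflection.
        return target => property.GetValue(target);
    }
}
```

Both branches return the same delegate shape, so callers do not need to know which path was taken.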
Sample API surface
Goals and non-goals
These APIs are not geared toward standard application developers who are already comfortable using the existing Reflection API surface. They are instead geared toward advanced library developers who need to perform Reflection operations in performance-sensitive code paths.
These APIs must work in a codegen-disallowed execution environment. (Are there exceptions?)
These APIs do not need to cover all scenarios currently allowed by the existing methods on MethodInfo and related types. For example, constructors that take ref or out parameters are sufficiently rare that we don't need to account for them. They can be invoked via the standard Reflection APIs.
These APIs do not need to have the same observable behavior as using the Reflection APIs; e.g., we may determine that these APIs should not throw TargetInvocationException on failure. But these APIs must provide consistent behavior regardless of whether they're running within a codegen-enabled or a codegen-disallowed environment.
Delegate creation does not need to be particularly optimized since there will be many checks performed upfront and we will ask callers to cache the returned delegate instances. However, once the delegates are created their invocation must be faster than calling the existing Reflection APIs. (Exception: if codegen is disallowed, then delegate invocation should be faster than calling the existing Reflection APIs wherever possible, and it must not be slower.)
It is an explicit goal to get serialization library authors to prefer this system over hand-rolling codegen for most member access scenarios. The selling points of this API would be ease of use (compared to hand-rolling codegen), performance, and the ability to work in a wide variety of execution environments.
It is an explicit non-goal to have performance characteristics equal to or better than a library's own custom codegen. For example, a DI system might choose to codegen a single method that both queries a service provider to get dependency instances and calls newobj on the target constructor. Such a system will always outperform these generalized APIs, but the API performance should be good enough that library authors would be generally satisfied using them over Reflection as a fallback in these scenarios.
These APIs do not need to support custom implementations of MemberInfo. Only support for CLR-backed members is required.
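Since the goals above expect callers to cache the returned delegates, here is a minimal sketch of what that caching might look like on the caller's side. The delegate factory is a stand-in that wraps `PropertyInfo.GetValue`; with the proposed API it would be replaced by the corresponding factory call. `GetterCache` and its `FactoryCalls` counter are hypothetical and exist only to make the caching observable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;
using System.Threading;

public static class GetterCache
{
    private static readonly ConcurrentDictionary<PropertyInfo, Func<object, object>> Cache =
        new ConcurrentDictionary<PropertyInfo, Func<object, object>>();

    // Counts how often the (comparatively expensive) factory actually ran.
    public static int FactoryCalls;

    // Pays the creation cost once per member; later lookups reuse the delegate.
    public static Func<object, object> GetOrCreate(PropertyInfo property) =>
        Cache.GetOrAdd(property, Build);

    private static Func<object, object> Build(PropertyInfo property)
    {
        Interlocked.Increment(ref FactoryCalls);
        // Stand-in for the real delegate factory.
        return target => property.GetValue(target);
    }
}
```

Note that `ConcurrentDictionary.GetOrAdd` may run the factory more than once under contention, which is acceptable here because delegate creation has no side effects beyond the counter.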