Reflection free code gen #12960

Merged
merged 24 commits into from
Aug 22, 2022

dsyme
Contributor

@dsyme dsyme commented Apr 8, 2022

There is a strong desire in the .NET and F# community to set up .NET and F# to be more friendly to native compilation tool chains and "tree shaking" compilation.

See also #12819 and #11891

Historically this has been a tricky area for both F# and .NET generally; some of the history is mentioned here: fsharp/fslang-suggestions#919. Looking forward, there are recent strong efforts in this area led by @jkotas and others.

Now, this is an area where "small but deadly" things can kill you - one generated construct that a native toolchain can't handle, or one reference through to some large chunk of library code that gets linked in. Problems we definitely know of are:

  1. F# currently implicitly emits implementations of ToString() for records, unions and structs (also get_Message() for exception definitions) and these in turn use sprintf with %A formatting. F# also emits DebuggerDisplayAttribute for union types that do likewise.

  2. Users may use %A formatting

  3. Users may use quotations (either implicitly or explicitly)

  4. The user may end up using libraries that use these constructs.

The proposal in fsharp/fslang-suggestions#919 is to address these by adding a command-line option --reflectionfree that adjusts F# code generation to avoid generating the problematic constructs, and to give an error if %A is used.

Note that (1) is not a problem in FSharp.Core as the code generation is already explicitly suppressed in that case

(2) is potentially a problem - this PR removes a few explicit uses of %A in FSharp.Core

(3) is probably better dealt with by the user simply avoiding quotations, but not banning their use outright.

Note that using this option would affect both the ToString semantics (no implicit implementation is given) and the debugging experience
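To illustrate the ToString point, here is a minimal sketch of what user code might do when no implicit implementation is generated: override ToString() by hand and avoid sprintf with %A. The Point type and its output format are hypothetical, chosen only for this example.

```fsharp
// Hypothetical record type, used only for illustration.
type Point =
    { X: int; Y: int }
    // Explicit override built from plain string concatenation,
    // so nothing here goes through sprintf "%A".
    override this.ToString() =
        "Point(" + string this.X + ", " + string this.Y + ")"

let p = { X = 1; Y = 2 }
let text = p.ToString()
```

The same approach would apply to unions and exception definitions, at the cost of writing the formatting code yourself.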

  • try it
  • check the uses of reflection in FSharp.Core where we look at attributes of types to determine equality semantics
  • check uses of reflection in FSharp.Core related to dynamic implementations of some operators.

@kerams
Contributor

kerams commented Apr 8, 2022

Note that using this option would affect both the ToString semantics (no implicit implementation is given)

Could this effort be paired with emitting specialized ToString methods and fsharp/fslang-suggestions#1108? Or one step at a time? :)

@dsyme
Contributor Author

dsyme commented Apr 8, 2022

@jkotas writes

We have a section on trimming and NativeAOT in our regular preview announcement blog post. I think we should mention this as a good example of how the .NET ecosystem is approaching trimming and AOT compatibility.

We depend heavily on IL static analysis and annotations to detect reflection patterns that may break with AOT or the linker. It is documented in https://docs.microsoft.com/en-us/dotnet/core/deploying/trimming/prepare-libraries-for-trimming . We have the static analysis implemented both as a Roslyn analyzer (not applicable for F#) and as part of the trimmer/AOT compiler. You should be able to verify that you got all problematic reflection patterns covered by getting zero static analysis warnings when compiling sample F# apps.

@vzarytovskii
Member

Note that using this option would affect both the ToString semantics (no implicit implementation is given)

Could this effort be paired with emitting specialized ToString methods and fsharp/fslang-suggestions#1108? Or one step at a time? :)

I'd personally keep these as separate features/contributions.

@agocke
Member

agocke commented Apr 8, 2022

This is really cool! If you want to verify that your app is safe for AOT and trimming, the linker can produce trim warnings whenever there is code that is not statically understood. Instructions are available at https://docs.microsoft.com/en-us/dotnet/core/deploying/trimming/trim-self-contained

Looking forward, I think the biggest trimming concern that I would have in F# is serialization. Unfortunately, most serialization patterns are simply too complex to statically annotate. The recommended substitution is to use a source-generated serializer, like the JSON source generators or my serde-dn library. Unfortunately, those are reliant on C# source generators in their implementation (serde-dn works by having the source generator implement the serialization interfaces automatically), so some way of providing similar functionality in F# would be very interesting.

@baronfel
Member

baronfel commented Apr 8, 2022

The clearest parallel for F# to source generators IMO are generative Type Providers, which are compiler-hosted components that emit code mid-compilation. There are some rough edges around the UX for them, and debugging the generated code as you are writing one of them can be a little gnarly, but the overall mechanism is pretty well understood.

@agocke
Member

agocke commented Apr 8, 2022

Yup, I'll just speak for myself here since I don't know exactly what the flexibility around JSON source generation is: serde-dn operates fundamentally around a series of interfaces that ensure type safety of serialization, ease of customization, and flexibility of format. In general, the preferred pattern for types that appear in source is to implement the appropriate interfaces on the types directly. There are always cases where you can't implement an interface directly (like on a type you don't own) and there is a system for accommodating that (generating wrapper structs), but the model really prefers implementing interfaces directly when possible.

My understanding is that type providers wouldn't provide that capability, but I might be out of date here.

@dsyme
Contributor Author

dsyme commented Apr 10, 2022

Looking forward, I think the biggest trimming concern that I would have in F# is serialization.

Sort of.

One important thing to understand is how FSharp.Data achieves strongly typed data access (including serialization/de-serialization) without using reflection or code generation. This is quite counter-intuitive because the erasing/façade techniques are not possible in C#, so it can be hard to grok from the C# perspective (or indeed from any language that doesn't support either erased types or type providers).

In detail - when FSharp.Data is pointed at an external CSV, JSON or XML schema it is entirely reflection-free at runtime. It also does not really do source generation. How does this work?

  • At compile time, the schema is used to provide a façade of erased types. These give the appearance and effect of strong typing to F#.

  • At runtime these are erased to heterogeneous underlying objects, e.g. property bags like JsonValue.

  • At compile time, each individual strongly typed access to the provided object model (e.g. foo.Name) is erased to become an equivalent access on the underlying representation (e.g. foo.GetJsonValue("name") or foo.["name"] or whatever)

  • At runtime there is no reflective access on (de-)serialization. Instead (de-)serialization produces/consumes the heterogeneous representation, e.g. JsonValue. The façade types completely disappear at runtime.

The equivalent C# feature would be source generators with a class erased XYZ : Representation { ... } where XYZ is reduced to Representation at compile time and disappears at runtime.
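The erasure steps above can be sketched by hand. This is only an illustration under stated assumptions: the Json type below is a hypothetical stand-in for a representation like JsonValue, and getString is a hypothetical helper, not FSharp.Data's API. It shows the kind of plain lookup that a typed access such as person.Name could erase to, with no reflection involved.

```fsharp
// Hypothetical stand-in for a heterogeneous representation such as JsonValue.
type Json =
    | JString of string
    | JNumber of float
    | JRecord of Map<string, Json>

// Look up a field on a record value; None for missing fields or non-records.
let tryField (key: string) (value: Json) : Json option =
    match value with
    | JRecord fields -> Map.tryFind key fields
    | _ -> None

// What a typed access like `person.Name` could erase to:
// an ordinary lookup on the underlying property bag.
let getString (key: string) (value: Json) : string =
    match tryField key value with
    | Some (JString s) -> s
    | _ -> failwith ("missing or non-string field: " + key)

let person = JRecord (Map.ofList [ "name", JString "Ada"; "age", JNumber 36.0 ])
let name = getString "name" person
```

With a real erasing type provider the compiler would insert the equivalent of the getString call for you, so user code keeps the typed appearance while the runtime only ever sees the underlying representation.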

Erasing type providers have some issues (no runtime types, have to be careful across multiple assemblies), but being reflection-free-at-runtime and tree-shaker friendly is not intrinsically one of them. It will still be important to make sure that FSharp.Data is really tree-shaker friendly in practice, and this will need to be put under test. The main practical problem is that these currently require an explicit schema (normally via a sample) and so can't easily be used for scenarios where the schema is implied by the types defined in the current assembly (see fsharp/fslang-suggestions#212).

Note that erasing isn't critical in achieving the above - a generative type provider can do the same - but it's harder work because the code to convert from the heterogeneous representation to the type-specific one must be generated, whereas this is a no-op for erased provided types.

To summarise, I think people mistakenly believe that FSharp.Data is doing reflection at runtime - it isn't.

Anyway, realistically I'd expect it to play out like this:

  • Short term:

    1. Most F# users who care about trimming down (relatively rare) will use FSharp.Data. Any issues with that approach get ironed out, and perhaps new data formats are added.
    2. Some other type providers like SwaggerProvider may be rewritten to remove reflection on (de-)serialization.
    3. A small number develop F#-specific code generation for serializers, e.g. Myriad.
    4. A small number write hand-written serialization code using combinators, which is pretty pleasant in F# and is a good way of "owning your own destiny"
    5. A small number use C# source generators in C# projects over a C# object model.
  • Medium term:

    1. The above techniques mature
    2. F# type providers get adapted to allow them to work over types defined in the current assembly, which is approved in principle but a substantial ticket. We'd then expect F#-specific libraries like FSharp.Data to be extended to handle deriving schema from type definitions.
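As a rough sketch of the "hand-written serialization code using combinators" option from the short-term list: every name below (Encoder, encString, encInt, encList, encObject, Person) is hypothetical and not taken from any existing library, and the string escaping is deliberately minimal.

```fsharp
// Hypothetical mini encoder library: an Encoder<'T> is just a function to JSON text.
type Encoder<'T> = 'T -> string

// Only escapes backslashes and double quotes; a real encoder needs more.
let encString : Encoder<string> =
    fun s -> "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\""

let encInt : Encoder<int> = fun i -> string i

// Combinator: lift an encoder for items to an encoder for lists.
let encList (item: Encoder<'T>) : Encoder<'T list> =
    fun xs -> "[" + String.concat "," (List.map item xs) + "]"

// Build an object from already-encoded field values.
let encObject (fields: (string * string) list) : string =
    let body =
        fields
        |> List.map (fun (k, v) -> encString k + ":" + v)
        |> String.concat ","
    "{" + body + "}"

type Person = { Name: string; Tags: string list; Age: int }

// The encoder is written once per type, by hand, with no reflection.
let encPerson : Encoder<Person> =
    fun p ->
        encObject
            [ "name", encString p.Name
              "age", encInt p.Age
              "tags", encList encString p.Tags ]

let json = encPerson { Name = "Ada"; Tags = [ "a"; "b" ]; Age = 36 }
```

Because each encoder is an ordinary function, a trimmer can see every call statically, which is the sense in which this approach lets you "own your own destiny".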

@dsyme
Contributor Author

dsyme commented Apr 10, 2022

...generative type providers...

TBH I think erased type providers are also pretty feasible - see notes above on FSharp.Data today.

Co-authored-by: Ilja Nosik <ilja.nosik@outlook.com>
@agocke
Member

agocke commented Apr 11, 2022

@dsyme Very cool. Yup, FSharp.Data looks like it should work just fine with trimming.

@dsyme
Contributor Author

dsyme commented Jun 14, 2022

Linking this: fsharp/fslang-suggestions#919 (comment)

@vzarytovskii vzarytovskii added this to the August-2022 milestone Aug 2, 2022
@0101 0101 self-assigned this Aug 4, 2022
@dsyme
Contributor Author

dsyme commented Aug 16, 2022

@0101 asked what next steps should be here

I think the first thing to determine is "does it matter if some things in FSharp.Core use reflection, and what should we do about it"?

Specific things

  • Is ty.GetCustomAttributes(typeof<CompilationMappingAttribute>, false) etc. used in FSharp.Core considered "reflection-free" for the purposes of trimming, .NET native etc.? I'm hopeful that basic reflection on the custom attributes of .NET types is permitted, that is, that it doesn't count as "unreferenced code". If not, we have a significant problem because the semantics of F# generic equality rely on being able to do this kind of reflection (that is, we need to know a few bits of information about each type definition we encounter). It's not totally the end-of-the-world if it is a problem, but it would mean that trimmed code could change semantics in subtle ways.

  • Should we be sprinkling RequiresUnreferencedCodeAttribute over the things that use reflection constructs in a way that kills trimming? These include:

    • Printf.printfn and friends

    • ExtraTopLevelOperators.printfn and friends

    • FSharp.Reflection.*

    • FSharp.Quotations.*

    • FSharp.Linq.*

    • ExtraTopLevelOperators.query

  • Do the FSharp.Core entry points leading to UnaryDynamicImpl and BinaryDynamicImpl get trimmed out if they're never actually needed? These are used for the "dynamic" implementations of FSharp.Core.Operators.(+) but because of inlining we don't actually emit calls to these in F# code except when generating the further dynamic implementations of generic inlined code that calls these, e.g. let inline f x = x + x generates a call to FSharp.Core.Operators.(+) in the entry point we emit - but in turn that is only actually ever used if you reflect on f or take a quotation involving f.

  • Should we be putting RequiresUnreferencedCodeAttribute on FSharp.Core.Operators.(+) and friends for use by analysis tooling - again, the "usual" use of these operators in F# gets inlined and flattened, and the attributes would only be there for the reflective code emitted in codegen.

  • Is Roslyn in any way aware of RequiresUnreferencedCodeAttribute or any of this stuff?

  • What analyzer support does C# have by default for all this stuff? We don't have to implement it, but it would be good to know what we're missing.

  • I did see some related issues where F# folk are starting to mark stuff as trimmable:

    This makes me wonder: should we be marking FSharp.Core as trimmable in the same way?

Overall it would be great to get community and expert advice on this, and also get an understanding for what will go wrong if it doesn't work.
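For reference, the custom-attribute lookup asked about in the first bullet looks roughly like the sketch below. This is not the actual FSharp.Core code; the Sample type and the sourceConstructKind helper are hypothetical, and only the attribute-reading call itself comes from the comment above.

```fsharp
open Microsoft.FSharp.Core

// Hypothetical record type to inspect.
type Sample = { Id: int }

// Read CompilationMappingAttribute from a type and return the kind of
// F# construct it was compiled from (record, union, and so on), if any.
let sourceConstructKind (ty: System.Type) : SourceConstructFlags option =
    ty.GetCustomAttributes(typeof<CompilationMappingAttribute>, false)
    |> Array.tryHead
    |> Option.map (fun attr ->
        let flags = (attr :?> CompilationMappingAttribute).SourceConstructFlags
        // Mask off modifier bits such as NonPublicRepresentation.
        flags &&& SourceConstructFlags.KindMask)

let recordKind = sourceConstructKind typeof<Sample>
let intKind = sourceConstructKind typeof<int>
```

Types that were not compiled from F# source constructs, such as System.Int32, carry no such attribute, so the helper returns None for them.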

@vzarytovskii
Member

I guess there are several parts of this - reflection-free codegen by the compiler itself ("%A", quotations, dynamic implementations, etc.), reducing/reviewing reflection in FSharp.Core and making FSharp.Core adjustments for easier trimmability.

@dsyme
Contributor Author

dsyme commented Aug 16, 2022

I guess there are several parts of this - reflection-free codegen by the compiler itself ("%A", quotations, dynamic implementations, etc.), reducing/reviewing reflection in FSharp.Core and making FSharp.Core adjustments for easier trimmability.

Yes - and note we can also bank this PR and move on to FSharp.Core and actual trim testing next - I think we're satisfied with it?

@dsyme dsyme changed the title [WIP] Reflection free code gen Reflection free code gen Aug 16, 2022
@jkotas
Member

jkotas commented Aug 16, 2022

Is ty.GetCustomAttributes(typeof<CompilationMappingAttribute>, false) etc. used in FSharp.Core considered "reflection-free" for the purposes of trimming,

Reading custom attributes is compatible with trimming and AOT compilation.

(I would not call it "reflection-free". Custom attributes are reflection by design.)

Should we be sprinkling RequiresUnreferencedCodeAttribute over the things that use reflection constructs in a way that kills trimming,

RequiresUnreferencedCodeAttribute allows static analyzers to provide better diagnostic messages.

For example, if you do not sprinkle RequiresUnreferencedCodeAttribute over Printf.printfn and somebody tries to trim an application that calls Printf.printfn, they will see warnings about trim-unfriendly APIs being called somewhere inside the FSharp.Core implementation. It won't be straightforward for them to figure out that these warnings are caused by a call to Printf.printfn.

If you sprinkle RequiresUnreferencedCodeAttribute over Printf.printfn and somebody tries to trim an application that calls Printf.printfn, they will see a warning about Printf.printfn itself, making the problem much more obvious.
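A minimal sketch of what such an annotation could look like in F#, assuming a target framework of .NET 5 or later (where System.Diagnostics.CodeAnalysis.RequiresUnreferencedCodeAttribute is available). The Diagnostics type, its Dump method and the message text are hypothetical and are not how FSharp.Core is annotated.

```fsharp
open System.Diagnostics.CodeAnalysis

type Diagnostics =
    // Hypothetical helper that reflects over the properties of its argument.
    // The attribute lets trim analysis report a warning at each call site
    // instead of somewhere deep inside the implementation.
    [<RequiresUnreferencedCode("Dump reflects over properties, which may be removed by trimming.")>]
    static member Dump(value: obj) : string =
        value.GetType().GetProperties()
        |> Array.map (fun p -> p.Name)
        |> String.concat ","

// The annotation is ordinary metadata and can be inspected like any other attribute.
let annotations =
    typeof<Diagnostics>
        .GetMethod("Dump")
        .GetCustomAttributes(typeof<RequiresUnreferencedCodeAttribute>, false)
```

The attribute does not change runtime behavior; it only gives analysis tooling a precise place to attach the warning.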

Is Roslyn in any way aware of RequiresUnreferencedCodeAttribute

Roslyn itself is not aware of RequiresUnreferencedCodeAttribute.

What analyzer support does C# have by default for all this stuff?

We have a Roslyn analyzer that checks RequiresUnreferencedCodeAttribute and friends in the code that you are compiling from source. This analyzer is good for the dev inner loop. You would have to reimplement this analyzer for F#.

The IL linker and AOT compiler have the same analyzer, which works on the IL of the whole app. You will get this analyzer for free for F#.

@dsyme
Contributor Author

dsyme commented Aug 16, 2022

@jkotas Thank you!!

It makes me curious if some of FSharp.Reflection.FSharpType and FSharp.Reflection.FSharpValue will also work.

@jkotas Is MakeGenericType considered unreferenced code in all circumstances?

@vzarytovskii
Member

I'd say we should merge this first set of changes and then continue working on corelib from there.

Labels
Area-Compiler-CodeGen IlxGen, ilwrite and things at the backend Needs-RFC
Projects
Archived in project

8 participants