Feature Request: @generated structs #49187

Closed
MasonProtter opened this issue Mar 29, 2023 · 16 comments

Comments

@MasonProtter
Contributor

MasonProtter commented Mar 29, 2023

Warning: this is pretty speculative and surely quite hard to implement, but we should have an issue for it.


I think it would be very useful if we had a (sparingly-used) way of generating a struct layout based on its type parameters. This would be analogous to the way we can currently generate a function body based on its type signature. I looked around and couldn't find a pre-existing issue, but this also isn't a very searchable topic.

Here are a few example things this could be used for:

Simpler SArray, and MArray which supports non-isbits types


Implementing these would be pretty much trivial if you had a @generated struct. Something like

@generated struct SArray{Size, T, N, IsMutable} <: AbstractArray{T, N}
    fields = map(1:prod(Size)) do i
        name = Symbol(:_, i)
        :($name :: $T)
    end
    Expr(:block, Expr(:mutable, IsMutable),  fields...)
end

This would make it so that if someone wrote say

SArray{(2, 2), String, 2, true}

that would represent something like

mutable struct SArray_2_2_String_2_true
    _1 :: String
    _2 :: String
    _3 :: String
    _4 :: String
end

And unlike the current implementation of MArray, this would support setfield! on the actual fields we care about, letting us easily implement setindex!.

Even more powerfully, SArray could use this to decide that, say, SArray{1000000000, Float64, 1, false} is far too big to profitably store as an inline struct, and instead heap-allocate a vector or something similar to use as its storage.
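
For comparison, here is a rough sketch of how a flat, mutable layout can be produced in current Julia by building the struct expression by hand and calling eval. It is not the proposed feature: the type has to be defined eagerly under an explicit name rather than on demand from its parameters. The helper `define_flat_struct` and the name `Flat_2_2_String` are made up for illustration.

```julia
# Hypothetical helper: define a struct with `n` fields `_1 … _n` of type `T`.
# This only approximates what a `@generated struct` might expand to.
function define_flat_struct(name::Symbol, T::Type, n::Int; mutable::Bool = true)
    fields = [:($(Symbol(:_, i)) :: $T) for i in 1:n]
    ex = Expr(:struct, mutable, name, Expr(:block, fields...))
    eval(ex)             # defines the struct in the current module
    return eval(name)    # look the new type up again and hand it back
end

# Run at top level, so the new type and its constructor are visible.
Flat4 = define_flat_struct(:Flat_2_2_String, String, 4)
x = Flat4("a", "b", "c", "d")
setfield!(x, :_3, "z")   # per-field mutation, as the proposal wants for MArray
```

The per-field setfield! is the property the proposal is after; what eval cannot give is having `SArray{(2, 2), String, 2, true}` itself resolve to this layout.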

Having a general way to melt, mutate, and then freeze an immutable struct

@generated struct Melted{T}
    fields = map(1:fieldcount(T)) do i
        name = fieldname(T, i)
        type = fieldtype(T, i)
        :($name :: $type)
    end
    constructor = quote
        Melted(x::T) where {T} = new{T}($((:(getfield(x, $i)) for i in 1:fieldcount(T))...))
    end
    Expr(:block, Expr(:mutable, true), fields..., constructor)
end

so that one could e.g. write

Melted{Complex{Float64}}

and get a struct equivalent to

mutable struct Melted_Complex_Float64
    re::Float64
    im::Float64
    Melted_Complex_Float64(x::Complex{Float64}) = new(getfield(x, 1), getfield(x, 2))
end

People can then freely do things like e.g.

let m = Melted(big(1) // big(2))
    m.num = big(2)
    m.den = big(3)
    Rational(m)
end 

Note that this way we do not bypass the inner constructor of Rational.
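
As a rough illustration of the melt/mutate/freeze cycle, here is a hand-written stand-in for what `Melted{Rational{T}}` might look like. The names `MeltedRational`, `melt`, and `freeze` are invented for this sketch and are not part of the proposal or of any package.

```julia
# Hand-written mutable mirror of Rational's two fields (hypothetical names).
mutable struct MeltedRational{T<:Integer}
    num::T
    den::T
end

# "Melt": copy the raw fields out of the immutable value.
melt(x::Rational{T}) where {T} = MeltedRational{T}(getfield(x, :num), getfield(x, :den))

# "Freeze": rebuild through `//`, so Rational's usual normalization still runs.
freeze(m::MeltedRational) = m.num // m.den

m = melt(1 // 2)
m.num = 2
m.den = 4
r = freeze(m)   # 2//4 is reduced to lowest terms on the way back
```

Because freezing goes through the ordinary constructor path, the intermediate state 2/4 never exists as a Rational.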

Packages like Accessors.jl would not become unnecessary, but instead would have their scope reduced to intelligently dealing with the properties and constructors of a type, and this would become an additional tool in their toolkit.

Forbidding specific values from a type signature


This is maybe too frivolous a use for what would likely be quite heavy machinery, but currently we have no way to put restrictions on the values in a type signature, and we have to rely on inner constructors to reject them. That is, I can write things like

Array{Float64, -1000.1}

and this is a perfectly valid type, it just will be rejected by all of its inner constructors. Having @generated structs though could allow one to forbid people from even representing an invalid value in a type signature just like how we currently can reject invalid types in a signature:

julia> struct Blah{T <: Integer} end
           

julia> Blah{String}
ERROR: TypeError: in Blah, in T, expected T<:Integer, got Type{String}
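
A small sketch of the status quo described above, using a made-up type `Checked`: the invalid parameter value is only caught when an instance is constructed, while the type itself can still be written down.

```julia
# Hypothetical type whose parameter N is meant to be a non-negative Int.
struct Checked{N}
    function Checked{N}() where {N}
        # The only place the value of N can currently be validated.
        (N isa Int && N >= 0) || throw(ArgumentError("N must be a non-negative Int, got $N"))
        new{N}()
    end
end

bad_type = Checked{-1}   # accepted: nothing stops the type from being formed

rejected = try
    Checked{-1}()        # the inner constructor is where it finally fails
    false
catch err
    err isa ArgumentError
end
```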

Compactified structs


Take for example Unityper.jl which takes in an expression like

@compactify begin
    @abstract struct Foo
        common_field::Int = 1
    end
    struct a <: Foo
        a::Bool = true
        b::String = "hi"
    end
    struct b <: Foo
        a::Int = 1
        b::Complex = 1 + im
    end
end;

and then compactifies these structs into one concrete struct with a minimal memory layout:

julia> dump(a())
Foo
  common_field: Int64 1
  ###a###2: Int64 1
  ###Any###3: String "hi"
  ###tag###4: var"###Foo###1" ₋₃₋₁₂₉Foo₋__a₋₃₋₁₉₉₂₋₋

julia> dump(b())
Foo
  common_field: Int64 1
  ###a###2: Int64 1
  ###Any###3: Complex{Int64}
    re: Int64 1
    im: Int64 1
  ###tag###4: var"###Foo###1" ₋₃₋₁₂₉Foo₋__b₋₃₋₁₉₉₂₋₋

julia> fieldtypes(Foo)
(Int64, Int64, Any, var"###Foo###1")

This works somewhat well, but it cannot currently work if we want Foo to be a parametric type with parametric fields. With a @generated struct, we could generate a compactified layout precisely tailored to a set of parameters.


Just like @generated functions, this would be pretty heavy duty stuff that regular users shouldn't be doing, but I think it'd allow authors of serious packages to do a lot of things that currently aren't possible (and in some cases, stop them from doing some worrying pointer shenanigans that don't generalize well), so I think having this feature should be an eventual goal.

@chriselrod
Contributor

As another idea, C++ has partial template specialization, so you can define structs differently (different fields, methods, etc.) based on template parameters.

E.g., you can have a generic template struct, but then you can make partial (or full) specializations with different fields, etc.

Not quite the freedom of the @generated Melted example, but also a useful idea.

@ToucheSir

#8472 has some good discussion on this too.

@MasonProtter
Contributor Author

Aha, yeah I figured there must be an older issue on this. If desired, we can close this issue as a duplicate and I can move what I wrote to a comment in that issue. I feel like a lot has changed since those days. I wonder if @timholy or @Jutho's perspectives have changed significantly since then?

@vtjnash
Member

vtjnash commented Mar 30, 2023

There is also a package for this already (ComputedFieldTypes.jl)

@vtjnash vtjnash closed this as completed Mar 30, 2023
@MasonProtter
Contributor Author

MasonProtter commented Mar 30, 2023

@vtjnash that computes field types, but not the number of fields or things like mutability (and at the expense of adding additional parameters). That package can't solve any of the use cases I listed above.

@KristofferC
Member

There is also a package for this already (ComputedFieldTypes.jl)

@vtjnash You keep mentioning that package in similar issues to this but it just doesn't solve any problem people actually have. For example, being able to avoid the redundant last parameter (M*N) in a static matrix.

@vtjnash
Member

vtjnash commented Mar 30, 2023

That is because it is not redundant. It stores the value that the runtime needed for the struct layout computation.

@ToucheSir

To my knowledge, ComputedFieldTypes also still requires struct definitions to be static, whereas part of the motivation for this is that you could write transformed_type = generate_type(MyStruct).

@vtjnash
Member

vtjnash commented Mar 30, 2023

The number-of-fields issue is solvable with that, because Tuple exists and Tuple{} has size 0. The mutability aspects can be solved by wrapping the result in Ref and using something like Setfield.jl to make the updates a bit better. (Theoretically, the only mutable type a language needs is a single type Ref, though that is lacking significantly in ease of use.)

I am not saying this is solved fully, only that it is a duplicate issue.
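
The Ref-plus-Tuple idea might look roughly like the sketch below. The type `RefTuple` is invented for illustration; it uses the non-exported `Base.setindex` for tuples in place of Setfield.jl, so this is one possible reading of the suggestion rather than the exact approach meant above.

```julia
# Hypothetical container: an immutable NTuple held in a Ref to make it "mutable".
struct RefTuple{N,T}
    data::Base.RefValue{NTuple{N,T}}
end
RefTuple(xs::Vararg{T,N}) where {T,N} = RefTuple{N,T}(Ref(xs))

Base.getindex(r::RefTuple, i::Int) = r.data[][i]

function Base.setindex!(r::RefTuple, v, i::Int)
    # Builds a whole new tuple with slot i replaced, then swaps it into the Ref.
    r.data[] = Base.setindex(r.data[], v, i)
    return r
end

r = RefTuple("a", "b", "c")
r[2] = "z"
```

Every update rebuilds the full tuple, so the cost of a single write depends on the number of elements, unlike setfield! on a mutable struct.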

@vtjnash
Member

vtjnash commented Mar 30, 2023

FWIW, those constraints are also trivially satisfiable by a NamedTuple

mutable struct MutableNT{T}
    mutant::T
end
Base.getproperty(x::MutableNT, s::Symbol) = getfield(getfield(x, :mutant), s)
Base.setproperty!(x::MutableNT, s::Symbol, v) = setfield!(x, :mutant, merge(getfield(x, :mutant), NamedTuple{(s,)}((v,))))

@ToucheSir

ToucheSir commented Mar 30, 2023

Ref{Tuple} + Setfield (or the MutableNT wrapper) isn't quite equivalent because updates require deconstructing and reconstructing the internal tuple on every update. @MasonProtter and I played around with this: while the copying involved is usually minor, it does scale (in the wrong direction) with the number/size of fields in a way that mutable type field updates do not. So it remains only a partial solution.

@MasonProtter
Contributor Author

That is because it is not redundant. It stores the value that the runtime needed for the struct layout computation.

Couldn't this be memoized in the compiler though rather than needing to actually be present in the type signature?

@MasonProtter
Contributor Author

MasonProtter commented Mar 30, 2023

As an example of what @ToucheSir means:

mutable struct MutableNT{T <: NamedTuple}
    mutant::T
end
Base.getproperty(x::MutableNT, s::Symbol) = getfield(getfield(x, :mutant), s)
Base.setproperty!(x::MutableNT, s::Symbol, v) = setfield!(x, :mutant, merge(getfield(x, :mutant), NamedTuple{(s,)}((v,))))

julia> let N = 10
           nt = NamedTuple{ntuple(i ->Symbol('a' + i - 1), N)}(ntuple(i -> i == 2 ? rand(Int) : rand(("hi", 1 + im, [1,2], Ref{Any}(1))), N))
           mnt = MutableNT(nt)
           @btime $mnt.b = 10
       end;
  2.820 ns (0 allocations: 0 bytes)

julia> let N = 20
           nt = NamedTuple{ntuple(i ->Symbol('a' + i - 1), N)}(ntuple(i -> i == 2 ? rand(Int) : rand(("hi", 1 + im, [1,2], Ref{Any}(1))), N))
           mnt = MutableNT(nt)
           @btime $mnt.b = 10
       end;
  4.820 ns (0 allocations: 0 bytes)

julia> let N = 100
           nt = NamedTuple{ntuple(i ->Symbol('a' + i - 1), N)}(ntuple(i -> i == 2 ? rand(Int) : rand(("hi", 1 + im, [1,2], Ref{Any}(1))), N))
           mnt = MutableNT(nt)
           @btime $mnt.b = 10
       end;
  23.032 ns (0 allocations: 0 bytes)

julia> let N = 200
           nt = NamedTuple{ntuple(i ->Symbol('a' + i - 1), N)}(ntuple(i -> i == 2 ? rand(Int) : rand(("hi", 1 + im, [1,2], Ref{Any}(1))), N))
           mnt = MutableNT(nt)
           @btime $mnt.b = 10
       end;
  44.869 ns (0 allocations: 0 bytes)

whereas with a custom mutable struct, we have

julia> eval(Expr(:struct, true, :(MNT200{$((Symbol(:T, i) for i in 1:200)...)}), Expr(:block, ((:($(Symbol('a' + i -1)) :: $(Symbol(:T,i))) for i in 1:200 ))...)))

julia> @generated function Base.NamedTuple(mnt::MNT200)
           Expr(:call, NamedTuple{fieldnames(MNT200)}, Expr(:tuple, (:(getfield(mnt, $i)) for i in 1:200)...))
       end;

julia> @generated function MNT200(nt::NamedTuple)
           Expr(:call, MNT200, (:(nt[$i]) for i in 1:200)...)
       end;

julia> let N = 200
           nt = NamedTuple{ntuple(i ->Symbol('a' + i - 1), N)}(ntuple(i -> i == 2 ? rand(Int) : rand(("hi", 1 + im, [1,2], Ref{Any}(1))), N))
           mnt = MNT200(nt)
           @btime $mnt.b = 10
       end;
  2.170 ns (0 allocations: 0 bytes)

I think though that Jameson's code is actually scaling better than some of the other approaches we tried (maybe this is good enough, what do you think Brian?) but it's still worse than a bespoke mutable struct.

@ToucheSir

I think though that Jameson's code is actually scaling better than some of the other approaches we tried (maybe this is good enough, what do you think Brian?)

Good question. I imagine most structs are pretty small in practice, but I know e.g. SciML has some massive ones. For my case this should be good enough, but I imagine something like StaticArrays may have more stringent performance requirements.

@KristofferC
Member

That is because it is not redundant. It stores the value that the runtime needed for the struct layout computation.

It's doable in C++ though. Maybe that is just a more powerful language though. ;)

@Jutho
Contributor

Jutho commented Mar 30, 2023

Aha, yeah I figured there must be an older issue on this. If desired, we can close this issue as a duplicate and I can move what I wrote to a comment in that issue. I feel like a lot has changed since those days. I wonder if @timholy or @Jutho's perspectives have changed significantly since then?

I managed to mostly work around this by having the extra parameters in the struct, having a (@pure ?) function that computes the concrete parametric type from the basic parameters that define it, not specialising the fields in whatever struct would have these "would-be" generated types as its fields, and then calling the aforementioned function in that type's constructor. Nonetheless, it would have been nice to have them …
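
One possible reading of that workaround, as a sketch only: all names here (`StaticMat`, `static_mat_type`, `Workspace`) are hypothetical, and a plain function is used rather than @pure.

```julia
# Hypothetical static matrix carrying the extra length parameter L = M*N.
struct StaticMat{M,N,T,L}
    data::NTuple{L,T}
    function StaticMat{M,N,T,L}(data::NTuple{L,T}) where {M,N,T,L}
        L == M * N || throw(ArgumentError("L must equal M*N"))
        new{M,N,T,L}(data)
    end
end

# Helper that computes the concrete type from the "basic" parameters only.
static_mat_type(M::Int, N::Int, ::Type{T}) where {T} = StaticMat{M,N,T,M * N}

# Outer struct: the field is left abstract in L, and the constructor
# calls the helper to build the concrete value.
struct Workspace{M,N,T}
    mat::StaticMat{M,N,T}
    function Workspace{M,N,T}(data) where {M,N,T}
        new{M,N,T}(static_mat_type(M, N, T)(data))
    end
end

w = Workspace{2,2,Float64}((1.0, 2.0, 3.0, 4.0))
```

The trade-off is visible in the field type: `Workspace` never has to spell out L, but its `mat` field is no longer concretely typed.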
