Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inference: model partially initialized structs with PartialStruct #55297

Merged
merged 4 commits into from
Aug 20, 2024

Conversation

aviatesk
Copy link
Member

There is still room for improvement in the accuracy of getfield and isdefined for structs with uninitialized fields. This commit aims to enhance the accuracy of struct field defined-ness by propagating such struct as PartialStruct in cases where fields that might be uninitialized are confirmed to be defined. Specifically, the improvements are made in the following situations:

  1. when a :new expression receives arguments greater than the minimum number of initialized fields.
  2. when new information about the initialized fields of x can be obtained in the then branch of if isdefined(x, :f).

Combined with the existing optimizations, these improvements enable DCE in scenarios such as:

julia> @noinline broadcast_noescape1(a) = (broadcast(identity, a); nothing);

julia> @allocated broadcast_noescape1(Ref("x"))
16 # master
0  # this PR

One important point to note is that, as revealed in #48999, fields and globals can revert to undef during precompilation. This commit does not affect globals. Furthermore, even for fields, the refinements made by 1. and 2. are propagated along with data-flow, and field defined-ness information is only used when fields are confirmed to be initialized. Therefore, the same issues as #48999 will not occur by this commit.

@aviatesk aviatesk requested review from vtjnash and Keno July 29, 2024 08:50
@aviatesk
Copy link
Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from b44f442 to 495c20b Compare July 29, 2024 15:41
Base automatically changed from avi/SROA-naming-refactor to master July 29, 2024 17:03
@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch 2 times, most recently from 2e208fc to 62da241 Compare July 30, 2024 07:14
ismod = sv isa Module
elseif isa(s00, PartialStruct)
sty = s00.typ
nflds = fieldcount_noerror(sty)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to impose additional semantics on PartialStruct that are not clear to me are satisfied with the existing lattice elements. I'm thinking of cases like:

struct Foo
    x::Int
    y::Any
    Foo(x) = rand() ? new(1) : new(1, x)
end

It's not clear to me that the tmerge will respect the semantics you're imposing here. Perhaps it would be cleaner to give PartialStruct a new ninitialized field?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The information equivalent to ninitialized can be derived from the length of fields, and when PartialStructs with different lengths of fields are joined, the joined PartialStructs are simply widened to the simple object type, so I don't see any problem?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that loses some precision, but I guess that's fine. That said, this needs a comment somewhere to describe the semantics of a PartialStruct with short fields. We should also audit all uses of fields and probably pkgeval this. Lastly, I'm not sure this logic is correct, because it doesn't look like it's actually looking at the length of fields, but I don't have the time to check right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, this needs a comment somewhere to describe the semantics of a PartialStruct with short fields

Does c57d34c seem to address this?

We should also audit all uses of fields and probably pkgeval this.

Yeah, I did audit all uses, but it would be a good idea to exercise with pkgeval.
@nanosoldier runtests()

I'm not sure this logic is correct, because it doesn't look like it's actually looking at the length of fields, but I don't have the time to check right now.

Specifically regarding this logic, the length of fields is later referenced within isdefined_tfunc. fieldcount_noerror is used for iteration when applying isdefined_tfunc to all fields that might be uninitialized. However, that code path is hit only when bounds checking turned off and the field is not constant, making it quite niche.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from 62da241 to c57d34c Compare July 30, 2024 16:06
@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk
Copy link
Member Author

aviatesk commented Aug 1, 2024

@nanosoldier runtests(["MCMCChains", "BlockArrays", "MCMCDiagnosticTools", "MLJTuning", "MLJIteration", "Hecke", "OutlierDetection", "MLJTSVDInterface", "SelfOrganizingMaps", "OutlierDetectionNeighbors", "OutlierDetectionNetworks", "MLJTestIntegration", "TMLE", "TaijaData"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from c57d34c to a595daf Compare August 2, 2024 07:52
@aviatesk
Copy link
Member Author

aviatesk commented Aug 2, 2024

@nanosoldier runtests(["MCMCChains", "BlockArrays", "MCMCDiagnosticTools", "MLJTuning", "MLJIteration", "Hecke", "OutlierDetection", "MLJTSVDInterface", "SelfOrganizingMaps", "OutlierDetectionNeighbors", "OutlierDetectionNetworks", "MLJTestIntegration", "TMLE", "TaijaData"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from a595daf to c0e388f Compare August 2, 2024 09:40
@aviatesk
Copy link
Member Author

aviatesk commented Aug 2, 2024

@nanosoldier runtests()

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch 2 times, most recently from 5a0b413 to 03fd9fc Compare August 13, 2024 18:13
@aviatesk
Copy link
Member Author

@nanosoldier runtests()

@aviatesk
Copy link
Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@nanosoldier
Copy link
Collaborator

Your job failed.

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk
Copy link
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Copy link
Collaborator

Your job failed.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from 9ef7aac to 032510c Compare August 15, 2024 03:51
@aviatesk
Copy link
Member Author

@nanosoldier runtests(["PowerAnalytics", "NeptuneAILogger", "GAP", "VLBILikelihoods", "NeRCA", "LaurentPolynomials", "KrylovMethods", "MethodInspector", "BATTestCases", "UnicodePlots", "BloqadeExpr", "Combinat", "ValueShapes", "ModuleElts", "PermGroups", "RateLimiter", "MatInt", "FiniteFields", "FinitePosets", "PuiseuxPolynomials", "CycPols", "SignedPerms", "Supposition", "LazyReports", "Open62541", "InteractiveFixedEffectModels", "GLFixedEffectModels", "GalerkinToolkit", "Dynesty", "LazySets", "FSimBase", "PiecewiseDeterministicMarkovProcesses", "TuringCallbacks", "DifferentialEvolutionMCMC", "PlotRNA", "HydroPowerSimulations", "KdotP", "NonconvexBayesian", "ReactionNetworkImporters"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from 032510c to 500a380 Compare August 16, 2024 06:10
@aviatesk
Copy link
Member Author

@nanosoldier runtests(["PowerAnalytics", "NeptuneAILogger", "GAP", "VLBILikelihoods", "NeRCA", "LaurentPolynomials", "KrylovMethods", "MethodInspector", "BATTestCases", "UnicodePlots", "BloqadeExpr", "Combinat", "ValueShapes", "ModuleElts", "PermGroups", "RateLimiter", "MatInt", "FiniteFields", "FinitePosets", "PuiseuxPolynomials", "CycPols", "SignedPerms", "Supposition", "LazyReports", "Open62541", "InteractiveFixedEffectModels", "GLFixedEffectModels", "GalerkinToolkit", "Dynesty", "LazySets", "FSimBase", "PiecewiseDeterministicMarkovProcesses", "TuringCallbacks", "DifferentialEvolutionMCMC", "PlotRNA", "HydroPowerSimulations", "KdotP", "NonconvexBayesian", "ReactionNetworkImporters"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk
Copy link
Member Author

@nanosoldier runtests()
@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

There is still room for improvement in the accuracy of `getfield` and
`isdefined` for structs with uninitialized fields. This commit aims to
enhance the accuracy of struct field defined-ness by propagating such
struct as `PartialStruct` in cases where fields that might be
uninitialized are confirmed to be defined. Specifically, the
improvements are made in the following situations:
1. when a `:new` expression receives arguments greater than the minimum
   number of initialized fields.
2. when new information about the initialized fields of `x` can be
   obtained in the `then` branch of `if isdefined(x, :f)`.

Combined with the existing optimizations, these improvements enable DCE
in scenarios such as:
```julia
julia> @noinline broadcast_noescape1(a) = (broadcast(identity, a); nothing);

julia> @allocated broadcast_noescape1(Ref("x"))
16 # master
0  # this PR
```

One important point to note is that, as revealed in
#48999, fields and globals can revert to `undef` during
precompilation. This commit does not affect globals. Furthermore, even
for fields, the refinements made by 1. and 2. are propagated along with
data-flow, and field defined-ness information is only used when fields
are confirmed to be initialized. Therefore, the same issues as
#48999 will not occur by this commit.
@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from ff24278 to d22ecae Compare August 19, 2024 07:16
@aviatesk
Copy link
Member Author

@nanosoldier runtests(["SeparableOptimization", "NeptuneAILogger", "EulerAngles", "NLSolvers", "PiecewiseQuadratics", "VarianceComponentModels", "PGFPlots", "FSimBase", "MLJTSVDInterface", "TimeseriesFeatures", "OrdinaryDiffEqLowOrderRK", "NonconvexBayesian", "ReactionNetworkImporters", "MinimallyDisruptiveCurves", "BloqadeGates", "Pioran", "Groups"])

@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from d22ecae to d1b45b2 Compare August 19, 2024 10:31
@aviatesk aviatesk force-pushed the avi/partially-initialized-struct branch from d1b45b2 to 5f74ecc Compare August 19, 2024 10:59
@aviatesk
Copy link
Member Author

@nanosoldier runtests(["SeparableOptimization", "NeptuneAILogger", "EulerAngles", "NLSolvers", "PiecewiseQuadratics", "VarianceComponentModels", "PGFPlots", "FSimBase", "MLJTSVDInterface", "TimeseriesFeatures", "OrdinaryDiffEqLowOrderRK", "NonconvexBayesian", "ReactionNetworkImporters", "MinimallyDisruptiveCurves", "BloqadeGates", "Pioran", "Groups"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk
Copy link
Member Author

The test failure of MLJTSVDInterface.jl is not related to this PR. Going to merge.

@aviatesk aviatesk merged commit 9650510 into master Aug 20, 2024
8 checks passed
@aviatesk aviatesk deleted the avi/partially-initialized-struct branch August 20, 2024 02:58
aviatesk added a commit that referenced this pull request Aug 20, 2024
Following up #55297.
A mutable struct can have undefined fields in a non-contiguous manner,
but `PartialStruct` cannot model such a state. So in #55297
`PartialStruct` was used to represent only the mutable objects where the
field following the minimum initialized fields is also defined.

As a follow-up for the PR, this commit implements a minor improvement
that, in cases when the mutable object is already represented as a
`PartialStruct`, allows inference to add one more `isdefined`-field
information on top of those implied by its `fields`.

```julia
mutable struct PartiallyInitialized2
    a; b; c
    PartiallyInitialized2(a) = (@nospecialize; new(a))
    PartiallyInitialized2(a, b) = (@nospecialize; new(a, b))
    PartiallyInitialized2(a, b, c) = (@nospecialize; new(a, b, c))
end

@test Base.infer_effects((PartiallyInitialized2,); optimize=false) do x
    if isdefined(x, :b)
        if isdefined(x, :c)
            return x.c
        end
        return x.b
    end
    return nothing
end |> Core.Compiler.is_nothrow
```
aviatesk added a commit that referenced this pull request Aug 20, 2024
Following up #55297.
A mutable struct can have undefined fields in a non-contiguous manner,
but `PartialStruct` cannot model such a state. So in #55297
`PartialStruct` was used to represent only the mutable objects where the
field following the minimum initialized fields is also defined.

As a follow-up for the PR, this commit implements a minor improvement
that, in cases when the mutable object is already represented as a
`PartialStruct`, allows inference to add one more `isdefined`-field
information on top of those implied by its `fields`.

```julia
mutable struct PartiallyInitialized2
    a; b; c
    PartiallyInitialized2(a) = (@nospecialize; new(a))
    PartiallyInitialized2(a, b) = (@nospecialize; new(a, b))
    PartiallyInitialized2(a, b, c) = (@nospecialize; new(a, b, c))
end

@test Base.infer_effects((PartiallyInitialized2,); optimize=false) do x
    if isdefined(x, :b)
        if isdefined(x, :c)
            return x.c
        end
        return x.b
    end
    return nothing
end |> Core.Compiler.is_nothrow
```
aviatesk added a commit that referenced this pull request Aug 20, 2024
Following up #55297.
A mutable struct can have undefined fields in a non-contiguous manner,
but `PartialStruct` cannot model such a state. So in #55297
`PartialStruct` was used to represent only the mutable objects where the
field following the minimum initialized fields is also defined.

As a follow-up for the PR, this commit implements a minor improvement
that, in cases when the mutable object is already represented as a
`PartialStruct`, allows inference to add one more `isdefined`-field
information on top of those implied by its `fields`.

```julia
mutable struct PartiallyInitialized2
    a; b; c
    PartiallyInitialized2(a) = (@nospecialize; new(a))
    PartiallyInitialized2(a, b) = (@nospecialize; new(a, b))
    PartiallyInitialized2(a, b, c) = (@nospecialize; new(a, b, c))
end

@test Base.infer_effects((PartiallyInitialized2,); optimize=false) do x
    if isdefined(x, :b)
        if isdefined(x, :c)
            return x.c
        end
        return x.b
    end
    return nothing
end |> Core.Compiler.is_nothrow
```
aviatesk added a commit that referenced this pull request Aug 20, 2024
Following up #55297.
A mutable struct can have undefined fields in a non-contiguous manner,
but `PartialStruct` cannot model such a state. So in #55297
`PartialStruct` was used to represent only the mutable objects where the
field following the minimum initialized fields is also defined.

As a follow-up for the PR, this commit implements a minor improvement
that, in cases when the mutable object is already represented as a
`PartialStruct`, allows inference to add one more `isdefined`-field
information on top of those implied by its `fields`.

```julia
mutable struct PartiallyInitialized2
    a; b; c
    PartiallyInitialized2(a) = (@nospecialize; new(a))
    PartiallyInitialized2(a, b) = (@nospecialize; new(a, b))
    PartiallyInitialized2(a, b, c) = (@nospecialize; new(a, b, c))
end

@test Base.infer_effects((PartiallyInitialized2,); optimize=false) do x
    if isdefined(x, :b)
        if isdefined(x, :c)
            return x.c
        end
        return x.b
    end
    return nothing
end |> Core.Compiler.is_nothrow
```
aviatesk added a commit that referenced this pull request Aug 21, 2024
Following up #55297.
A mutable struct can have undefined fields in a non-contiguous manner,
but `PartialStruct` cannot model such a state. So in #55297
`PartialStruct` was used to represent only the mutable objects where the
field following the minimum initialized fields is also defined.

As a follow-up for the PR, this commit implements a minor improvement
that, in cases when the mutable object is already represented as a
`PartialStruct`, allows inference to add one more `isdefined`-field
information on top of those implied by its `fields`.

```julia
mutable struct PartiallyInitialized2
    a; b; c
    PartiallyInitialized2(a) = (@nospecialize; new(a))
    PartiallyInitialized2(a, b) = (@nospecialize; new(a, b))
    PartiallyInitialized2(a, b, c) = (@nospecialize; new(a, b, c))
end

@test Base.infer_effects((PartiallyInitialized2,); optimize=false) do x
    if isdefined(x, :b)
        if isdefined(x, :c)
            return x.c
        end
        return x.b
    end
    return nothing
end |> Core.Compiler.is_nothrow
```
KristofferC pushed a commit that referenced this pull request Sep 12, 2024
…55297)

There is still room for improvement in the accuracy of `getfield` and
`isdefined` for structs with uninitialized fields. This commit aims to
enhance the accuracy of struct field defined-ness by propagating such
struct as `PartialStruct` in cases where fields that might be
uninitialized are confirmed to be defined. Specifically, the
improvements are made in the following situations:
1. when a `:new` expression receives arguments greater than the minimum
number of initialized fields.
2. when new information about the initialized fields of `x` can be
obtained in the `then` branch of `if isdefined(x, :f)`.

Combined with the existing optimizations, these improvements enable DCE
in scenarios such as:
```julia
julia> @noinline broadcast_noescape1(a) = (broadcast(identity, a); nothing);

julia> @allocated broadcast_noescape1(Ref("x"))
16 # master
0  # this PR
```

One important point to note is that, as revealed in
#48999, fields and globals can revert to `undef` during
precompilation. This commit does not affect globals. Furthermore, even
for fields, the refinements made by 1. and 2. are propagated along with
data-flow, and field defined-ness information is only used when fields
are confirmed to be initialized. Therefore, the same issues as
#48999 will not occur by this commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants