-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make precise semantics of world-age increments in top-level thunks #55145
Comments
Keno
added a commit
that referenced
this issue
Nov 9, 2024
This PR introduces a new, toplevel-only, syntax form `:worldinc` that semantically represents the effect of raising the current task's world age to the latest world for the remainder of the current toplevel evaluation (that context being an entry to `eval` or a module expression). For detailed motivation on why this is desirable, see #55145, which I won't repeat here, but the gist is that we never really defined when world-age increments and worse are inconsistent about it. This is something we need to figure out now, because the bindings partition work will make world age even more observable via bindings. Having created a mechanism for world age increments, the big question is one of policy, i.e. when should these world age increments be inserted. Several reasonable options exist: 1. After world-age affecting syntax constructs (as proprosed in #55145) 2. Option 1 + some reasonable additional cases that people rely on 3. Before any top level `call` expression 4. Before any expression at toplevel whatsover As in example, case, consider `a == a` at toplevel. Depending on the semantics that could either be the same as in local scope, or each of the four world age dependent lookups (three binding lookups, one method lookup could occur in a different world age). The general tradeoff here is between the risk of exposing the user to confusing world age errors and our ability to optimize top-level code (in general, any :worldinc statement will require us to fully pessimize or recompile all following code). This PR basically implements option 2 with the following semantics: 1. The interpreter explicit raises the world age only at `:worldinc` exprs or after `:module` exprs. 2. The frontend inserts `:worldinc` after all struct definitions, method definitions, `using` and `import. 3. The `@eval` macro inserts a worldinc following the call to `eval` if at toplevel 4. A literal (syntactic) call to `include` gains an implicit `worldinc`. Of these the fourth is probably the most questionable, but is necessary to make this non-breaking for most code patterns. Perhaps it would have been better to make `include` a macro from the beginning (esp because it already has semantics that look a little like reaching into the calling module), but that ship has sailed. Unfortunately, I don't see any good intermediate options between this PR and option #3 above. I think option #3 is closes to what we have right now, but if we were to choose it and actually fix the soundness issues, I expect that we would be destroying all performance of global-scope code. For this reason, I would like to try to make the version in this PR work, even if the semantics are a little ugly. The biggest pattern that this PR does not catch is: ``` eval(:(f() = 1)) f() ``` We could apply the same `include` special case to eval, but given the existence of `@eval` which allows addressing this at the macro level, I decided not to. We can decide which way we want to go on this based on what the package ecosystem looks like.
Keno
added a commit
that referenced
this issue
Nov 10, 2024
This PR introduces a new, toplevel-only, syntax form `:worldinc` that semantically represents the effect of raising the current task's world age to the latest world for the remainder of the current toplevel evaluation (that context being an entry to `eval` or a module expression). For detailed motivation on why this is desirable, see #55145, which I won't repeat here, but the gist is that we never really defined when world-age increments and worse are inconsistent about it. This is something we need to figure out now, because the bindings partition work will make world age even more observable via bindings. Having created a mechanism for world age increments, the big question is one of policy, i.e. when should these world age increments be inserted. Several reasonable options exist: 1. After world-age affecting syntax constructs (as proprosed in #55145) 2. Option 1 + some reasonable additional cases that people rely on 3. Before any top level `call` expression 4. Before any expression at toplevel whatsover As in example, case, consider `a == a` at toplevel. Depending on the semantics that could either be the same as in local scope, or each of the four world age dependent lookups (three binding lookups, one method lookup could occur in a different world age). The general tradeoff here is between the risk of exposing the user to confusing world age errors and our ability to optimize top-level code (in general, any :worldinc statement will require us to fully pessimize or recompile all following code). This PR basically implements option 2 with the following semantics: 1. The interpreter explicit raises the world age only at `:worldinc` exprs or after `:module` exprs. 2. The frontend inserts `:worldinc` after all struct definitions, method definitions, `using` and `import. 3. The `@eval` macro inserts a worldinc following the call to `eval` if at toplevel 4. A literal (syntactic) call to `include` gains an implicit `worldinc`. Of these the fourth is probably the most questionable, but is necessary to make this non-breaking for most code patterns. Perhaps it would have been better to make `include` a macro from the beginning (esp because it already has semantics that look a little like reaching into the calling module), but that ship has sailed. Unfortunately, I don't see any good intermediate options between this PR and option #3 above. I think option #3 is closes to what we have right now, but if we were to choose it and actually fix the soundness issues, I expect that we would be destroying all performance of global-scope code. For this reason, I would like to try to make the version in this PR work, even if the semantics are a little ugly. The biggest pattern that this PR does not catch is: ``` eval(:(f() = 1)) f() ``` We could apply the same `include` special case to eval, but given the existence of `@eval` which allows addressing this at the macro level, I decided not to. We can decide which way we want to go on this based on what the package ecosystem looks like.
Keno
added a commit
that referenced
this issue
Nov 14, 2024
This PR introduces uses the new, toplevel-only, syntax form `:latestworld` that semantically represents the effect of raising the current task's world age to the latest world for the remainder of the current toplevel evaluation (that context being an entry to `eval` or a module expression). For detailed motivation on why this is desirable, see #55145, which I won't repeat here, but the gist is that we never really defined when world-age increments and worse are inconsistent about it. This is something we need to figure out now, because the bindings partition work will make world age even more observable via bindings. Having created a mechanism for world age increments, the big question is one of policy, i.e. when should these world age increments be inserted. Several reasonable options exist: 1. After world-age affecting syntax constructs (as proprosed in #55145) 2. Option 1 + some reasonable additional cases that people rely on 3. Before any top level `call` expression 4. Before any expression at toplevel whatsover As in example, case, consider `a == a` at toplevel. Depending on the semantics that could either be the same as in local scope, or each of the four world age dependent lookups (three binding lookups, one method lookup could occur in a different world age). The general tradeoff here is between the risk of exposing the user to confusing world age errors and our ability to optimize top-level code (in general, any :worldinc statement will require us to fully pessimize or recompile all following code). This PR basically implements option 2 with the following semantics: 1. The interpreter explicit raises the world age only at `:latestworld` exprs, after `:module` exprs, or at the beginning of the top-level exprs inside `:toplevel` and `:module`. 2. The frontend inserts `:latestworld` after all struct definitions, method definitions, `using` and `import. 3. The `@eval` macro inserts a worldinc following the call to `eval` if at toplevel 4. A literal (syntactic) call to `include` gains an implicit `worldinc`. Of these the fourth is probably the most questionable, but is necessary to make this non-breaking for most code patterns. Perhaps it would have been better to make `include` a macro from the beginning (esp because it already has semantics that look a little like reaching into the calling module), but that ship has sailed. Unfortunately, I don't see any good intermediate options between this PR and option #3 above. I think option #3 is closes to what we have right now, but if we were to choose it and actually fix the soundness issues, I expect that we would be destroying all performance of global-scope code. For this reason, I would like to try to make the version in this PR work, even if the semantics are a little ugly. The biggest pattern that this PR does not catch is: ``` begin eval(:(f() = 1)) f() end ``` We could apply the same `include` special case to eval, but given the existence of `@eval` which allows addressing this at the macro level, I decided not to. We can decide which way we want to go on this based on what the package ecosystem looks like.
Keno
added a commit
that referenced
this issue
Nov 19, 2024
This PR introduces uses the new, toplevel-only, syntax form `:latestworld` that semantically represents the effect of raising the current task's world age to the latest world for the remainder of the current toplevel evaluation (that context being an entry to `eval` or a module expression). For detailed motivation on why this is desirable, see #55145, which I won't repeat here, but the gist is that we never really defined when world-age increments and worse are inconsistent about it. This is something we need to figure out now, because the bindings partition work will make world age even more observable via bindings. Having created a mechanism for world age increments, the big question is one of policy, i.e. when should these world age increments be inserted. Several reasonable options exist: 1. After world-age affecting syntax constructs (as proprosed in #55145) 2. Option 1 + some reasonable additional cases that people rely on 3. Before any top level `call` expression 4. Before any expression at toplevel whatsover As in example, case, consider `a == a` at toplevel. Depending on the semantics that could either be the same as in local scope, or each of the four world age dependent lookups (three binding lookups, one method lookup could occur in a different world age). The general tradeoff here is between the risk of exposing the user to confusing world age errors and our ability to optimize top-level code (in general, any :worldinc statement will require us to fully pessimize or recompile all following code). This PR basically implements option 2 with the following semantics: 1. The interpreter explicit raises the world age only at `:latestworld` exprs, after `:module` exprs, or at the beginning of the top-level exprs inside `:toplevel` and `:module`. 2. The frontend inserts `:latestworld` after all struct definitions, method definitions, `using` and `import. 3. The `@eval` macro inserts a worldinc following the call to `eval` if at toplevel 4. A literal (syntactic) call to `include` gains an implicit `worldinc`. Of these the fourth is probably the most questionable, but is necessary to make this non-breaking for most code patterns. Perhaps it would have been better to make `include` a macro from the beginning (esp because it already has semantics that look a little like reaching into the calling module), but that ship has sailed. Unfortunately, I don't see any good intermediate options between this PR and option #3 above. I think option #3 is closes to what we have right now, but if we were to choose it and actually fix the soundness issues, I expect that we would be destroying all performance of global-scope code. For this reason, I would like to try to make the version in this PR work, even if the semantics are a little ugly. The biggest pattern that this PR does not catch is: ``` begin eval(:(f() = 1)) f() end ``` We could apply the same `include` special case to eval, but given the existence of `@eval` which allows addressing this at the macro level, I decided not to. We can decide which way we want to go on this based on what the package ecosystem looks like.
Keno
added a commit
that referenced
this issue
Nov 20, 2024
This PR introduces uses the new, toplevel-only, syntax form `:latestworld` that semantically represents the effect of raising the current task's world age to the latest world for the remainder of the current toplevel evaluation (that context being an entry to `eval` or a module expression). For detailed motivation on why this is desirable, see #55145, which I won't repeat here, but the gist is that we never really defined when world-age increments and worse are inconsistent about it. This is something we need to figure out now, because the bindings partition work will make world age even more observable via bindings. Having created a mechanism for world age increments, the big question is one of policy, i.e. when should these world age increments be inserted. Several reasonable options exist: 1. After world-age affecting syntax constructs (as proprosed in #55145) 2. Option 1 + some reasonable additional cases that people rely on 3. Before any top level `call` expression 4. Before any expression at toplevel whatsover As in example, case, consider `a == a` at toplevel. Depending on the semantics that could either be the same as in local scope, or each of the four world age dependent lookups (three binding lookups, one method lookup could occur in a different world age). The general tradeoff here is between the risk of exposing the user to confusing world age errors and our ability to optimize top-level code (in general, any :worldinc statement will require us to fully pessimize or recompile all following code). This PR basically implements option 2 with the following semantics: 1. The interpreter explicit raises the world age only at `:latestworld` exprs, after `:module` exprs, or at the beginning of the top-level exprs inside `:toplevel` and `:module`. 2. The frontend inserts `:latestworld` after all struct definitions, method definitions, `using` and `import. 3. The `@eval` macro inserts a worldinc following the call to `eval` if at toplevel 4. A literal (syntactic) call to `include` gains an implicit `worldinc`. Of these the fourth is probably the most questionable, but is necessary to make this non-breaking for most code patterns. Perhaps it would have been better to make `include` a macro from the beginning (esp because it already has semantics that look a little like reaching into the calling module), but that ship has sailed. Unfortunately, I don't see any good intermediate options between this PR and option #3 above. I think option #3 is closes to what we have right now, but if we were to choose it and actually fix the soundness issues, I expect that we would be destroying all performance of global-scope code. For this reason, I would like to try to make the version in this PR work, even if the semantics are a little ugly. The biggest pattern that this PR does not catch is: ``` begin eval(:(f() = 1)) f() end ``` We could apply the same `include` special case to eval, but given the existence of `@eval` which allows addressing this at the macro level, I decided not to. We can decide which way we want to go on this based on what the package ecosystem looks like.
Keno
added a commit
that referenced
this issue
Nov 21, 2024
This PR introduces a new, toplevel-only, syntax form `:worldinc` that semantically represents the effect of raising the current task's world age to the latest world for the remainder of the current toplevel evaluation (that context being an entry to `eval` or a module expression). For detailed motivation on why this is desirable, see #55145, which I won't repeat here, but the gist is that we never really defined when world-age increments and worse are inconsistent about it. This is something we need to figure out now, because the bindings partition work will make world age even more observable via bindings. Having created a mechanism for world age increments, the big question is one of policy, i.e. when should these world age increments be inserted. Several reasonable options exist: 1. After world-age affecting syntax constructs (as proprosed in #55145) 2. Option 1 + some reasonable additional cases that people rely on 3. Before any top level `call` expression 4. Before any expression at toplevel whatsover As an example, case, consider `a == a` at toplevel. Depending on the semantics that could either be the same as in local scope, or each of the four world age dependent lookups (three binding lookups, one method lookup) could (potentially) occur in a different world age. The general tradeoff here is between the risk of exposing the user to confusing world age errors and our ability to optimize top-level code (in general, any `:worldinc` statement will require us to fully pessimize or recompile all following code). This PR basically implements option 2 with the following semantics: 1. The interpreter explicit raises the world age only at `:worldinc` exprs or after `:module` exprs. 2. The frontend inserts `:worldinc` after all struct definitions, method definitions, `using` and `import. 3. The `@eval` macro inserts a worldinc following the call to `eval` if at toplevel 4. A literal (syntactic) call to `include` gains an implicit `worldinc`. Of these the fourth is probably the most questionable, but is necessary to make this non-breaking for most code patterns. Perhaps it would have been better to make `include` a macro from the beginning (esp because it already has semantics that look a little like reaching into the calling module), but that ship has sailed. Unfortunately, I don't see any good intermediate options between this PR and option #3 above. I think option #3 is closest to what we have right now, but if we were to choose it and actually fix the soundness issues, I expect that we would be destroying all performance of global-scope code. For this reason, I would like to try to make the version in this PR work, even if the semantics are a little ugly. The biggest pattern that this PR does not catch is: ``` eval(:(f() = 1)) f() ``` We could apply the same `include` special case to eval, but given the existence of `@eval` which allows addressing this at the macro level, I decided not to. We can decide which way we want to go on this based on what the package ecosystem looks like.
Addressed in #56509 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Recording some extended discussion with @vtjnash earlier today:
Right now, the semantics of world-age increment during top-level evaluation are ill-defined. It basically happens in the interpreter if codegen reaches here:
julia/src/interpreter.c
Line 469 in 0945b9d
That's mostly every statement, but not all top-level evaluation goes through this point, so there can be some suprises.
In codegen, world age increment happens at the beginning of any top-level evaluation and then there is a post-processing pass that increments the world age before any call instruction.
julia/src/codegen.cpp
Lines 8996 to 9001 in 0945b9d
julia/src/codegen.cpp
Lines 9646 to 9654 in 0945b9d
There are problems with this. The first is that inference does not model internal world age increments (and even if did, it would basically just end up giving up, because nothing is guaranteed anymore after a world-age increment). This has occasionally lead to complaints (#24316). We currently hack around this by syntactic detection of statements that affect world-age-partitioned global data structure and prohibiting inference of such statements. As noted in the issue this heuristic is incomplete as the existence of Core.eval implies that not all such modifications are syntactically visible (similar issues apply to concurrency).
A second issue is that #55032 has likely introduced divergent behavior between the interpreter and compiler with respect to observing world-age changes. This is probably not a huge problem, but would still break great to avoid.
At the same time, inference of top-level expressions can be valuable particularly when people run benchmarks or tests in toplevel scope, so we would like to preserve our ability to do so. Additionally, we want inference to always be correct, even when run on top-level code (see also e.g. #55144), so it is unsatisfying for inference to be unsound on some valid inputs, even if we never run it in practice.
We discussed various possible improvements. One well-liked solution was to increment world-ages only at the top of any
:toplevel
argument with corresponding changes in lowering to split definitions into more such blocks. The advantage of such a setup would be that individual top-level code blocks would always be legal to infer, with the world age effects well guarded byExpr(:toplevel)
. However, this solution appears unworkable because method and struct definitions are permitted access to local-scope variables that do not crossExpr(:toplevel)
boundaries.After some further discussion, we reached the following proposal:
a. Entry into toplevel evaluation (i.e. at the beginning of Core.eval)
b. After any evaluation of
:method
c. On the bindings branch, after any introduction of a new binding
There is some tricky interaction with the assignment semantics to global bindings on the bindings branch. In particular, we don't want global assignment to existing bindings to pessimize inference, but at the same time a binding partition replacement requires a world age increment. Since these operations are not fully syntactically separated, global assignment must gain conditional semantic world-age increments. The problem with this is that, as designed, whether an assignment (at toplevel) updates or replaces a global depends on the partition of said binding in the latest world age, which is not a property accessible to inference. A proposal here is to change the semantics to throw a ConcurrencyViolationError if the binding partition in the executing world is not the binding partition in the latest world. This is a little bit unsatisfying as assignment to outdated binding partitions is well defined in non-toplevel code, but possibly the best we can do. It should also not happen in practice unless there's very racy concurrenct evaluation going on.
The text was updated successfully, but these errors were encountered: