
Document the memory model guaranteed by dotnet/runtime runtimes #63474

Closed
stephentoub opened this issue Jan 6, 2022 · 50 comments · Fixed by #75790
Assignees
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), memory model (issues associated with memory model)
Milestone

Comments

@stephentoub
Member

stephentoub commented Jan 6, 2022

We routinely run into questions about the coreclr / mono memory models and what model modern .NET code should be targeting.

ECMA specifies a memory model:
http://www.ecma-international.org/publications/standards/Ecma-335.htm
but it's mostly weaker than what's actually supported by the runtimes.

Joe Duffy wrote down a rough sketch of the model from the .NET Framework 2.0 days:
http://joeduffyblog.com/2007/11/10/clr-20-memory-model/
but that was a long time ago, and unofficial.

Igor Ostrovsky wrote two very nice articles about the memory model:
https://docs.microsoft.com/en-us/archive/msdn-magazine/2012/december/csharp-the-csharp-memory-model-in-theory-and-practice
https://docs.microsoft.com/en-us/archive/msdn-magazine/2013/january/csharp-the-csharp-memory-model-in-theory-and-practice-part-2
but that's also from a decade ago, and things have evolved.

We should:

  1. Document the model we want developers to actually code to; that includes code generated by the C# compiler (e.g. it currently doesn't use volatile when caching things like delegates for lambdas into fields; a sketch follows this list). This could be in the Book of the Runtime (BOTR), on docs.microsoft.com, or somewhere else we can refer to as an official stance.
  2. Fix up code in the repo to respect that model, removing volatile where it's no longer necessary and adding it where it is needed to conform.
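As a concrete illustration of the delegate-caching pattern mentioned in (1), here is a hand-written sketch; it is not the compiler's actual generated code, and the type and field names are hypothetical:

using System;

static class CachedDelegateSketch
{
    // Hypothetical cache field, analogous in shape to what the C# compiler emits
    // for a non-capturing lambda. Note that it is a plain (non-volatile) field.
    private static Func<int, int> s_cachedLambda;

    public static int Apply(int x)
    {
        // Benign race: several threads may each create a delegate, but any
        // instance a reader observes is fully constructed. Whether that relies
        // on guarantees stronger than ECMA-335 is what this issue asks to document.
        var d = s_cachedLambda ??= (y => y + 1);
        return d(x);
    }
}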
@dotnet-issue-labeler bot added the untriaged label Jan 6, 2022

@GSPP

GSPP commented Jan 14, 2022

I just faced a case where I'm accessing the same memory mapped at two different addresses. It's a memory-mapped-file-based circular buffer. The implementation can be simplified by mapping the buffer twice, back to back.

So I wonder what guarantees .NET makes in such a case.
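A minimal sketch of the scenario being asked about (two views over the same section, so the same physical memory is visible at two different addresses; the actual back-to-back ring-buffer mapping needs platform-specific APIs and is not shown):

using System;
using System.IO.MemoryMappedFiles;

// Create one anonymous 64 KB shared-memory section and map it twice. The two
// view accessors will typically expose the same physical memory at different
// virtual addresses; the question above is what .NET guarantees about loads
// and stores performed through the two mappings.
using var mmf = MemoryMappedFile.CreateNew(null, 64 * 1024);
using var viewA = mmf.CreateViewAccessor(0, 64 * 1024);
using var viewB = mmf.CreateViewAccessor(0, 64 * 1024);

viewA.Write(0, 42);                    // store through one mapping
Console.WriteLine(viewB.ReadInt32(0)); // read the same location through the other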

@ericstj
Member

ericstj commented Jan 14, 2022

@stephentoub @danmoseley do you have a suggestion for a better area path where this expertise lies?

@stephentoub
Member Author

I'm not sure any is perfect, but area-CodeGen-coreclr is probably a good place to start.

@stephentoub added the area-CodeGen-coreclr label and removed the area-Meta label Jan 14, 2022

@GSPP

GSPP commented Jan 17, 2022

  1. Is the "safe" event invocation eventField?.Invoke() actually thread-safe? ECMA says no, but I have seen this recommended in many places as being thread-safe. It's not safe under ECMA because the runtime is allowed to re-read a field.
  2. What guarantees are there in case of unsafe casting between reference types (e.g. array and RawArrayData)? See Document the behavior of the CLR when unsafe casts are used #6176
  3. When retyping arrays and spans of value types, is that guaranteed to be safe? I'm thinking of aliasing assumptions that the JIT might make. A byte[] can't refer to the same memory as an int[] (except of course using Unsafe.As, see 2). What about spans (Span<byte> to a Span<int>)? A quick test shows that .NET 5 assumes that such arrays won't alias but spans might.
  4. If the recommendation is going to be that not ECMA is the rulebook but some other memory model, what does this mean for other runtimes? Currently, a developer needs to target the intersection of all the runtimes he wants his code to run on.
  5. When working out a precise memory model (which seems to be the goal), internal runtime usages should be looked at. There should be a decision on what deviations from the memory model will be allowed in runtime internal code. For example, SZArrayHelper does some heavy recasting and I wonder what makes this safe. It seems, its safety is ensured by the test suite and by production use. It's not ensured by formal reasoning.
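Regarding (1), a minimal sketch of the two forms in question, assuming a hypothetical backing field named _changed; whether the first form is safe is exactly the question of whether the runtime may re-read the field:

using System;
using System.Threading;

class Publisher
{
    private EventHandler _changed;   // hypothetical event backing field

    public void RaiseViaConditionalInvoke()
    {
        // C# lowers this to a single read of _changed into a temporary; the open
        // question in (1) is whether the runtime may legally re-read the field.
        _changed?.Invoke(this, EventArgs.Empty);
    }

    public void RaiseViaVolatileRead()
    {
        // Defensive variant: Volatile.Read pins down one acquire load that the
        // JIT may not duplicate or eliminate.
        var handler = Volatile.Read(ref _changed);
        handler?.Invoke(this, EventArgs.Empty);
    }
}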

For (3), here's my test code:

[TestMethod]
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public void AliasArray()
{
    Debugger.Break();

    var ba = GetArray<byte>();
    var ia = GetArray<int>();

    var sum = 0;
    for (int i = 0; i < ba.Length; i++)
    {
        sum += ba[i];
        ia[i] = 1; // write; comment/uncomment this line to see the difference
        sum += ba[i];
    }

    GC.KeepAlive(sum);
}

[TestMethod]
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public void AliasSpan()
{
    Debugger.Break();

    var ba = GetArray<byte>().AsSpan();
    var ia = GetArray<int>().AsSpan();

    var sum = 0;
    for (int i = 0; i < ba.Length; i++)
    {
        sum += ba[i];
        ia[i] = 1; // write; comment/uncomment this line to see the difference
        sum += ba[i];
    }

    GC.KeepAlive(sum);
}

[MethodImpl(MethodImplOptions.NoOptimization)]
static T[] GetArray<T>() => new T[10];

@SingleAccretion
Contributor

I can answer the questions about aliasing.

What guarantees are there in case of unsafe casting between reference types (e.g. array and RawArrayData)?

None. The compiler assumes such aliasing will never ever happen and uses this for optimizations.

The compiler also assumes that static fields do not alias each other. This means that mutable overlapping RVA statics are not supported.

When retyping arrays and spans of value types, is that guaranteed to be safe? I'm thinking of aliasing assumptions that the JIT might make.

Just as above, the JIT assumes arrays of incompatible types will not alias each other (note that it takes into account things like int[] <-> uint[]).

On the other hand, the compiler does not assume that byrefs or unmanaged pointers pointing to different types will not alias, so span reinterpretation is safe from the aliasing point of view.

Notably, there are compiler bugs today where the compiler assumes that writes to "known" locations (those derived from arrays or statics) will only be performed through "proper" fields.
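For the span case, a small sketch of the kind of reinterpretation being discussed, using MemoryMarshal.Cast; per the above it is safe from the aliasing point of view, though endianness and length rounding remain the caller's problem:

using System;
using System.Runtime.InteropServices;

byte[] bytes = new byte[16];
Span<byte> byteSpan = bytes;

// Reinterpret the same memory as ints; byteSpan and intSpan now alias.
Span<int> intSpan = MemoryMarshal.Cast<byte, int>(byteSpan);

intSpan[0] = 0x01020304;        // write through the int view
Console.WriteLine(byteSpan[0]); // reads the same memory; 4 or 1 depending on endianness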

@JulieLeeMSFT
Member

cc @dotnet/jit-contrib .

@JulieLeeMSFT
Member

cc @mangod9.

@omariom
Contributor

omariom commented Feb 5, 2022

An example where having a documented memory model would help.

Another example asking for clarity.

@GSPP

GSPP commented Feb 11, 2022

Do synchronization primitives (locks, events, tasks, ...) generate a barrier? Normally, the answer is yes. So if you call Task.Result, does it cause a full barrier? You might think "yes". But what if the task is already completed? Then, AFAIK, it just generates a volatile load (an acquire). So which synchronization APIs generate which barriers? It should be spelled out.
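Not an answer, but a hedged sketch (not Task internals) of the distinction the question draws between an acquire load, which is what a volatile read gives you, and a full two-way fence:

using System.Threading;

class PublishExample
{
    private int _data;
    private volatile bool _ready;

    public void Producer()
    {
        _data = 42;      // ordinary store
        _ready = true;   // volatile store (release): _data is visible before it
    }

    public int Consumer()
    {
        // Volatile read (acquire): later loads cannot move before it, but it is
        // not the two-way fence that Thread.MemoryBarrier() or Interlocked
        // operations provide.
        if (_ready)
            return _data;

        Thread.MemoryBarrier();   // a full fence, shown for contrast
        return -1;
    }
}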

@VSadov
Member

VSadov commented Feb 23, 2022

I have a lot of context on this matter from various angles - from assumptions and guarantees in C# spec/compiler, to memory model invariants in the runtime (VM, GC, JIT), to how it translates to the native code (x64, arm, arm64).

I'd like to take this issue, if no one is already working on it.

@BruceForstall
Member

@VSadov I think it's safe to say it's yours.

@BruceForstall added this to the Future milestone Feb 23, 2022
@BruceForstall removed the untriaged label Feb 23, 2022
@GSPP

GSPP commented Mar 15, 2022

  1. How do exceptions interplay with the memory model?
  2. How do compilation relaxations and E-relaxed methods impact the guarantees being made?
    a. .NET supports the notion of relaxing the point at which runtime exceptions are seen to occur. This can mean that, for example, an IndexOutOfRangeException is not observed at the point it is caused, which can change which writes to memory actually occur before the exception (in sequential code). This claim may seem astonishing; it did to me, until I reported it as a codegen bug to Microsoft many years ago and was told that it is by design.
    b. An example of this is described in ECMA (https://www.ecma-international.org/wp-content/uploads/ECMA-335_6th_edition_june_2012.pdf) at "VI.F.5.1 Hoisting checks out of a loop". Searching for "E-relax" or "CompilationRelaxations" delivers more results.
    c. There is an open request to expose this facility more than before: A mechanism to relax exception checks such that inlined method calls are at least as fast as manual inlining should be supported. #8948. A similar feature existed on .NET Framework; I can no longer find it, but I reproduced it back then. Unless it has been removed from .NET Core (I find no evidence of that), it might still exist in the JIT.
    d. A very approachable example of this is given in this comment: A mechanism to relax exception checks such that inlined method calls are at least as fast as manual inlining should be supported. #8948 (comment) (the hoisted null check on o.someField; a sketch follows this list).
    e. This means that right now, in practical code, there are surprisingly reduced guarantees about the order and actual occurrence of memory writes. This rarely manifests but, as I said, I reproduced it many years ago.
    f. I wonder what memory effects are guaranteed if E-relaxation is enabled. ECMA says that effects will be the same if no exception is thrown. But will the concurrent order of effects be the same?
  3. What form will the new specification take? Will it be more of an informal document posted "somewhere", or will existing documents such as ECMA be refreshed?
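A hedged, hypothetical illustration of the hoisting concern in 2.d; this is not a claim about what RyuJIT does today (see the reply below about RyuJIT and exceptions), and the types and names are made up:

class C
{
    public int SomeField;
}

static class RelaxationSketch
{
    static int Sum(C o, int[] log)
    {
        int sum = 0;
        for (int i = 0; i < log.Length; i++)
        {
            log[i] = i;            // observable write
            sum += o.SomeField;    // throws NullReferenceException if o is null
        }
        // Strict ordering: with o == null, the write log[0] = 0 happens before the
        // throw. If the null check on o were hoisted above the loop, the exception
        // could surface before any write to log occurs.
        return sum;
    }
}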

@KalleOlaviNiemitalo

Based on dotnet/csharpstandard#366 (comment), I think a new version of ECMA-335 is unlikely. I imagine the memory model documentation would be posted somewhere in https://github.com/dotnet/runtime/tree/main/docs/design.

@SingleAccretion
Contributor

How do compilation relaxations and E-relaxed methods impact the guarantees being made?
... right now, in practical code, there are surprisingly reduced guarantees about the order and actual occurrence of memory writes. This rarely manifests but, as I said, I reproduced it many years ago.

Note: RyuJIT does not reorder exceptions; any and all cases where it does are bugs. I would expect the reordering you observed on .NET Framework to have been a bug as well; I was once told that JIT64 in particular did not take great care in preserving exceptions.

@GSPP

GSPP commented Apr 1, 2022

How do volatile loads and stores interact with the hardware? This discussion came up in

ECMA takes the position that volatile accesses may touch hardware registers. This means that volatile loads cannot be dropped or coalesced (e.g. volatile int x; _ = x; _ = x;).

Apparently, the JIT does just that, though. And @VSadov just declared that this deviation from ECMA is acceptable.

If the .NET memory model is to be defined rigorously then this issue must be resolved.

@GSPP

GSPP commented Apr 4, 2022

  1. How does StructLayout interact with the memory model? If we have a struct

    struct S { long L1; int I1; int I2; }

and we use Unsafe to re-type it as long, will this work? You could say that the first field is long, so "yes". But the runtime is free to reorder fields in Auto-layout structs, and the reordering could differ based on the containing type or stack configuration (in other words, it could be different each time). (A sketch follows this list.)

  2. Can I retype a struct S { Object O; } to a struct S2 { IntPtr I; } and write to it a reference to a pinned object? Can I write IntPtr.Zero? Can I read S2.I and print the address for debugging purposes? Can I use vector instructions to read and write references?
  3. Can I create an object reference to a piece of native memory that I have allocated and formatted as an object? ("Frozen segments"?)
  4. What happens if I read past the end of an object but don't cross a page boundary (no page fault)? I believe IndexOf does that.
  5. Can I read the object header in any way? Framework code does that.
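A hedged sketch for (1), with made-up names: with Sequential (or Explicit) layout the first field is guaranteed to be at offset 0, so the reinterpretation reads L1; with LayoutKind.Auto the runtime may reorder fields and no such guarantee exists, which is the crux of the question.

using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct S
{
    public long L1;
    public int I1;
    public int I2;
}

static class RetypeSketch
{
    // Reinterpret the struct as a long; only meaningful if the layout guarantees
    // that L1 is the first thing in memory.
    public static long FirstFieldAsLong(ref S s) => Unsafe.As<S, long>(ref s);
}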

@VSadov
Member

VSadov commented Jun 20, 2022

In theory, languages running on the CLR can implement their own memory model. In practice, languages typically specify only grammar and single-threaded semantics and leave memory model issues to the runtime.
C# in particular rarely mentions threading in its documentation. Even the lock statement is specified in terms of a translation to System.Threading.Monitor calls. C# does specify execution order and aliasing/atomicity of variable references.

The memory model for a language like C# mostly affects what optimizations may be performed, since the memory model determines what is observable. As a result, it makes a lot of sense to simply assume and provide the memory model of the underlying runtime.

For example consider:

x += 1;

Can we emit the code as

x = x + 1;  // evaluating x twice

or must do

ref var t = ref x;  // evaluating x once
t = t + 1;

The language spec says that

compound assignment expression of the form x op= y is equivalent to x = (T)(x op y), 
except that x is only evaluated once.

However, the CLR memory model says that if x is accessed only by a single thread, then introducing reads and writes is allowed as long as the overall result is the same. Therefore, if the compiler sees that x is a local variable, it can use the simpler x = x + 1 form. (It would need to use the other variant if x were a field in a class.)

FWIW, for the purpose of providing examples in the CLR memory model doc I am going to use C#, assuming a mainstream compiler targeting CLR will preserve the memory model.

does a variable in C# always correspond to a memory location in CLR? Does a read/write/initialization in C# always correspond to a load/store in CLR? It seems that the optimizer may remove the variables and introduce new variables at its discretion, so it's not easy to reason in CLR terms while programming in pure C#.

Typically a compiler treats its code generation strategy as an implementation detail, so that it is not constrained if changes need to be made. There are obvious concept leaks in areas like cross-language interop or when working with platform services like reflection.
I think C# compilers guarantee the publicly observed shape of classes/structs (i.e. a public field of a public C# class must be a public field of a public CLI class), but I am not sure whether that is formally documented.
@MadsTorgersen - if we can have more details here.

@VSadov
Member

VSadov commented Jun 20, 2022

Another question: is there a canonical place to ask questions on the memory model and related issues?

It is just another feature in the runtime.

GitHub is an obvious place for bugs, feedback, documentation clarifications, or discussions of in-progress items, but items here tend to have an action-oriented lifetime - Opened/Assigned/Closed...
I am not sure if there is a place specifically for questions.

@danmoseley
Member

I am not sure if there is a place specifically for questions.

We have https://github.com/dotnet/runtime/discussions. Note that the labeler/notifications do not work there, so if a question gets overlooked, you may have to @-mention the right person.

@vladd

vladd commented Jun 20, 2022

@VSadov I'm not sure the code generation strategy can be seen as just an implementation detail. An optimizing compiler can remove variables and rewrite the code completely, so guarantees about loads/stores into memory locations at the CLR level may have no counterpart at the C# level.

Here is my favorite example: https://godbolt.org/z/YGGffzaab. The optimizing compiler folded the arithmetic series, removing the loop variable and the loop altogether, so there is no memory location corresponding to the loop variable at all. This example shows that the C# optimizer, too, can omit the creation of objects.

Keeping this in mind, I find it complicated to reason about memory locations because I cannot control which of them are going to exist; as a developer, I control only C# variables, not whether those variables will end up with memory locations.
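A hedged reconstruction of the kind of code in the godbolt link (not the exact example): the JIT may fold the loop into a closed form, so neither the loop counter nor the accumulator necessarily corresponds to any memory location.

static class LoopFolding
{
    // An optimizing JIT may recognize the arithmetic series and replace the whole
    // loop with a closed form such as (long)(n - 1) * n / 2, eliminating both 'i'
    // and 'sum' as distinct memory locations.
    public static long SumTo(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += i;
        return sum;
    }
}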

@EgorBo
Member

EgorBo commented Jun 21, 2022

I can control only C# variables, but not whether the said variables would generate a memory location.

Isn't that when volatile enters the game?

@VSadov
Member

VSadov commented Jun 21, 2022

I'm not sure the code generation strategy can be seen as just an implementation detail.

This is a choice for the language designers and depends on their goals.
It is not even an either-or choice. Some parts can be defined in terms of an implementation (like lock being defined as a translation to System.Threading.Monitor calls) and some, like x += 1, can be defined in terms of side effects.

As another example, specifying that a C# local variable is always an IL local would be inconvenient. Sometimes a local can be optimized away, and sometimes it needs to become a field in a display class or struct if the local's lifetime must exceed the lexical scope that created it (as happens in lambda or async capture cases). Not specifying a particular implementation for locals is certainly useful.

I think that fields are always emitted as fields, but I can't find that in any formal spec.

@vladd

vladd commented Jun 21, 2022

@VSadov I understand that the code generation is free to do whatever it wants with a variable, and that this freedom is advantageous for the better compiled code quality.

But then, as a developer I have trouble reasoning in terms of memory locations and loads/stores, because the actual existence of memory locations depends on the compiler's code generation strategy. What I would need is the ability to reason in terms of C# variables and obtain memory model guarantees in terms of those variables.

(These guarantees might be not well-suited for runtime repository, though.)

@vladd

vladd commented Jun 21, 2022

isn't when volatile enters the game?

Maybe volatile would guarantee that a C# variable corresponds to a memory location. I'm, however, interested in the non-volatile case too, as most variables are non-volatile.

My point, however, is that developers usually need to think in variables, not memory locations, and given that there is no direct mapping between the two (the runtime is free to elide variables, introduce temporaries, or reuse a memory location for different variables, etc.), it would be advantageous to have the memory model expressed in terms of variables.

@jkotas
Member

jkotas commented Jun 21, 2022

What I would need is an ability to reason in terms of C# variables, and obtain memory model guarantees in terms of the variables.

You can do this sort of reasoning only for methods such as Volatile.Read/Write or Interlocked.*, and for volatile fields, which are equivalent to Volatile.Read/Write. Otherwise, all ordinary variables are subject to being optimized out.
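A minimal sketch of that distinction, with illustrative names: the ordinary read below may legally be hoisted and cached, while the Volatile.Read cannot be eliminated or coalesced.

using System.Threading;

class StopFlag
{
    private bool _stopPlain;
    private bool _stopObserved;

    public void Stop()
    {
        _stopPlain = true;
        Volatile.Write(ref _stopObserved, true);
    }

    public void WaitPlain()
    {
        // The JIT may read _stopPlain once and cache it in a register, so this
        // loop is not guaranteed to ever observe another thread's write.
        while (!_stopPlain) { }
    }

    public void WaitVolatile()
    {
        // Each iteration performs a real acquire load that cannot be eliminated
        // or coalesced, so the write in Stop() will eventually be observed.
        while (!Volatile.Read(ref _stopObserved)) { }
    }
}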

@vladd

vladd commented Jun 21, 2022

@jkotas At least there is a guarantee that an immutable object doesn't need any Volatile.*/Interlocked.* operations to be correctly published, right? I mean, we don't need explicit synchronization if an immutable object is read from another thread, no matter whether variables get their own memory locations or are optimized out. This is (hopefully) implied by the Object assignment chapter.

So as a developer I don't need to think in memory locations at least in this case. This is a kind of guarantees I'm talking about.

(Let me reiterate: I'm not sure whether language-level guarantees are on-topic in this repository. My apologies if they are not.)

@VSadov
Member

VSadov commented Jun 21, 2022

As far as I know, C# makes the same assumptions about the observability of accesses to variables that can be shared with other threads. Optimizations are only performed when they are unobservable. See the example in #63474 (comment)

Either a variable is not observable from multiple threads, in which case it can follow just single-threaded semantics, or it must be a field, in which case it must follow a memory model that is basically the same as the CLR memory model.

Is this enough of a "connection" that would allow applying CLR memory model to C# variables?

@VSadov
Member

VSadov commented Jun 21, 2022

I mean, we don't need an explicit synchronization if an immutable object is read from another threads, no matter if variables get their memory locations or optimized out. This is (hopefully) implied by Object assignment chapter.

Yes, as long as the object is initialized before publishing (in program order).
The C# compiler will not reorder field modifications past the publication of the object, and the publishing store will then be a release.

This is a very common pattern where a shared instance is lazily initialized by threads that see the instance is null. The pattern does not need synchronization if the identity of the instance is unimportant. In case of a race you may end up with more than one instance created and assigned, but they will all be well-formed when used by a different thread.
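A sketch of that pattern; the type and member names are illustrative, not taken from the libraries:

class Widget
{
    public int Value { get; } = 42;
}

class Cache
{
    private static Widget s_instance;   // plain field, no volatile

    public static Widget Instance =>
        // Benign race: several threads may each construct a Widget and assign it;
        // whichever instance a reader ends up with is fully initialized, because
        // the initializing writes are not reordered past the publishing store.
        s_instance ??= new Widget();
}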

@vladd

vladd commented Jun 21, 2022

Either a variable is not observable from multiple threads, and then it can follow just single-threaded semantics, or it must be a field and it must follow the memory model that is basically the same as CLR memory model.

Is this enough of a "connection" that would allow applying CLR memory model to C# variables?

Thank you for this valuable addition! I think this should be enough to connect C# to the CLR memory model, but I'd prefer to hear the opinion of language specialists more experienced than me.

Can I reformulate your point this way:

  1. If a C# variable cannot be accessed from different threads, the optimizer is totally free to do any optimizations that preserve the single-threaded semantics (see Execution order).
  2. If a C# variable can potentially be accessed from several threads (e.g. it's a field, or a local in an async method or an iterator), then this variable is necessarily lowered to a field and corresponds to a memory location, so the guarantees of the CLR memory model apply.
    • In particular this means that in async methods, any temporary variables crossing the await boundary cannot be elided, right?

?

@VSadov
Member

VSadov commented Jun 22, 2022

Can I reformulate your point this way:

  1. If a C# variable cannot be accessed from different threads, the optimizer is totally free to do any optimizations that preserve the single-threaded semantics (see Execution order).
  2. If a C# variable can potentially be accessed from several threads (e.g. it's a field, or a local in an async method or an iterator), then this variable is necessarily lowered to a field and corresponds to a memory location, so the guarantees of the CLR memory model apply.
    • In particular this means that in async methods, any temporary variables crossing the await boundary cannot be elided, right?

Yes.
Except that lambda capture would be a better example than iterator/async.

yield/await do not imply sharing local variables with another thread.
It is possible to see a different OS thread running the continuation after execution resumes (when the identity of the thread is unimportant), but only one thread accesses the locals of an async method at any given time, so the compiler will treat the accesses as thread-local.
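A small sketch of the capture case: the local below is hoisted into a field of a compiler-generated ("display") class because the lambdas capture it, so once the lambdas run on thread-pool threads, the CLR memory model for fields is what governs it. This is an illustration, not the compiler's exact lowering.

using System.Threading.Tasks;

class CaptureExample
{
    public static Task RunAsync()
    {
        int counter = 0;                      // captured below, so the compiler hoists
                                              // it into a field of a closure class
        var t1 = Task.Run(() => counter++);   // two thread-pool threads now share that
        var t2 = Task.Run(() => counter++);   // field; the increments race like any
                                              // other plain (non-volatile) field access
        return Task.WhenAll(t1, t2);
    }
}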

@vladd

vladd commented Jun 22, 2022

yield/await do not imply sharing local variables with another thread. It is possible to see a different OS thread running the continuation after execution resumes (when the identity of the thread is unimportant), but only one thread accesses the locals of an async method at any given time, so the compiler will treat the accesses as thread-local.

@VSadov That's an interesting topic. Is there really a guarantee that enumeration won't happen on different threads? I mean something like this:

var en = seq.GetEnumerator();

(new Thread(Enumerate)).Start();
(new Thread(Enumerate)).Start();

void Enumerate()
{
    while (en.MoveNext())
        Console.WriteLine(en.Current);
}

With this code, MoveNext() will most probably be executed simultaneously on different threads. While the outcome might be undefined at the language level (is it? what does the language standard say about this?), the compiler needs to ensure at least some synchronization in order to keep the basic guarantees, right?

@jkotas
Member

jkotas commented Sep 17, 2022

#75790

@ghost removed the in-pr label (There is an active PR which will close this issue when it is merged) Dec 16, 2022
@JulieLeeMSFT modified the milestones: Future, 8.0.0 Dec 29, 2022
@ghost locked as resolved and limited conversation to collaborators Jan 28, 2023