[mono] Add old design docs from mono website.

The original location of these documents is: https://github.com/mono/website/tree/gh-pages/docs/advanced/runtime/docs

Changes made:
- renamed BITCODE.md -> bitcode.md
- removed some documents like xdebug.md which are no longer relevant
dotnet · Mar 14, 2024 · 5e992b9
1 parent b60a541
commit 5e992b9

Showing 21 changed files with 3,627 additions and 0 deletions.
diff --git a/docs/design/mono/web/aot.md b/docs/design/mono/web/aot.md
diff --git a/docs/design/mono/web/ascii-strings.md b/docs/design/mono/web/ascii-strings.md
@@ -0,0 +1,137 @@
+---
+title: ASCII Mono
+---
+
+## Introduction
+
+This is a proposal for an optimisation to `System.String`.
+
+For historical reasons, `System.String` uses the UCS-2 character encoding, that is, UTF-16 without surrogate pairs.
+
+However, most strings in typical .NET applications consist solely of ASCII characters, leading to wasted space: half of the bytes in a string are likely to be null bytes!
+
+Since strings are immutable, we can scan the character data when the string is constructed and dynamically select an encoding, halving the memory used by the character data of every string that contains only ASCII.
+
+## Working Version
+
+A working prototype is currently hosted here:
+
+[https://github.com/evincarofautumn/mono/commits/feature-strings](https://github.com/evincarofautumn/mono/commits/feature-strings)
+
+## Updating `String`
+
+Strings currently have the following representation:
+
+```csharp
+class String {
+    int length;
+    char firstChar;
+}
+```
+
+Where `&firstChar` is the starting address of the co-allocated string data. First we can observe that the `length` field is a *signed* 32-bit integer (`System.Int32`). Changing this to an *unsigned* integer (`System.UInt32`) gives us a free bit, which we can use to tell whether the string is normal (UCS-2) or *compact* (ASCII):
+
+```csharp
+class String {
+    uint taggedLength;
+    byte firstByte;
+}
+```
+
+Here, `(taggedLength & 1) == 0` indicates the non-compact encoding, for which `(char*)&firstByte` is the start of the UCS-2 character data; `(taggedLength & 1) == 1` indicates the compact encoding, for which the ASCII character data starts at `(byte*)&firstByte`.
+
+I use the low-order bit instead of the sign bit because it lets us get the length with a simple shift, regardless of encoding:
+
+```csharp
+public int Length {
+    get {
+        return (int)(taggedLength >> 1);
+    }
+}
+```
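
The same packing can be sketched in C, the language of the Mono runtime, to show both directions of the conversion. This is a minimal sketch: the helper names are hypothetical and are not taken from the prototype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers; the names are illustrative, not Mono's. */

/* Pack a length and an encoding flag into one 32-bit word.
 * Bit 0 holds the flag (1 = compact/ASCII); bits 1..31 hold the length. */
static uint32_t tagged_length_make(uint32_t length, bool compact)
{
    assert(length <= INT32_MAX); /* the length must still fit in an int */
    return (length << 1) | (compact ? 1u : 0u);
}

static bool tagged_length_is_compact(uint32_t tagged)
{
    return (tagged & 1u) == 1u;
}

/* The length is recovered with one shift, whatever the encoding. */
static int32_t tagged_length_get(uint32_t tagged)
{
    return (int32_t)(tagged >> 1);
}

/* Size in bytes of the character data: 1 byte per character when
 * compact, 2 bytes per character otherwise. */
static size_t tagged_length_byte_size(uint32_t tagged)
{
    size_t char_size = tagged_length_is_compact(tagged) ? 1 : 2;
    return (size_t)(tagged >> 1) * char_size;
}
```

Because the flag sits in the low bit, the maximum length is unchanged: 31 bits remain for the length, which matches the non-negative range of the original signed field.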
+
+## Getting There: Updating Native Code
+
+Many places in Mono unsafely access `String` data, but they can be updated fairly easily: we can rename the fields, and use accessors that assert that a particular encoding is in use. However, we must be careful to verify that all those paths are covered by the test suite.
+
+## Getting There: Disabling `fixed` on Strings
+
+The following is a technique that helped us bootstrap the effort.
+
+Every managed method that unsafely accesses `String` character data must be updated to account for whether the `String` is compact. This is tractable within `corlib`, but there is some third-party code that uses strings unsafely.
+
+The `fixed` statement on strings calls a method `get_OffsetToStringData`, which is used to adjust the `fixed` pointer to refer to the character data, rather than the `String` object. In ASCII Mono, we can make this method throw a `NotSupportedException` with a message like
+
+> Unsafe access to string data is not supported by this runtime.
+
+Now we’re sure that only `corlib`-internal methods can access the `String` data, because only those methods have access to the `firstByte` field.
+
+Once we have completed this auditing work, we are going to replace `get_OffsetToStringData` with a method that copies
+an ASCII string into a UTF-16 string if the user happens to use `fixed` on a compact string.
+
+## Getting There: Adding `UnsafeApply` API
+
+In order to update existing third-party code that uses strings unsafely, we need some kind of `UnsafeApply` API:
+
+```csharp
+public unsafe T UnsafeApply<T>
+    (Func<BytePtr, T> compact, Func<CharPtr, T> noncompact)
+```
+
+This accepts two callbacks, one for the case of the compact encoding, and one for the non-compact encoding. This isn’t ideal, because it’s neither safe nor particularly efficient (involving the allocation of delegates). But, on the bright side, that may discourage people from continuing to use unsafe code.
+
+## Adding `Iterator` API
+
+In order to simplify updating existing `corlib` code, we add a private `Iterator` API that allows iterating over `String` data regardless of encoding, so we can efficiently avoid duplicating the code for `char*` and `byte*`.
+
+The `String.Iterator` interface would provide methods such as:
+
+ * `Iterator Advance (int offset = 1)`
+ * `void CopyFrom (Iterator that, int count)`
+ * `long Difference (Iterator that)`
+ * `char Get (int index = 0)`
+ * `void Set (char value, int index = 0)`
+ * `int CharSize ()`
+ * `IntPtr Pointer ()`
+
+It would have two concrete implementations, `CompactIterator` and `NonCompactIterator`, returned by a new `String` method `GetIterator` like so:
+
+```csharp
+private static unsafe Iterator GetIterator (IntPtr data, bool compact)
+{
+    if (compact)
+        return new CompactIterator (data);
+    return new NonCompactIterator (data);
+}
+```
+
+This requires that the character data pointer be pinned by the caller. This ensures that it’s pinned for the lifetime of the iterator, and that only `corlib` can use this API.
+
+Phrasing the API in this way should let the JIT inline operations on concrete iterator types.
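
To illustrate what encoding-independent access looks like, here is a rough C analogue of a few of the iterator operations. It is only a sketch: the struct and function names are hypothetical, and it branches on the character size at run time, whereas the proposal uses two concrete iterator types so that the JIT can inline each one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogue of String.Iterator; all names are illustrative. */
typedef struct {
    void *data;      /* current position; the caller keeps the data pinned */
    int   char_size; /* 1 = compact (ASCII), 2 = non-compact (UCS-2) */
} StringIterator;

static StringIterator iterator_new(void *data, int compact)
{
    StringIterator it = { data, compact ? 1 : 2 };
    return it;
}

/* Advance by `offset` characters (not bytes). */
static StringIterator iterator_advance(StringIterator it, ptrdiff_t offset)
{
    it.data = (uint8_t *)it.data + offset * it.char_size;
    return it;
}

/* Read the character at `index`, widening compact data to 16 bits. */
static uint16_t iterator_get(StringIterator it, ptrdiff_t index)
{
    if (it.char_size == 1)
        return ((const uint8_t *)it.data)[index];
    return ((const uint16_t *)it.data)[index];
}

/* Write the character at `index`. */
static void iterator_set(StringIterator it, uint16_t value, ptrdiff_t index)
{
    if (it.char_size == 1) {
        assert(value < 0x80); /* compact data can only hold ASCII */
        ((uint8_t *)it.data)[index] = (uint8_t)value;
    } else {
        ((uint16_t *)it.data)[index] = value;
    }
}
```

Code written against these four functions never mentions `char*` or `byte*` directly, which is the deduplication the iterator API is meant to provide.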
+
+## Updating `StringBuilder`
+
+`StringBuilder` is a linked list of mutable character arrays that can be frozen into a single `String` using the `ToString` method.
+
+We add an additional Boolean to each chunk, indicating whether it’s compact (the default) or non-compact. When inserting non-ASCII characters into an ASCII chunk, the chunk degrades to UCS-2.
+
+If all chunks of a `StringBuilder` are compact, as they are most of the time, then the result of `ToString` is compact.
+
+## Scanning Character Data
+
+At first blush it may seem very costly to scan every string. However, each string should only be scanned at most once, and the longer the string, the bigger the memory savings when it (probably) turns out to be compact-representable.
+
+Moreover, we can avoid scanning strings if we know ahead of time what the encoding should be; for example, concatenating two compact strings always yields a compact string.
+
+Scanning UCS-2 data for compact-representability is as simple as testing every character with the mask `(c & 0xFF80) == 0`, which is trivially unrollable and vectorizable. Likewise, we can scan UTF-8 data with the mask `(c & 0x80) == 0`.
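
A scalar version of the two scans might look like the following C sketch. The function names are hypothetical, and this is not the prototype's vectorized scanner. It ORs all characters together and applies the mask once at the end, which gives the same answer as testing each character against the mask and leaves a branch-free loop for the compiler to unroll or vectorize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scanners; the names are illustrative, not Mono's. */

/* True if every UCS-2 character is ASCII, i.e. (c & 0xFF80) == 0. */
static bool ucs2_is_compact_representable(const uint16_t *chars, size_t count)
{
    assert(chars != NULL || count == 0);
    uint16_t acc = 0;
    for (size_t i = 0; i < count; i++)
        acc |= chars[i]; /* any bit set in any character survives here */
    return (acc & 0xFF80) == 0;
}

/* True if every UTF-8 byte is ASCII, i.e. (c & 0x80) == 0. */
static bool utf8_is_compact_representable(const uint8_t *bytes, size_t count)
{
    assert(bytes != NULL || count == 0);
    uint8_t acc = 0;
    for (size_t i = 0; i < count; i++)
        acc |= bytes[i];
    return (acc & 0x80) == 0;
}
```

The trade-off of the accumulate-then-test form is that it never exits early, so a long string with a non-ASCII character near the start is still scanned to the end.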
+
+## Real-world Testing
+
+I’ve implemented a fairly stable prototype of this feature in Mono. It includes the stated changes to `String` and `StringBuilder`, as well as a fast vectorized scanner. It can build `corlib` and run the Mono and `corlib` test suites. With some effort, and patches to third-party libraries, it can run Xamarin Studio. For a large project using Roslyn code analysis, this leads to a ~10% savings in memory usage, with a small speed overhead.
+
+## Next Steps
+
+ * Deduplicate code by using the iterator API.
+ * Avoid allocating intermediate `char[]` arrays by using the iterator API.
+ * Upstream changes to third-party libraries.
+ * Get feedback and harden code for correctness, safety, and security.
diff --git a/docs/design/mono/web/atomics-memory-model.md b/docs/design/mono/web/atomics-memory-model.md
@@ -0,0 +1,133 @@
+---
+title: Atomics and Memory Model
+---
+
+## Introduction
+
+This document describes the semantics of atomic operations and the managed memory model in C#, CIL, and the BCL.
+
+The information here is based on the Ecma 334 and 335 specifications, MSDN documentation for the relevant BCL methods
+and equivalent Win32 functions, and the source code of CoreCLR and CoreFX.
+
+It is assumed that the reader understands basic concepts of memory models: different memory barrier kinds, acquire and
+release semantics, the meaning of atomicity, and so on.
+
+The actual implementation of these operations in Mono is described at the end.
+
+## Semantics
+
+### Atomicity in the CLI
+
+Any load or store that is smaller than or equal to `IntPtr.Size` shall be atomic, but does not imply a barrier of any
+kind. Operations on 64-bit quantities are only atomic on 64-bit systems.
+
+The source/destination address of a load/store operation must be properly aligned for the data type for the above
+guarantees to hold.
+
+If a load or store to an address happens at the same time as another load or store to that address but of a different
+size, all bets are off and no atomicity is guaranteed.
+
+These rules apply to high-level languages like C# and F# as they target the CLI.
+
+### `volatile.` prefix opcode in CIL
+
+When the `volatile.` prefix opcode is used in CIL, it imposes acquire/release semantics on the next non-prefix opcode.
+For loads, it results in acquire semantics. For stores, it results in release semantics.
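
As an analogy only, the usual publish/consume pattern shows what this acquire/release pairing buys. The sketch below uses C11 `<stdatomic.h>` rather than CIL, and the variable and function names are hypothetical. The release store plays the role of a `volatile.` store, and the acquire load plays the role of a `volatile.` load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical example state; not part of Mono. */
static int payload;      /* ordinary, non-atomic data */
static atomic_int ready; /* flag accessed with acquire/release */

/* Writer: the release store keeps the payload write from moving after it. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: the acquire load keeps the payload read from moving before it.
 * Returns 1 and fills *out if the payload has been published, else 0. */
static int try_consume(int *out)
{
    assert(out != NULL);
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return 0;
    *out = payload;
    return 1;
}
```

If a reader thread observes `ready == 1`, it is also guaranteed to observe the payload written before the flag was set. Neither access is a full barrier.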
+
+This prefix opcode has no effect on atomicity beyond the standard rules of the CLI.
+
+### `volatile` keyword in C\#
+
+The `volatile` keyword in C# compiles down to CIL loads and stores prefixed with the `volatile.` opcode.
+
+C#'s `volatile` cannot be applied to 64-bit quantities because regular loads and stores in CIL do not guarantee
+atomicity for 64-bit quantities on 32-bit systems, and the `Volatile` class did not exist when the `volatile` keyword
+was designed. Today, `volatile` on 64-bit quantities could conceivably be compiled down to `Volatile.Read` and
+`Volatile.Write` calls.
+
+### `Thread` class
+
+The `VolatileRead` and `VolatileWrite` methods perform loads and stores with acquire and release semantics,
+respectively. They guarantee absolutely nothing about atomicity beyond the standard rules of the CLI. In effect, this
+means that the 64-bit overloads of these methods are not atomic on 32-bit systems.
+
+There is a quirk in the .NET implementation where these methods actually use the `MemoryBarrier` method to insert a
+barrier. This is stronger than a simple acquire or release barrier. We do the same for compatibility.
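
In C11 terms, the quirk corresponds to pairing a plain load or store with a full fence instead of an acquire or release ordering. The sketch below is an assumption-laden illustration, not Mono's code: the function names are hypothetical, and the fence placement (after the load, before the store) is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogues of Thread.VolatileRead/VolatileWrite. */

/* Load, then a full (sequentially consistent) fence: stronger than the
 * acquire ordering that the documented semantics require. */
static int32_t thread_volatile_read_i32(_Atomic int32_t *address)
{
    assert(address != NULL);
    int32_t value = atomic_load_explicit(address, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return value;
}

/* Full fence, then store: stronger than the release ordering that the
 * documented semantics require. */
static void thread_volatile_write_i32(_Atomic int32_t *address, int32_t value)
{
    assert(address != NULL);
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(address, value, memory_order_relaxed);
}
```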
+
+The MSDN documentation incorrectly states that the C# compiler emits calls to `VolatileRead` and `VolatileWrite` when
+using the `volatile` keyword.
+
+The `MemoryBarrier` method inserts a full sequential consistency barrier.
+
+### `Volatile` class
+
+The methods on the `Volatile` class are all atomic regardless of system bitness, and result in acquire and release
+barriers for loads and stores respectively.
+
+The 64-bit methods on this class are not atomic with respect to loads or stores made through other means than the
+methods on this class and the `Interlocked` class. This is because such 64-bit operations may need to be implemented
+with a lock on 32-bit systems.
+
+The MSDN documentation incorrectly states that the C# compiler emits calls to this class's methods when using the
+`volatile` keyword.
+
+### `Interlocked` class
+
+The methods on the `Interlocked` class are all atomic regardless of system bitness, and all have sequential consistency
+semantics.
+
+The 64-bit methods on this class are not atomic with respect to loads or stores made through other means than the
+methods on this class and the `Volatile` class. This is because such 64-bit operations may need to be implemented with a
+lock on 32-bit systems.
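
The following C sketch shows why a lock-based implementation has this limitation. It is illustrative only: a simple spinlock stands in for whatever lock the runtime uses, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global lock guarding all emulated 64-bit operations. */
static atomic_flag lock64 = ATOMIC_FLAG_INIT;

static void lock64_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&lock64, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void lock64_release(void)
{
    atomic_flag_clear_explicit(&lock64, memory_order_release);
}

static int64_t locked_read_i64(const int64_t *address)
{
    assert(address != NULL);
    lock64_acquire();
    int64_t value = *address; /* may be two 32-bit loads on a 32-bit CPU */
    lock64_release();
    return value;
}

static void locked_write_i64(int64_t *address, int64_t value)
{
    assert(address != NULL);
    lock64_acquire();
    *address = value;
    lock64_release();
}

/* Returns the old value, like Interlocked.CompareExchange. */
static int64_t locked_cas_i64(int64_t *address, int64_t value, int64_t comparand)
{
    assert(address != NULL);
    lock64_acquire();
    int64_t old = *address;
    if (old == comparand)
        *address = value;
    lock64_release();
    return old;
}
```

Every function here is atomic with respect to the others because they all take the same lock. A plain 64-bit store made elsewhere does not take the lock, so on a 32-bit system it can interleave with the two halves of a locked read and produce a torn value.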
+
+The `MemoryBarrier` method is just an alias for `Thread.MemoryBarrier`.
+
+## Implementation
+
+### CLI rules
+
+When we see a CIL opcode prefixed with `volatile.`, we insert a `memory_barrier` IR opcode before or after the IR
+opcodes that make up the operation. This `memory_barrier` opcode is flagged with the appropriate barrier kind
+(`MONO_MEMORY_BARRIER_ACQ` or `MONO_MEMORY_BARRIER_REL`). `memory_barrier` opcodes are never reordered, and impose
+the necessary reordering restrictions on the surrounding IR opcodes as well.
+
+We expect all targets to support a `memory_barrier` opcode.
+
+### `Thread`, `Volatile`, and `Interlocked` methods
+
+The unoptimized behavior for these methods is to perform an icall into the runtime, where they are implemented in C
+code, usually through C compiler intrinsics or, in the case of the 64-bit `Volatile` and `Interlocked` methods on a
+32-bit system, with a lock.
+
+We only use the icalls on targets where, for whatever reason, we can't replace calls to these methods with IR opcodes.
+
+### Intrinsics
+
+On most targets, we replace calls to the BCL methods with IR opcodes.
+
+#### `Thread` methods
+
+Calls to `MemoryBarrier` (and the alias on `Interlocked`) are replaced with the `memory_barrier` IR opcode with the
+`MONO_MEMORY_BARRIER_SEQ` kind.
+
+Calls to `VolatileRead` and `VolatileWrite` are replaced with regular `load*_membase` and `store*_membase` IR opcodes
+coupled with a `memory_barrier` IR opcode with either `MONO_MEMORY_BARRIER_ACQ` or `MONO_MEMORY_BARRIER_REL`.
+
+#### `Volatile` methods
+
+Calls to `Read` and `Write` are replaced with `atomic_load_*` and `atomic_store_*` IR opcodes flagged with
+`MONO_MEMORY_BARRIER_ACQ` or `MONO_MEMORY_BARRIER_REL`. These opcodes imply a memory barrier by themselves. Like the
+`memory_barrier` IR opcode, they are never reordered and impose reordering restrictions on surrounding opcodes.
+
+#### `Interlocked` methods
+
+Calls to `Read` are replaced with the `atomic_load_i8` IR opcode flagged with `MONO_MEMORY_BARRIER_SEQ`.
+
+Calls to `Increment` and `Decrement` are replaced with the `atomic_add_i4` and `atomic_add_i8` IR opcodes.
+
+Calls to `Exchange` are replaced with the `atomic_exchange_i4` and `atomic_exchange_i8` IR opcodes.
+
+Calls to `CompareExchange` are replaced with the `atomic_cas_i4` and `atomic_cas_i8` IR opcodes.
+
+The `atomic_add_*`, `atomic_exchange_*`, and `atomic_cas_*` IR opcodes all imply `MONO_MEMORY_BARRIER_SEQ` barriers
+(despite not explicitly being flagged) and behave as such in the IR with respect to reordering restrictions.
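
For readers more familiar with C11 atomics, the 32-bit `Interlocked` operations correspond roughly to the sketch below. The wrapper names are hypothetical and this is not how the JIT lowers the IR opcodes. The C11 functions without an `_explicit` suffix default to `memory_order_seq_cst`, which matches the implied `MONO_MEMORY_BARRIER_SEQ`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C11 analogues of Interlocked methods on 32-bit values. */

/* Like Interlocked.Increment: returns the incremented value. */
static int32_t interlocked_increment_i32(_Atomic int32_t *location)
{
    assert(location != NULL);
    int32_t old = atomic_fetch_add(location, 1);
    /* Add in unsigned arithmetic so that wrap-around is well defined. */
    return (int32_t)((uint32_t)old + 1u);
}

/* Like Interlocked.Exchange: returns the previous value. */
static int32_t interlocked_exchange_i32(_Atomic int32_t *location, int32_t value)
{
    assert(location != NULL);
    return atomic_exchange(location, value);
}

/* Like Interlocked.CompareExchange: stores `value` only if the current
 * value equals `comparand`, and returns the previous value either way. */
static int32_t interlocked_compare_exchange_i32(_Atomic int32_t *location,
                                                int32_t value,
                                                int32_t comparand)
{
    assert(location != NULL);
    int32_t expected = comparand;
    /* On failure, `expected` is overwritten with the current value. */
    atomic_compare_exchange_strong(location, &expected, value);
    return expected;
}
```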