[Proposal] Safe and performant access to Memory Mapped Files #57330

Open
ChristophTF opened this issue Aug 13, 2021 · 4 comments
Labels: area-System.IO, needs-further-triage (Issue has been initially triaged, but needs deeper consideration or reconsideration)
Milestone: Future

Comments

@ChristophTF

Disclaimer

This rough proposal is perhaps a bit long and not well suited for this place. Please move this issue if deemed necessary.
It can be seen as a coherent collection of ideas regarding safe access to MemoryMappedFiles, and perhaps to similar unmanaged buffers too.
The envisioned ideas and proposals may very well be perceived as too specialized in their area of application, and/or too fundamental a change to be considered useful in the context of the .NET runtime.
I'm inviting discussion: whether you have thought of or implemented something similar in the past, have gathered valuable experience with such an approach (e.g. why it is bound to fail), or find it interesting for a similar scenario not mentioned here.

Background and Motivation

General Motivation

Since the introduction of .NET Core, more effort has been invested in making the C#/.NET ecosystem suitable for scenarios where performance matters.
Especially with the design of Span<T> and its subsequent march through APIs all over the .NET standard library, it has been explored to what extent performance-beneficial low-level access can be enabled while uncompromisingly maintaining the safety promise of the CLR.

When dealing with MemoryMappedFiles, however, this movement has so far come to a halt on every encounter. Examples are #37227 and #24767.

Concrete Background/Example

As a working student, I had been reworking a library for reading a particular format of measurement files regularly encountered in the automotive industry. It consists of a rather archaic system of blocks that link to each other and contain fixed fields and flexibly sized data pools. Combine this with large file sizes (on the order of 100 MBs to GBs), where often only some parts of the file are needed to gather the required information, or only some parts of a long series of measured samples are relevant for fulfilling a query.
In such a scenario, memory-mapped files provide an excellent combination of

  • Nicely caching read data, thus avoiding frequent system calls
  • Not putting memory pressure on the system (e.g. compared to reading into large byte arrays)

Since the entire tool ecosystem at my workplace uses C#, and it is a very comfortable language, it was natural to stay with it. Furthermore, solutions involving native interop distinctly increase the workload and don't perform too well, at least with a very chatty API.

At this point it was clear that I was going to use memory mapping, with all its safety consequences.
One therefore has to decide how to manage the memory unsafety and hide it from the library consumer. The file object could be disposed, and thus the mapping handle closed, by the user at any point in time. Note that not every implementation of memory mappings has made this design choice (see Java's MappedByteBuffer, which keeps the file open until the GC kicks in), but not doing so noticeably hurts the user experience, because one has to deal with in-use files for a nondeterministic duration.
To avoid access violations, my code would have to check and hold a lock at least at every entry point of the library. This is both tedious work and a source of dumb errors when the procedure is forgotten somewhere, and it seriously hurts performance when applied to a chatty API. Imagine e.g. an indexable wrapper over the values of a particular measurement signal, acquiring a lock with ~20 ns overhead for each access, when the operation itself would only take a fraction of that time.
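
To make the cost concrete, here is a minimal sketch of the kind of per-access guard meant above, using SafeHandle ref-counting similar to what MemoryMappedViewAccessor does internally. The names `SignalValues`, `_viewHandle`, and `_firstSample` are hypothetical stand-ins, not part of any existing API:

```csharp
using System.Runtime.InteropServices;

// Hypothetical indexable wrapper over one measurement signal inside the mapping.
// Every single access pays for DangerousAddRef/DangerousRelease on the view handle,
// even though reading the value itself is only a few instructions.
public sealed unsafe class SignalValues
{
    private readonly SafeHandle _viewHandle;   // e.g. a SafeMemoryMappedViewHandle
    private readonly double* _firstSample;     // start of this signal's samples

    public SignalValues(SafeHandle viewHandle, double* firstSample)
    {
        _viewHandle = viewHandle;
        _firstSample = firstSample;
    }

    public double this[int index]
    {
        get
        {
            bool addedRef = false;
            try
            {
                // Throws ObjectDisposedException if the handle is already closed.
                _viewHandle.DangerousAddRef(ref addedRef);
                return _firstSample[index];
            }
            finally
            {
                if (addedRef)
                    _viewHandle.DangerousRelease();
            }
        }
    }
}
```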

Instead, I got inspired by the Java example and thought of an address-space-reserving scheme (see illustration below).

[Illustration: address-reservation-scheme]

We can aim to create a file mapping in the constructor of a custom file-mapping object. On disposal, we replace it with a mere reservation that ensures no other allocation ends up there. This reservation can then be kept until the GC finally collects the file-mapping object. This scheme naturally uses more address space, but on current 64-bit systems we usually have plenty of it, so that is not really a concern anymore. Even though my Intel processor can internally only utilize 48 bits, and my current version of Windows 10 only allows allocating up to 47 bits of address space, that still leaves many TBs of address space. A test run that allocated address space with this scheme in a loop (keeping track of the managed objects so the finalizer would not deallocate the regions) took quite some time until it hit the ceiling.

This is made possible by the newer Win32 placeholder API (see MapViewOfFile3 and MEM_REPLACE_PLACEHOLDER), and, AFAIK (though I haven't tried it), has been possible for much longer on Linux systems (see mmap(2) with MAP_FIXED), where one can unmap an existing mapping and reserve the virtual address space atomically.
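
For illustration, a rough, unverified P/Invoke sketch of the Windows side of this dance might look as follows (the kernelbase.dll entry points and constants are the real ones from the placeholder API; error handling and the creation of the file-mapping handle are omitted):

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch of the Win32 placeholder scheme:
// 1. reserve a placeholder region, 2. map the file view into it,
// 3. on Dispose, unmap the view but keep the placeholder as a reservation.
internal static unsafe class PlaceholderInterop
{
    private const uint MEM_RESERVE              = 0x00002000;
    private const uint MEM_RESERVE_PLACEHOLDER  = 0x00040000;
    private const uint MEM_REPLACE_PLACEHOLDER  = 0x00004000;
    private const uint MEM_PRESERVE_PLACEHOLDER = 0x00000002;
    private const uint PAGE_NOACCESS = 0x01;
    private const uint PAGE_READONLY = 0x02;

    [DllImport("kernelbase.dll", SetLastError = true)]
    private static extern void* VirtualAlloc2(IntPtr process, void* baseAddress, nuint size,
        uint allocationType, uint pageProtection, void* extendedParameters, uint parameterCount);

    [DllImport("kernelbase.dll", SetLastError = true)]
    private static extern void* MapViewOfFile3(IntPtr fileMapping, IntPtr process, void* baseAddress,
        ulong offset, nuint viewSize, uint allocationType, uint pageProtection,
        void* extendedParameters, uint parameterCount);

    [DllImport("kernelbase.dll", SetLastError = true)]
    private static extern bool UnmapViewOfFile2(IntPtr process, void* baseAddress, uint unmapFlags);

    // Constructor side: reserve a placeholder and map the file view on top of it.
    public static void* MapIntoPlaceholder(IntPtr fileMappingHandle, nuint size)
    {
        IntPtr self = (IntPtr)(-1); // GetCurrentProcess() pseudo-handle
        void* placeholder = VirtualAlloc2(self, null, size,
            MEM_RESERVE | MEM_RESERVE_PLACEHOLDER, PAGE_NOACCESS, null, 0);
        return MapViewOfFile3(fileMappingHandle, self, placeholder, 0, size,
            MEM_REPLACE_PLACEHOLDER, PAGE_READONLY, null, 0);
    }

    // Dispose side: unmap the view but keep the address range reserved as a placeholder.
    public static void UnmapKeepReservation(void* baseAddress)
        => UnmapViewOfFile2((IntPtr)(-1), baseAddress, MEM_PRESERVE_PLACEHOLDER);
}
```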

Utilizing this, we can ensure that we get an AV on use-after-dispose. Provided we keep track of all our reserved address regions, we can register a global exception-handling routine at the OS API level that converts any AV in those regions into some kind of ObjectDisposedException or similar.

In my scenario, I achieved this by registering a Vectored Exception Handler (see AddVectoredExceptionHandler) that runs before the one registered by the CLR, and by modifying the stack and instruction pointers such that on continuation, the thread believes that by accessing the memory it called a well-defined method and now has to throw a managed exception (imagine my amazement when this actually yielded the made-up stack trace).
Yes, this relies on undefined internals, is highly unportable, and besides that, it's just evil. Such things should be done by the CLR and only there. So it should be seen as a proof of concept.
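
Just the registration and filtering part of that hack might look roughly like this (a 64-bit-only sketch; the actual context manipulation that redirects the thread into a throwing stub is deliberately omitted, and `IsInReservedRegion` is a hypothetical lookup into the tracked regions):

```csharp
using System;
using System.Runtime.InteropServices;

internal static unsafe class AccessViolationFilter
{
    private const uint EXCEPTION_ACCESS_VIOLATION = 0xC0000005;
    private const int EXCEPTION_CONTINUE_SEARCH = 0;

    [StructLayout(LayoutKind.Sequential)]
    private unsafe struct EXCEPTION_RECORD
    {
        public uint ExceptionCode;
        public uint ExceptionFlags;
        public IntPtr ExceptionRecord;
        public IntPtr ExceptionAddress;
        public uint NumberParameters;
        public fixed ulong ExceptionInformation[15]; // for an AV, [1] is the faulting address
    }

    [StructLayout(LayoutKind.Sequential)]
    private unsafe struct EXCEPTION_POINTERS
    {
        public EXCEPTION_RECORD* ExceptionRecord;
        public IntPtr ContextRecord;
    }

    private delegate int VectoredHandler(EXCEPTION_POINTERS* info);

    [DllImport("kernel32.dll")]
    private static extern IntPtr AddVectoredExceptionHandler(uint first, VectoredHandler handler);

    private static VectoredHandler? _handler; // kept alive so the marshaled delegate isn't collected

    public static void Install()
    {
        _handler = OnException;
        AddVectoredExceptionHandler(1, _handler); // 1 = run before other handlers (incl. the CLR's)
    }

    private static int OnException(EXCEPTION_POINTERS* info)
    {
        EXCEPTION_RECORD* rec = info->ExceptionRecord;
        if (rec->ExceptionCode == EXCEPTION_ACCESS_VIOLATION &&
            IsInReservedRegion(rec->ExceptionInformation[1]))
        {
            // Here the proof of concept patches the thread context so that execution
            // resumes in a stub that throws an ObjectDisposedException-like exception.
        }
        return EXCEPTION_CONTINUE_SEARCH;
    }

    private static bool IsInReservedRegion(ulong faultingAddress) => false; // hypothetical region lookup
}
```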

What's Missing?

Reference Escaping

After making sure that it's safe to access the address range as long as my custom file-mapping object has not been collected, I could simply engineer an unsafe Block base class that provides (protected) access to some (readonly) ref TStruct for a blittable block-layout struct, and our beloved Span<T> for flexibly sized data regions. These Blocks then simply reference the file-mapping object so they do not outlive it. Still, as soon as we hand out a Span<T> or a managed pointer, we have to ensure it is not passed somewhere else, residing on some thread's stack even after our file-mapping object has been collected and jeopardizing the laboriously constructed safety guarantees.
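
For illustration, such a Block base class could look roughly like this, where `FileMapping` stands in for the hypothetical custom file-mapping object and `BaseAddress` for its mapped base pointer:

```csharp
using System;
using System.Runtime.CompilerServices;

// Sketch of the unsafe Block base class: derived blocks expose typed, read-only
// views into the mapping, and the stored reference keeps the mapping object alive.
public abstract unsafe class Block
{
    private readonly FileMapping _mapping;  // hypothetical custom file-mapping object
    private readonly byte* _blockStart;

    protected Block(FileMapping mapping, long offsetInFile)
    {
        _mapping = mapping;
        _blockStart = mapping.BaseAddress + offsetInFile;
    }

    protected FileMapping Mapping => _mapping;

    // Fixed header fields, described by a blittable layout struct.
    protected ref readonly THeader Header<THeader>() where THeader : unmanaged
        => ref Unsafe.AsRef<THeader>(_blockStart);

    // Flexibly sized data pool following the header.
    protected ReadOnlySpan<T> Data<T>(int byteOffset, int count) where T : unmanaged
        => new ReadOnlySpan<T>(_blockStart + byteOffset, count);
}
```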
An (in my view) ideal solution needs support in the CLR: the feature of accounting some address space to the keep-alive region of an object, as proposed in #37227 (comment), closes this last reference-tracking gap.
The additional computational effort for the GC shouldn't be noticeable, since precise references can be resolved by some kind of hashtable lookup, while interior-pointer lookups require a tree lookup (I've heard of something called a brick table, but unfortunately I am far from a GC expert). So in principle, additional regions should be integrable into the existing data structures for regular objects on the managed heap. The effort required to change the GC code is much more of a concern.

Partial references

Until now, I have kept quiet about another clash with existing safety assumptions:
Normally, a managed reference (or a Span<T>, which can be seen as a set of the former) is expected to simply yield a value when read (or quietly take a value when written to). Such a disposable mapping may instead throw some SorryYoureTooLateException on each access. In my use case, this was exactly the intended behavior. But just handing out a Span<T> and letting the developer call any third-party method with it may lead to surprising behavior. Note that this issue is not artificially created by my approach alone; any I/O error may lead to an AV, see #24767.

So what we actually have is some kind of partial reference (imagine partial ref), which may either

  • behave 100% safely, or
  • yield a managed exception on access.

What for?

The use of ref enables comfortable and efficient access to memory regions, especially when combined with blittable structs that define the layout of a particular region. To remain safe, a weakened version that makes the intention explicit and creates an intentional incompatibility with the unrestricted version would be necessary. This incompatibility is unidirectional, of course, similar to how every totally defined function is also partially defined. A partial version of Span<T> that indexes to a partial ref would complement such usage nicely.

Proposal

What I'm roughly outlining here is an alternative to the well-known MemoryMappedFile class, implemented in a much more platform-independent manner than my proof of concept, with CLR support. Note that even without the concept of partial references or any language features in that direction, a reserving MemoryMappedFile could still benefit strongly from performant safe access, since the *ViewAccessor object would not have to lock a SafeHandle on every access.
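
For contrast, the closest one can get today is the raw-pointer escape hatch below (these are real APIs, but inherently unsafe, because nothing ties the lifetime of the pointer or of spans derived from it to the accessor). The file name is a placeholder, the cast assumes the file fits into an int-sized span, and the project needs AllowUnsafeBlocks:

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

using var mmf = MemoryMappedFile.CreateFromFile(
    "measurement.dat", FileMode.Open, mapName: null, capacity: 0, MemoryMappedFileAccess.Read);
using var accessor = mmf.CreateViewAccessor(0, 0, MemoryMappedFileAccess.Read);

unsafe
{
    byte* ptr = null;
    accessor.SafeMemoryMappedViewHandle.AcquirePointer(ref ptr);
    try
    {
        // Spans built from 'ptr' can freely escape this scope and outlive the accessor.
        var view = new ReadOnlySpan<byte>(ptr, (int)accessor.Capacity);
        // ... parse blocks out of 'view' ...
    }
    finally
    {
        accessor.SafeMemoryMappedViewHandle.ReleasePointer();
    }
}
```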

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Aug 13, 2021
@ghost

ghost commented Aug 13, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.


@adamsitnik adamsitnik added this to the Future milestone Aug 13, 2021
@adamsitnik adamsitnik added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Aug 13, 2021
@adamsitnik
Member

@jkotas @stephentoub dear architects, what are your thoughts on this?

@jkotas
Member

jkotas commented Aug 13, 2021

@ChristophTF Nice write up!

what are your thoughts on this?

This is combining ideas discussed in #37227 (comment) and #24767 to provide safe and performant accessor for memory mapped files.

I agree that it would work, with the performance caveat mentioned (the virtual address space would be released only after the GC runs). Also, it assumes that the platform supports handling of memory faults, which is not the case on some Xamarin platforms (Apple device OSes).

It is a domain-specific solution to the general explicit lifetime tracking safety problem that we have e.g. with ArrayPool memory. If we ever decide to tackle the general explicit lifetime tracking safety problem, it would likely become redundant.

Partial references

Introducing a new type of reference for this niche purpose is a non-starter. It would be very expensive complexity-wise. Rather, the callback from #24767 would be allowed to throw whatever exception it decides to throw for accesses through regular references. Or an attempt to access unmapped memory would be an unrecoverable hard fault, which would also address the concern about supporting this on Xamarin platforms.
