[API Proposal]: Extend Memory Mapped Files support #59776

Open
msedi opened this issue Sep 29, 2021 · 4 comments
Labels
api-suggestion (Early API idea and discussion, it is NOT ready for implementation), area-System.IO, needs-further-triage (Issue has been initially triaged, but needs deeper consideration or reconsideration)


msedi commented Sep 29, 2021

Background and motivation

The MemoryMappedFile API is great, but it doesn't expose some features that are available in the Win32 API.
I understand that other platforms are supported as well, so it would be nice to find a common way to extend the API.

The suggestions are:

Enable FileOptions
MemoryMappedFile.CreateFromFile currently has no overload that accepts FileOptions. Some of the FileOptions values may be useless here, others are not (e.g. FileOptions.DeleteOnClose). The existing MemoryMappedFileOptions enum could be extended so that the options are abstracted rather than exposing FileOptions directly.
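
For reference, a minimal sketch of how DeleteOnClose can be approximated with the existing API today: the FileStream is opened with FileOptions.DeleteOnClose and then handed to the CreateFromFile overload that takes a stream. The class name ScratchMapping, the helper Create and the scratch file name are made up for illustration. This does not address the flush-on-dispose cost discussed in this issue; it only shows the current workaround.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class ScratchMapping
{
    // Hypothetical helper: maps a new file that is deleted once all handles are closed.
    public static MemoryMappedFile Create(string path, long capacity)
    {
        // FileOptions.DeleteOnClose is accepted by FileStream, not by MemoryMappedFile.
        var fs = new FileStream(path, FileMode.CreateNew, FileAccess.ReadWrite,
                                FileShare.None, 4096, FileOptions.DeleteOnClose);
        try
        {
            // leaveOpen = false: disposing the MemoryMappedFile also disposes the stream.
            return MemoryMappedFile.CreateFromFile(fs, null, capacity,
                MemoryMappedFileAccess.ReadWrite, HandleInheritability.None, false);
        }
        catch
        {
            fs.Dispose();
            throw;
        }
    }

    static void Main()
    {
        // Scratch file name chosen only for this example.
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".dat");

        using (var mm = Create(path, 10000))
        using (var va = mm.CreateViewAccessor(0, 10000))
        {
            va.Write(0, 42);
            Console.WriteLine(va.ReadInt32(0));
        }

        // After the view, the mapping and the stream are disposed, the file should be gone.
        Console.WriteLine(File.Exists(path));
    }
}
```

The drawback is that callers have to manage the FileStream themselves, which is what a DeleteOnClose value in MemoryMappedFileOptions would hide.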

Large Page Support
It is not possible to use large pages (SEC_LARGE_PAGES). From the Win32 documentation it looks like they are only supported for mappings backed by the system paging file and not for file-backed mappings, but I don't have much experience with that.

More control over memory
It's currently not possible to invalidate memory (DiscardVirtualMemory) so that the memory manager can ignore these areas and will not write them back to the paging file. It is also not possible to mark pages as "not in use" (VirtualUnlock) so that the memory manager can page them out earlier, or to prefetch pages (PrefetchVirtualMemory).

Better backing file control
Some options seem to allow for better performance, although I do not have good benchmarks. SetFileValidData and sparse file support are the keywords. In my current tests, using sparse files reduced the run time from 150 s to 100 s.

Flush areas
FlushViewOfFile allows more control of which areas are flushed and should be available in MemoryMappedViewAccessor.Flush.
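
As a rough illustration of what a ranged flush could do on Windows, here is a sketch that calls FlushViewOfFile through P/Invoke on a sub-range of an existing view. The class PartialFlush and the method FlushRange are hypothetical names, not part of any existing API, and the sketch is Windows-only by construction. It does not call FlushFileBuffers, so it corresponds to the proposed flushToDisk = false case.

```csharp
using System;
using System.ComponentModel;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class PartialFlush
{
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FlushViewOfFile(IntPtr lpBaseAddress, UIntPtr dwNumberOfBytesToFlush);

    // Hypothetical helper: flushes [offset, offset + length) of the accessor's view.
    public static void FlushRange(MemoryMappedViewAccessor accessor, long offset, long length)
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            throw new PlatformNotSupportedException("FlushViewOfFile is a Windows API.");
        // length must be positive: a length of 0 would flush up to the end of the mapping.
        if (offset < 0 || length <= 0 || offset > accessor.Capacity - length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        SafeMemoryMappedViewHandle handle = accessor.SafeMemoryMappedViewHandle;
        bool addedRef = false;
        try
        {
            // Keep the view alive while its raw address is in use.
            handle.DangerousAddRef(ref addedRef);

            // The accessor's data starts PointerOffset bytes after the mapped base address.
            long address = handle.DangerousGetHandle().ToInt64() + accessor.PointerOffset + offset;

            if (!FlushViewOfFile(new IntPtr(address), new UIntPtr((ulong)length)))
                throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        finally
        {
            if (addedRef) handle.DangerousRelease();
        }
    }
}
```

A call such as PartialFlush.FlushRange(va, 4096, 4096) would then ask the memory manager to write only that range instead of the whole view, which is what MemoryMappedViewAccessor.Flush() does today.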

Do not flush the view on disposal
Currently the FileStream is flushed when the memory mapped structures are disposed, which causes an enormous performance drop when the file is opened with FileOptions.DeleteOnClose. Since the file is deleted on disposal, it doesn't really make sense to flush it to the backing file.

Control over the working set
While working with memory mapped files I used a tremendous amount of memory and very quickly reached the point where physical memory was exhausted and the memory manager started paging the data out to the pagefile. This behavior is OK in principle and depends on the OS (I was told that Linux handles memory mapped files better?), but paging started far too late, so I built my own "unmanaged GC" that watches over the memory mapped files on a separate thread. The problem was that even after I unlocked a memory region (VirtualUnlock) the working set was still high, so I needed to force a flush of the working set. All of the Virtual* methods are non-blocking, though, and I haven't found a good way to wait until the working set has shrunk to a better level. Calling EmptyWorkingSet is probably not the best solution here (and it is also non-blocking), so other approaches would be welcome. I also assume that putting EmptyWorkingSet into the API proposal would cause unwanted effects, but I have listed it here for discussion.

It would be good if the API methods were abstracted somehow so that the underlying OS API is not exposed.
Discussion of this proposal is of course welcome.

Also, I'm not an expert in memory mapped files.

API Proposal

namespace System.IO.MemoryMappedFiles
{
    // With this approach nothing has to be changed on the existing factory routines (to be discussed).
    [Flags]
    public enum MemoryMappedFileOptions
    {
        None = 0,
        // Proposal: create a MemoryMappedFile with the DeleteOnClose flag so that the Dispose routines can respect it.
        DeleteOnClose = 0x2,
        // Additionally, add an enum value to support large pages.
        LargePageSupport = 0x4,
        // Allocates a sparse file
        Sparse = 0x8,
        // Set the valid data size
        ValidData = 0x10,
        DelayAllocatePages = 0x4000000
    }
    }

    public class MemoryMappedViewAccessor
    {
        // Flushes the given region (a multiple of the page size, of course).
        // By default this is only a hint to the memory manager; with flushToDisk the write to disk is enforced.
        // see FlushViewOfFile and FlushFileBuffers
        public void Flush(nint start, nint length, bool flushToDisk = false);

        // Invalidates the given region (a multiple of the page size, of course) and makes it available to the memory manager.
        // see DiscardVirtualMemory
        public void Discard(nint start, nint length);

        // Unlocks the given memory region and tells the memory manager that the region can be paged out.
        // see VirtualUnlock
        public void Free(nint start, nint length);

        // Advises the memory manager to prefetch the given memory region.
        // see PrefetchVirtualMemory
        public void Prefetch(nint start, nint length);

        // I'm not sure if the following two methods make sense with memory mapped files.
        public void Offer();
        public void Reclaim();
    }
}
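
The "multiple of the page size" remark in the comments above leaves open how an arbitrary (start, length) pair would be rounded. A small sketch of the two obvious choices follows; PageRange, Expand and Shrink are hypothetical names, and the sketch assumes non-negative inputs and offsets measured from a page-aligned base. Rounding outward seems natural for Flush and Prefetch, while Discard probably has to round inward so that no bytes outside the requested range are invalidated.

```csharp
using System;

static class PageRange
{
    // Smallest page-aligned range that contains [start, start + length).
    public static (long Start, long Length) Expand(long start, long length, long pageSize)
    {
        long alignedStart = start - (start % pageSize);                           // round down
        long alignedEnd = (start + length + pageSize - 1) / pageSize * pageSize;  // round up
        return (alignedStart, alignedEnd - alignedStart);
    }

    // Largest page-aligned range fully inside [start, start + length).
    // Length is 0 when no whole page fits into the requested range.
    public static (long Start, long Length) Shrink(long start, long length, long pageSize)
    {
        long alignedStart = (start + pageSize - 1) / pageSize * pageSize;  // round up
        long alignedEnd = (start + length) / pageSize * pageSize;          // round down
        return (alignedStart, Math.Max(0, alignedEnd - alignedStart));
    }

    static void Main()
    {
        long pageSize = Environment.SystemPageSize;
        Console.WriteLine(Expand(1000, 9000, pageSize));
        Console.WriteLine(Shrink(1000, 9000, pageSize));
    }
}
```

With a 4096-byte page, Expand(1000, 9000, 4096) yields (0, 12288) and Shrink(1000, 9000, 4096) yields (4096, 4096), so the same request touches three pages when flushed but only one page when discarded.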

API Usage

using var mm = MemoryMappedFile.CreateFromFile("file.dat", FileMode.CreateNew, null, 10000, MemoryMappedFileAccess.ReadWrite, MemoryMappedFileOptions.DeleteOnClose);

using var va = mm.CreateViewAccessor();

// Prefetch all 10000 bytes
va.Prefetch(0, 10000);

// Discard all 10000 bytes and tell the memory manager that no flush on this area is needed
va.Discard(0, 10000);

// Free 9000 bytes and tell the memory manager that this region is currently not in use and can be paged to disk if needed
va.Free(1000, 9000);

// Flush all 10000 bytes and wait until they are written to disk
va.Flush(0, 10000, true);




Risks

The main risk is that, since memory mapped files need to be platform independent, proper equivalents have to be found on the other operating systems.

There are surely things I didn't get right, so please feel free to correct my mistakes ;-)

msedi added the api-suggestion label Sep 29, 2021
dotnet-issue-labeler bot added the area-System.IO and untriaged labels Sep 29, 2021


ghost commented Sep 29, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.


jeffhandley added this to the Future milestone Oct 4, 2021
jeffhandley added the needs-further-triage label and removed the untriaged label Oct 4, 2021


msedi commented Nov 5, 2023

Our team would be very interested in further discussion of this topic. Would there be a chance to do so? We could even help with improving this, but we would need some agreement and further discussion first. Since there is some interest in this topic, I'll list related issues here for reference (the list is incomplete): #59606, #57330, #37227, #59405, #62768, #69365, #48793, #941, #24990, #24805. Many of them are still open; many others have been closed but not solved.

I can see that memory mapped files might be a niche topic, but I think there is interest.


Scooletz commented Sep 26, 2024

@jeffhandley Would it be possible to get your insight as an area owner?

@Scooletz

@msedi Would it be useful to have something for madvise with MADV_DONTNEED ("won't need"), even if it were a no-op on Windows? DiscardVirtualMemory has different semantics.
