Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PhysicalMemoryLimit capabilities for /proc/meminfo OS's like Linux. #47783

Merged

Conversation

StephenMolloy
Copy link
Member

Enable MemCache PhysicalMemoryLimit on Linux.

This fix uses procfs instead of System.GC.GetGCMemoryInfo
as was suggested in #24299. The choice to use /proc/meminfo is because
that method should work on pretty much any major Linux distro and it
gives us "accurate" information at the time of query. Using
GetGCMemoryInfo is using potentially stale information, though it might
be worth considering it as a last-ditch fallback for other Unix variants if
procfs isn't available?

Fix #1426

@jkotas
Copy link
Member

jkotas commented Feb 2, 2021

/proc/meminfo won't give you the right limit for containers.

bool availableValid = false;
total = available = 0;

if (File.Exists("/proc/meminfo"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible the file could exist but the current process doesn't have read permission. You should wrap the file read in a try/catch and make it behave the same way as if the file didn't exist.

}

// Other OS's support if they have /proc/meminfo with "MemAvailable"
if (File.Exists("/proc/meminfo"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as earlier one, consider file existing but not being readable

@mconnew
Copy link
Member

mconnew commented Feb 3, 2021

Regarding the comment by @jkotas, when in a container environment there are 2 scenarios. First, no memory quota on the container. In this case you probably still care about the host value here. Second, when you do have a memory quota on the container then you probably don't need to use this feature. This feature is to dynamically respond to other processes on the machine which are consuming memory. In the container model you only have the one process so there's no "other" to consider. In this case you can likely use a fixed size cache.

@StephenMolloy
Copy link
Member Author

/proc/meminfo won't give you the right limit for containers.

@jkotas , how would you suggest handling the container case then? I've been digging through the sources behind GC.GetGCMemoryInfo() and I'm not quite sure the numbers there are consistent for a container scenario either. In GCToOSInterface::GetMemoryStatus, the oddly named restricted_limit parameter is only non-zero when there is not restricted container limit. In which case, lines 1317-1326 calculate straightforward numbers as expected.

But if there is a restricetd limit, it looks like the call to GetAvailablePhysicalMemory on line 1329 will check /proc/meminfo@MemAvailable like we do here, or fall back to sysconf. Either way, this is not a container-aware stat. The call to GetPhysicalMemoryLimit on line 1334 does appear to get a total memory size that is container aware. So it's possible to get an 'available' value that is larger than 'total' and not even get a 'memory_load' calculation as a result.

Later, when gathering these stats into a struct to return to managed land, the memory load bytes field is calculated by multiplying the percentage load by total_physical_memory... which might be container aware? I can't tell for sure. But if it was container aware, then there wouldn't be a need to choose between it and 'heap_hard_limit' for totalAvailableMemoryBytes then.

To summarize, it doesn't look like GC.GetGCMemoryInfo is consistently container aware either. I'm not sure how I would use it in this case. Maybe I'm missing something? Or perhaps you have another solution in mind?

@StephenMolloy
Copy link
Member Author

Then again, as @mconnew mentioned - the original idea behind this monitor was to be a good citizen among other processes on the machine. In the container scenario, a cache of any significant size is likely to be part of the only citizen that matters within that container, so using a container aware monitor to be a good citizen within the container doesn't seem necessary.

On the other hand, using a host-aware monitor might be useful to help the container be a good citizen among other containers/processes on the host - especially if the cache-containing-container was not created with a quota.

@jkotas
Copy link
Member

jkotas commented Feb 13, 2021

the oddly named restricted_limit parameter is only non-zero when there is not restricted container limit.

It is the other way: restricted_limit is true when the container has a limit.

I do not have a strong opinion on this. If you believe that it does not make for this feature to take container limits into account, feel free to ignore my comment. It would be nice to document it somewhere.

@davidfowl
Copy link
Member

Seems like this should be using GC APIs to get memory information. The array pool does a similar thing and responds to memory pressure. Why not do a similar thing here

@StephenMolloy
Copy link
Member Author

Ok, I'm looking at GCHeap::Initialize again, and I see where total_physical_mem is being set near the beginning to either a .Net configured limit, or a os/container-configured limit, or the actual total amount of RAM installed. I'm still having difficulty seeing how I get a percentage load back from GCMemoryInfo though.

Internally the GC is tracking load as a percentage, but before exposing it, it multiplies that by total_physical_mem. That's fine. But how can I know what that value is to undo that calculation? It looks like total_physical_mem was exposed by GCMemoryInfo.TotalAvailableMemoryBytes previously... but now that field will either expose the total_physical_mem or it will expose 75% of that value in a restricted memory scenario. Or thats what it looks like to me in lines 39825-39843. heap_hard_limit will be set if it is configured as a percentage through .Net config, or it will be set to 75% of total memory in a "restricted" scenario. The end result is that I can't reliably use GCMemoryInfo.TotalAvailableMemoryBytes to convert back to an accurate memory load percentage.

A comment by @jkotas from the PR that introduced the hard limit to GCMemoryInfo suggests that in container scenarios, caches should care about this GC hard limit and not the actual memory load relative to the container memory limit. Perhaps that is true when managing array pools, and maybe that is a relevant consideration when the cache tuning being discussed is only concerned about it's own heap usage bumping into limits. But this cache monitor does not serve this purpose. It would rather know the status of memory use in the entire system/container so it can get out of other peoples way - not so it can get out of it's own way. And that's the way this cache has been doing it on Windows for years going back to .Net 4.X. Is there another way to get total_physical_mem from GC that I'm missing?

Base automatically changed from master to main March 1, 2021 09:07
@StephenMolloy
Copy link
Member Author

Updated PR to use GCMemoryInfo on non-Windows. It's not perfect, but the scenarios in which GCMemoryInfo do not match what we do on windows are restricted-memory scenarios... in which case there is not much need for this "good citizen in an environment shared with other concerns" memory monitor. This leaves the complexity of working with /proc/meminfo or sysctl or CLR memory limits to the GC system which has been doing this for years.

GCMemoryInfo memInfo = GC.GetGCMemoryInfo();
if (memInfo.Index <= lastGCIndex)
{
GC.Collect(0); // A quick, ephemeral Gen 0 collection
Copy link
Member

@mconnew mconnew Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using the overload which takes a GCCollectionMode and pass GCCollectionMode.Optimized. This allows the GC to determine when only a trivial amount of allocation has happened since the last GC and to skip it. In other words, if the memory situation hasn't changed an appreciable amount, it skips the GC.

Copy link
Member

@mconnew mconnew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned about forcing GC every time, so I suggested the more polite GC collect api. Not requiring that change, I'll leave it to you to decide.

/* There are sysconf and /proc/meminfo ways to get this information before .Net 3,
* but it is very complicated to do it correctly, especially when accounting for
* container scenarios. The GC does this for us in .Net 3.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, it's .NET Core when talking about <5. Also shouldn't this be 3.1?

@StephenMolloy StephenMolloy requested a review from a team April 2, 2021 19:28
@StephenMolloy StephenMolloy force-pushed the PhysicalMemoryLimit_for_Linux branch from e3cf058 to d242d29 Compare April 2, 2021 19:39
@StephenMolloy StephenMolloy removed the request for review from a team April 2, 2021 19:49
@StephenMolloy StephenMolloy merged commit 3a0e76e into dotnet:main Apr 2, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MemoryCache needs a way to handle physical memory pressure on Linux
5 participants