Add a time-based decay to the linear allocation model. #55174
Conversation
The real-world scenario that motivated this change allocated a huge amount of data once a day, leading to high gen 0 and gen 1 budgets (several GB). After that, there was relatively little activity, causing the gen 1 budget to take several hours to normalize. The new logic discounts the previous desired allocation over a period of 5 minutes.
Tagging subscribers to this area: @dotnet/gc
This addresses issue #52592.
@@ -36767,6 +36765,9 @@ size_t gc_heap::desired_new_allocation (dynamic_data* dd,
     float f = 0;
     size_t max_size = dd_max_size (dd);
     size_t new_allocation = 0;
+    float time_since_previous_collection_secs = (dd_time_clock (dd) - dd_previous_time_clock (dd))*1e-6f;
+
+
Nit: remove one empty line?
Looking at the code, I think I should remove both of the empty lines - no reason to separate the declarations here, I think.
{
    if ((allocation_fraction < 0.95) && (allocation_fraction > 0.0))
    {
        const float decay_time = 5*60.0f; // previous desired allocation expires over 5 minutes
        float decay_factor = (decay_time <= time_since_previous_collection_secs) ?
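The ternary above is cut off in the quoted hunk. A minimal sketch of a plausible completion, assuming the decay factor ramps linearly from 1 (a GC immediately after the previous one) down to 0 once decay_time has elapsed; the function wrapper is illustrative and not part of the PR:

// Illustrative completion only (the committed code may differ): the decay
// factor falls linearly from 1 when a GC happens right after the previous one
// to 0 once five minutes or more have passed since the previous GC of this
// generation.
static float compute_decay_factor (float time_since_previous_collection_secs)
{
    const float decay_time = 5*60.0f; // previous desired allocation expires over 5 minutes
    return (decay_time <= time_since_previous_collection_secs) ?
           0.0f :
           ((decay_time - time_since_previous_collection_secs) / decay_time);
}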
will we build a test scenario in GCPerfSim to validate that the decay has the desired effect?
I will try, but it may be a bit involved.
This seems to be along the same lines as what we are doing for WKS GC. Also, the way the change is written now modifies the existing behavior. It seems like it would be better not to modify the existing behavior when a GC happens soon enough, and to start decaying only when the elapsed time between this GC and the last GC (of that generation) is longer than decay_time (and the longer it has been, the less effect the previous budget should have). What do you think?
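A minimal sketch of the alternative described in the comment above, under two assumptions that are mine rather than the reviewer's: that the existing linear model weights the previous budget by (1 - allocation_fraction), as the later discussion suggests, and that past the grace period the weight shrinks in proportion to the elapsed time. The function and its names are illustrative, not code from gc.cpp:

#include <cstddef>

static size_t blend_with_grace_period (float allocation_fraction,
                                       size_t new_allocation,
                                       size_t previous_desired_allocation,
                                       float elapsed_secs)
{
    const float decay_time = 5*60.0f; // grace period before any discounting starts

    // Weight of the previous budget under the existing linear model (assumed).
    float previous_weight = 1.0f - allocation_fraction;

    if (elapsed_secs > decay_time)
    {
        // Only once the gap between GCs of this generation exceeds decay_time
        // does the previous budget start counting for less; the longer the
        // gap, the smaller its weight becomes.
        previous_weight *= decay_time / elapsed_secs;
    }

    return (size_t)((1.0f - previous_weight) * new_allocation +
                    previous_weight * previous_desired_allocation);
}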
I thought about making the decay_time generation aware along the lines you suggested, but didn't see a good reason to do it. Gen 1 is the only generation where we routinely collect without having exhausted the budget. Regarding your remark about modifying existing behavior, I did this on purpose so that a case where gen 1 GCs happen often enough, but very early (i.e. only exhausting a tiny fraction of the gen 1 budget), wouldn't lead to us taking a long time to ramp down the gen 1 budget. As it happens, a decay time of 5 minutes is short enough to fix the test case (which does about 6 gen 1 GCs in the first hour after loading the big chunk of data), but it seems unwise to rely on this - with more load, the interval between gen 1 GCs may of course drop below the decay_time.
In a high memory load situation we could also collect gen 2 often without having exhausted the budget, but you could argue that the next gen 2 GC will also happen without having exhausted the budget for the same reason, and gen 2's surv rate is usually very high anyway.
It's always a worry that when you change the existing perf behavior you will regress someone, thus the concern. Also, the idea of the fraction is that we take some percentage from the current budget and some from the previous budget, but now with decay_factor we are simply discounting some of the budget. Shouldn't this be

float prev_allocation_factor = (1.0 - allocation_fraction) * decay_factor;
new_allocation = (size_t)((1.0 - prev_allocation_factor) * new_allocation +

if we want to make the previous budget count for less based on the time factor?

I think you are absolutely right. Will change the code accordingly.
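For illustration, here is a self-contained sketch of the blend being agreed on above. The second quoted line is truncated; completing it with the previous desired allocation, and the function and harness around it, are assumptions rather than the final gc.cpp change:

#include <cstddef>
#include <cstdio>

// Sketch of the blend discussed above (names are illustrative): the previous
// desired allocation is weighted by (1 - allocation_fraction), as in the
// existing linear model, and that weight is additionally scaled by
// decay_factor so it fades out over decay_time.
static size_t blend_budgets (float allocation_fraction,
                             float decay_factor,
                             size_t new_allocation,
                             size_t previous_desired_allocation)
{
    float prev_allocation_factor = (1.0f - allocation_fraction) * decay_factor;
    return (size_t)((1.0f - prev_allocation_factor) * new_allocation +
                    prev_allocation_factor * previous_desired_allocation);
}

int main ()
{
    // Example numbers in the spirit of the PR description: a previous gen 1
    // budget of ~3 GB, very little new activity, and a gen 1 GC four minutes
    // after the previous one (so decay_factor would be about 0.2 under a
    // linear 5-minute decay).
    size_t previous_budget     = (size_t)3 << 30;  // ~3 GB
    size_t fresh_budget        = (size_t)32 << 20; // ~32 MB from recent survival
    float  allocation_fraction = 0.05f;            // only a sliver of the budget was used
    float  decay_factor        = 0.2f;

    size_t blended = blend_budgets (allocation_fraction, decay_factor,
                                    fresh_budget, previous_budget);

    // Prints roughly 600 MB: far below the ~2.85 GB this same blend gives with
    // decay_factor = 1, and it drops to the ~32 MB fresh budget once
    // decay_factor reaches 0.
    printf ("blended budget: %zu MB\n", blended >> 20);
    return 0;
}

With decay_factor fixed at 1 the previous budget keeps its full (1 - allocation_fraction) weight, which is presumably the pre-change behavior the earlier comments refer to; the time factor only changes the outcome as the previous budget ages.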
…out in code review.
LGTM!