Update handling of limited register during consecutive registers allocation #84588
If there are no free or busy candidates for refPositions that need consecutive registers, keep at least one range of registers in the candidates so that allocation is possible.
Initially, we just returned RBM_NONE when we did not find any freeCandidates; instead, we should check whether there are any busy candidates that we can try.
…osition
If consecutive registers are being allocated, other refPositions that are live at the same location might not have enough registers left to be assigned because all registers are busy. As such, introduce a way to track whether we are assigning at the location of consecutive registers and, if so, do not take the jitstressregs limit into account.
For consecutive registers, also include the register count needed for the "minimum register requirement" when limiting the registers.
With other conditions in place, no need to have LsraLimitFPSetForConsecutive.
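The fallback described in the first two commit notes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in lsra.cpp: `RegMask`, `consecutiveStarts`, and `pickConsecutiveStarts` are hypothetical names, the mask is a plain 64-bit integer, and register ranges are assumed not to wrap around.

```cpp
#include <cstdint>

using RegMask = std::uint64_t; // hypothetical stand-in for the JIT's regMaskTP

// Returns a mask whose set bits mark every register r such that
// r, r+1, ..., r+count-1 are all present in `candidates` (no wrap-around).
// Assumes count <= 64.
RegMask consecutiveStarts(RegMask candidates, unsigned count)
{
    RegMask starts = candidates;
    for (unsigned i = 1; i < count; i++)
    {
        // Bit r of (candidates >> i) is bit r+i of candidates.
        starts &= (candidates >> i);
    }
    return starts;
}

// Prefer ranges made only of free registers; if none exist, also consider
// busy registers instead of giving up with an empty mask.
RegMask pickConsecutiveStarts(RegMask freeCandidates, RegMask busyCandidates, unsigned count)
{
    RegMask starts = consecutiveStarts(freeCandidates, count);
    if (starts == 0)
    {
        starts = consecutiveStarts(freeCandidates | busyCandidates, count);
    }
    return starts;
}
```

With free registers {0, 2} and busy register {1}, a request for two consecutive registers finds no free-only range, so the sketch falls back to ranges that include the busy register.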
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Couple questions
src/coreclr/jit/lsra.cpp
Outdated
@@ -12195,6 +12241,10 @@ regMaskTP LinearScan::RegisterSelection::select(Interval* currentInterval,
regMaskTP busyRegs = linearScan->regsBusyUntilKill | linearScan->regsInUseThisLocation;
candidates &= ~busyRegs;

#ifdef DEBUG
inUseOrBusyRegsMask |= ~busyRegs;
Shouldn't this be:
- inUseOrBusyRegsMask |= ~busyRegs;
+ inUseOrBusyRegsMask |= busyRegs;
?
Below, I am using this as registerAssignment & inUseOrBusyRegsMask.
Should there be a separate one for just inUseRegsMask to help differentiate?
It's confusing to see all the busy regs being cleared here, given the local name.
I have incorporated Bruce's feedback.
src/coreclr/jit/lsra.cpp
Outdated
@@ -12210,6 +12260,9 @@ regMaskTP LinearScan::RegisterSelection::select(Interval* currentInterval,
(refPosition->delayRegFree && (checkConflictLocation == (refPosition->nodeLocation + 1))))
{
candidates &= ~checkConflictBit;
#ifdef DEBUG
inUseOrBusyRegsMask |= ~checkConflictBit;
- inUseOrBusyRegsMask |= ~checkConflictBit;
+ inUseOrBusyRegsMask |= checkConflictBit;
?
Below, I am using this as registerAssignment & inUseOrBusyRegsMask.
But what if you | together two different conflict bits? Then all the bits will be set and registerAssignment & inUseOrBusyRegsMask will do nothing. Is that right? Don't you need to | together all the bits first and then registerAssignment & ~inUseOrBusyRegsMask? At the very least, inUseOrBusyRegsMask as a name doesn't make sense since in your usage it's (kind of) the opposite of that.
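To make the concern concrete, here is a small standalone sketch of the two patterns. It is not the PR's code: `RegMask`, `filterByComplements`, and `filterByUnion` are hypothetical names, and the conflict masks are assumed to be two distinct single-bit values.

```cpp
#include <cstdint>

using RegMask = std::uint64_t; // hypothetical stand-in for regMaskTP

// Pattern under review: accumulate complements, then AND with the result.
RegMask filterByComplements(RegMask registerAssignment, RegMask conflictA, RegMask conflictB)
{
    RegMask mask = 0;
    mask |= ~conflictA;
    mask |= ~conflictB; // for two distinct single bits, mask is now all ones
    return registerAssignment & mask; // so this AND filters nothing
}

// Alternative: accumulate the conflicting bits, then complement once at the use.
RegMask filterByUnion(RegMask registerAssignment, RegMask conflictA, RegMask conflictB)
{
    RegMask inUseOrBusy = 0;
    inUseOrBusy |= conflictA;
    inUseOrBusy |= conflictB;
    return registerAssignment & ~inUseOrBusy; // both conflicting registers removed
}
```

With registerAssignment = 0xF and conflict bits 0x1 and 0x2, the first function returns 0xF unchanged, while the second returns 0xC. The second form also keeps the variable name consistent with what it holds.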
That's true.
Seems build failures were fixed by #84595.
Isn't this PR supposed to fix this failure in particular?
Ah, meant to have this comment for the other PR. Will update my comment.
As with #84634 (comment), it looks like the SPMI asserts that are still present are from This PR, compared to a previous PR has "half" the asserts with older PRs listing failures for both
Earlier, I added a limit mask to include a few registers in case we are allocating during consecutive registers. However, the mask was not sufficient, and we were still running into situations where we would not have a register to allocate to one of the consecutive registers or to a refPosition that is live at the same location. Updated the handling of such cases:
- … minRegCount, take into account the refpositions for consecutive registers.
- … LsraLimitFPSetForConsecutive.

Fixes: #84536