Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid long busy-waiting between hijack retries. #103212

Merged
merged 3 commits into from
Jun 28, 2024
Merged

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Jun 9, 2024

Fixes: #102832

  • refactors the pause routine into minipal_microdelay - to have the same implementation for CoreCLR and NativeAOT
  • use sub-millisecond sleep API, when available, for long(ish) pauses.
  • couple minor fixes around sending suspension signals.
    (not responsible for the regression, just possible corner cases noticed while investigating)

@VSadov VSadov added runtime-coreclr specific to the CoreCLR runtime and removed area-PAL-coreclr labels Jun 9, 2024
return;
}
#else
if (usecs > 10)
Copy link
Member Author

@VSadov VSadov Jun 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual fix. When we need to pause for longer than 10 microseconds, we sleep instead of spinning, thus freeing compute resources to other threads (including ones that we want to progress towards suspending).

Copy link
Member Author

@VSadov VSadov Jun 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 is a somewhat arbitrary number in the order of context switch cost. (2-10 microseconds).
We are not very sensitive to the exact value. It just means "too short for the sleep to have any advantages".

Copy link
Member Author

@VSadov VSadov Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other reason for 10 is that the default timer slack on Linux is 50 microseconds. So the shortest sleep could be on average 25 microseconds anyways if our wakeups are bunched up with other sleepers.
Thus 10 seems like a reasonable threshold to use sleep.

@VSadov
Copy link
Member Author

VSadov commented Jun 9, 2024

Performance impact on Stage1 ASP.Net benchmark.
The change recovers the loss form regression.

======================================================== arm64

======= before regression

| First Request (ms)      | 171        |
| Requests/sec            | 760,970    |
| Requests                | 11,490,582 |
| Mean latency (ms)       | 0.78       |
| Max latency (ms)        | 38.00      |


======== after regression

| First Request (ms)      | 164       |
| Requests/sec            | 627,518   |
| Requests                | 9,458,579 |
| Mean latency (ms)       | 1.65      |
| Max latency (ms)        | 54.40     |


======== after the fix

| First Request (ms)      | 160        |
| Requests/sec            | 770,416    |
| Requests                | 11,632,411 |
| Mean latency (ms)       | 0.78       |
| Max latency (ms)        | 40.54      |

======================================================= x64

======= before regression

| First Request (ms)      | 181        |
| Requests/sec            | 813,308    |
| Requests                | 12,280,780 |
| Mean latency (ms)       | 0.46       |
| Max latency (ms)        | 44.08      |

======== after regression

| First Request (ms)      | 179        |
| Requests/sec            | 771,124    |
| Requests                | 11,643,639 |
| Mean latency (ms)       | 0.56       |
| Max latency (ms)        | 43.84      |

======== after the fix

| Start Time (ms)         | 0          |
| First Request (ms)      | 185        |
| Requests/sec            | 803,472    |
| Requests                | 12,132,099 |
| Mean latency (ms)       | 0.44       |
| Max latency (ms)        | 36.27      |

commands used:

===================  arm64

crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/goldilocks.benchmarks.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml --scenario basicminimalapivanilla --profile arm-lin-28-app --profile amd-lin-load --profile amd-lin2-db --application.environmentVariables DOTNET_GCDynamicAdaptationMode=1 --application.framework net9.0 --application.options.collectCounters true --load.options.reuseBuild true --application.aspNetCoreVersion 9.0.0-preview.5.24256.2 --application.sdkVersion 9.0.100-preview.5.24267.1 --application.runtimeVersion 9.0.0-preview.5.24259.7


crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/goldilocks.benchmarks.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml --scenario basicminimalapivanilla --profile arm-lin-28-app --profile amd-lin-load --profile amd-lin2-db --application.environmentVariables DOTNET_GCDynamicAdaptationMode=1 --application.framework net9.0 --application.options.collectCounters true --load.options.reuseBuild true --application.aspNetCoreVersion 9.0.0-preview.5.24256.2 --application.sdkVersion 9.0.100-preview.5.24267.1 --application.runtimeVersion 9.0.0-preview.5.24260.2


crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/goldilocks.benchmarks.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml --scenario basicminimalapivanilla --profile arm-lin-28-app --profile amd-lin-load --profile amd-lin2-db --application.environmentVariables DOTNET_GCDynamicAdaptationMode=1 --application.framework net9.0 --application.options.collectCounters true --load.options.reuseBuild true --application.aspNetCoreVersion 9.0.0-preview.5.24256.2 --application.sdkVersion 9.0.100-preview.5.24267.1 --application.runtimeVersion 9.0.0-preview.5.24260.2   --application.options.outputFiles armFix\libcoreclr.so


=================== x64

crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/goldilocks.benchmarks.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml --scenario basicminimalapivanilla --profile intel-lin-app --profile intel-load-load --profile intel-db-db --application.environmentVariables DOTNET_GCDynamicAdaptationMode=1 --application.framework net9.0 --application.options.collectCounters true --load.options.reuseBuild true --application.aspNetCoreVersion 9.0.0-preview.5.24256.2 --application.sdkVersion 9.0.100-preview.5.24267.1 --application.runtimeVersion 9.0.0-preview.5.24259.7


crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/goldilocks.benchmarks.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml --scenario basicminimalapivanilla --profile intel-lin-app --profile intel-load-load --profile intel-db-db --application.environmentVariables DOTNET_GCDynamicAdaptationMode=1 --application.framework net9.0 --application.options.collectCounters true --load.options.reuseBuild true --application.aspNetCoreVersion 9.0.0-preview.5.24256.2 --application.sdkVersion 9.0.100-preview.5.24267.1 --application.runtimeVersion 9.0.0-preview.5.24260.2


crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/goldilocks.benchmarks.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml --scenario basicminimalapivanilla --profile intel-lin-app --profile intel-load-load --profile intel-db-db --application.environmentVariables DOTNET_GCDynamicAdaptationMode=1 --application.framework net9.0 --application.options.collectCounters true --load.options.reuseBuild true --application.aspNetCoreVersion 9.0.0-preview.5.24256.2 --application.sdkVersion 9.0.100-preview.5.24267.1 --application.runtimeVersion 9.0.0-preview.5.24260.2   --application.options.outputFiles x64Fix\libcoreclr.so

@VSadov VSadov added area-VM-coreclr and removed runtime-coreclr specific to the CoreCLR runtime labels Jun 9, 2024
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

src/native/minipal/time.c Outdated Show resolved Hide resolved
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
@VSadov
Copy link
Member Author

VSadov commented Jun 28, 2024

Thanks!

@VSadov VSadov merged commit 9501cce into dotnet:main Jun 28, 2024
161 checks passed
@VSadov VSadov deleted the microSleep branch June 28, 2024 21:12
@github-actions github-actions bot locked and limited conversation to collaborators Jul 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Performance] Regression in multiple scenarios, Linux, more significant on ARM64
2 participants