[Perf] Linux/arm64: 7 Regressions on 2/23/2024 10:12:07 PM #99121

Open
performanceautofiler bot opened this issue Feb 29, 2024 · 9 comments
Labels: arch-arm64, area-System.Memory, os-linux, Priority:2, runtime-coreclr, tenet-performance, tenet-performance-benchmarks


performanceautofiler bot commented Feb 29, 2024

Run Information

| Name | Value |
| --- | --- |
| Architecture | arm64 |
| OS | ubuntu 22.04 |
| Queue | AmpereUbuntu |
| Baseline | d98af689a245bbc983ea71c52e15ff9cdf376ec7 |
| Compare | 7b54246a7bd6b4ea09895b22ba30e45059fbedb4 |
| Diff | Diff |
| Configs | CompilationMode:tiered, RunKind:micro |

Regressions in System.Memory.Span<Char>

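| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SequenceEqual(Size: 512) | 58.63 ns | 66.41 ns | 1.13 | 0.01 | True | | | |
| EndsWith(Size: 4) | 3.42 ns | 5.25 ns | 1.53 | 0.57 | True | | | |
| EndsWith(Size: 512) | 31.62 ns | 34.48 ns | 1.09 | 0.16 | False | | | |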
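
For readers unfamiliar with the Test/Base column, the following is a minimal sketch of how that ratio relates to the two timings. It assumes the column is simply the test time divided by the baseline time, rounded to two decimals. The `Ratio` helper and the row labels are illustrative and not part of the autofiler tooling. Because the report shows rounded timings, recomputing a ratio from them can differ from the reported value in the last digit.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: Test/Base above 1.00 means the compare build is slower.
static double Ratio(double baselineNs, double testNs) => testNs / baselineNs;

// Timings copied from the regression tables in this report.
var rows = new (string Label, double BaselineNs, double TestNs)[]
{
    ("Span<Char> table, 58.63 ns row", 58.63, 66.41),
    ("Span<Int32> table, 108.09 ns row", 108.09, 127.47),
};

foreach (var (label, baselineNs, testNs) in rows)
{
    double ratio = Ratio(baselineNs, testNs);
    double slowdownPercent = (ratio - 1.0) * 100.0;
    Console.WriteLine(string.Format(
        CultureInfo.InvariantCulture,
        "{0}: Test/Base = {1:F2} (+{2:F1}%)",
        label, ratio, slowdownPercent));
}
// Span<Char> table, 58.63 ns row: Test/Base = 1.13 (+13.3%)
// Span<Int32> table, 108.09 ns row: Test/Base = 1.18 (+17.9%)
```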

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Memory.Span<Char>*'
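
The `--filter` value in the command above is a glob. The sketch below shows which benchmark names such a pattern would select, assuming `*` matches any run of characters and everything else is literal. `MatchesGlob` is a hypothetical helper written for this illustration; the matching rules BenchmarkDotNet actually applies (for example case sensitivity) may differ.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: translate a simple glob ('*' only) into an anchored regex.
static bool MatchesGlob(string name, string glob)
{
    string pattern = "^" + Regex.Escape(glob).Replace("\\*", ".*") + "$";
    return Regex.IsMatch(name, pattern);
}

string filter = "System.Memory.Span<Char>*";

// Benchmark names taken from this report (parameters omitted).
string[] names =
{
    "System.Memory.Span<Char>.SequenceEqual",
    "System.Memory.Span<Char>.EndsWith",
    "System.Memory.Span<Int32>.EndsWith",
};

foreach (string name in names)
{
    Console.WriteLine($"{name}: {MatchesGlob(name, filter)}");
}
// System.Memory.Span<Char>.SequenceEqual: True
// System.Memory.Span<Char>.EndsWith: True
// System.Memory.Span<Int32>.EndsWith: False
```

The pattern is single-quoted in the shell command so that the shell does not interpret `<`, `>`, or `*` itself.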

Payloads

Baseline
Compare

System.Memory.Span<Char>.SequenceEqual(Size: 512)

ETL Files

Histogram

JIT Disasms

System.Memory.Span<Char>.EndsWith(Size: 4)

ETL Files

Histogram

JIT Disasms

System.Memory.Span<Char>.EndsWith(Size: 512)

ETL Files

Histogram

JIT Disasms

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


Regressions in System.Memory.Span<Int32>

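| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EndsWith(Size: 512) | 57.88 ns | 64.75 ns | 1.12 | 0.03 | True | | | |
| StartsWith(Size: 512) | 56.57 ns | 64.96 ns | 1.15 | 0.01 | True | | | |
| SequenceEqual(Size: 512) | 108.09 ns | 127.47 ns | 1.18 | 0.01 | True | | | |
| EndsWith(Size: 4) | 3.62 ns | 5.11 ns | 1.41 | 0.55 | True | | | |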

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Memory.Span<Int32>*'

Payloads

Baseline
Compare

System.Memory.Span<Int32>.EndsWith(Size: 512)

ETL Files

Histogram

JIT Disasms

System.Memory.Span<Int32>.StartsWith(Size: 512)

ETL Files

Histogram

JIT Disasms

System.Memory.Span<Int32>.SequenceEqual(Size: 512)

ETL Files

Histogram

JIT Disasms

System.Memory.Span<Int32>.EndsWith(Size: 4)

ETL Files

Histogram

JIT Disasms


performanceautofiler bot added the arch-arm64, os-linux, runtime-coreclr, and untriaged labels Feb 29, 2024
EgorBo transferred this issue from dotnet/perf-autofiling-issues Feb 29, 2024

ghost commented Feb 29, 2024

Tagging subscribers to this area: @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: -
Labels:

arch-arm64, area-System.Memory, os-linux, untriaged, runtime-coreclr

Milestone: -

Member

EgorBo commented Feb 29, 2024

Looks to be #98700

EgorBo self-assigned this Feb 29, 2024
EgorBo added the tenet-performance and tenet-performance-benchmarks labels and removed the untriaged label Feb 29, 2024
stephentoub added this to the 9.0.0 milestone Jul 22, 2024
EgorBo added the Priority:2 label Jul 22, 2024
Member

EgorBo commented Jul 26, 2024

@EgorBot -arm64 -commit 973ceee vs previous --disasm --envvars "DOTNET_JitDisasm:SequenceEqual"

using System;
using System.Linq;
using BenchmarkDotNet.Attributes;

// This class shadows System.Span<T>, so the BCL type is fully qualified below.
[GenericTypeArguments(typeof(int))]
public class Span<T>
    where T : struct, IComparable<T>, IEquatable<T>
{
    [Params(512)]
    public int Size;

    // Only _array and _same are used by this reduced repro.
    private T[] _array, _same, _emptyWithSingleValue;
    private T[] _fourValues, _fiveValues;
    private T _notDefaultValue;

    [GlobalSetup]
    public void Setup()
    {
        T[] array = new T[Size * 2];
        _array = array.Take(Size).ToArray();
        // Separate copy with identical contents, so the comparison scans all Size elements.
        _same = _array.ToArray();
    }

    [Benchmark]
    public bool SequenceEqual() => new System.Span<T>(_array)
        .SequenceEqual(new ReadOnlySpan<T>(_same));
}
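
As a quick sanity check of what the benchmark body computes, the following minimal console sketch runs the same call outside BenchmarkDotNet. It assumes `int` elements and a size of 512, mirroring the attributes above. It does not measure anything and only confirms the expected results for equal and unequal inputs.

```csharp
using System;

const int size = 512;

// Two separate arrays with identical (all-zero) contents, as in Setup() above.
int[] array = new int[size];
int[] same = (int[])array.Clone();

bool equalResult = new Span<int>(array).SequenceEqual(new ReadOnlySpan<int>(same));
Console.WriteLine(equalResult); // True

// Changing the last element forces a mismatch at the very end of the scan.
same[size - 1] = 1;
bool differentResult = new Span<int>(array).SequenceEqual(new ReadOnlySpan<int>(same));
Console.WriteLine(differentResult); // False
```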


EgorBot commented Jul 26, 2024

Benchmark results on Arm64
BenchmarkDotNet v0.13.12, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Unknown processor
  Job-HMJFUN : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
  Job-IWZNQY : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
EnvironmentVariables=DOTNET_JitDisasm=SequenceEqual
| Method | Toolchain | Size | Mean | Error | Ratio | Code Size |
| --- | --- | --- | --- | --- | --- | --- |
| SequenceEqual | Main | 512 | 127.5 ns | 0.02 ns | 1.00 | 72 B |
| SequenceEqual | PR | 512 | 109.2 ns | 0.00 ns | 0.86 | 444 B |

BDN_Artifacts.zip

Member

EgorBo commented Aug 1, 2024

It turns out to be a tail-call + special intrinsic problem. We have an issue for it somewhere; I'll take a look in 10.0.
