Regression in File.PreallocationSize #59705

Closed
performanceautofiler bot opened this issue Sep 28, 2021 · 49 comments
Labels: area-System.IO, os-linux, tenet-performance

@performanceautofiler

Run Information

Architecture x64
OS ubuntu 18.04
Baseline a8c2d1eae1726b1e6ec50716b7d4c34bc21ba96e
Compare ea062deec80053b822734802b40642bea36bed33

Regressions in System.IO.Tests.Perf_FileStream

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
|---|---:|---:|---:|---:|---|---|---|---|---|---|
| Write_NoBuffering_PreallocationSize - Duration of single invocation | 34.65 ms | 41.07 ms | 1.19 | 0.05 | True | | | | | |


Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'System.IO.Tests.Perf_FileStream*'

System.IO.Tests.Perf_FileStream.Write_NoBuffering_PreallocationSize(fileSize: 104857600, userBufferSize: 16384, options: None)



### Run Information
Architecture x64
OS ubuntu 18.04
Baseline a8c2d1eae1726b1e6ec50716b7d4c34bc21ba96e
Compare ea062deec80053b822734802b40642bea36bed33

Regressions in System.Text.Json.Tests.Perf_Booleans

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
|---|---:|---:|---:|---:|---|---|---|---|---|---|
| WriteBooleans - Duration of single invocation | 1.23 ms | 1.46 ms | 1.18 | 0.01 | False | | | | | |


Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'System.Text.Json.Tests.Perf_Booleans*'

System.Text.Json.Tests.Perf_Booleans.WriteBooleans(Formatted: False, SkipValidation: True)



@kunalspathak changed the title from "[Perf] Changes at 9/23/2021 4:25:49 PM" to "Regression in File.PreallocationSize" on Sep 28, 2021
@dotnet-issue-labeler (bot) added the untriaged label on Sep 28, 2021
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@kunalspathak transferred this issue from dotnet/perf-autofiling-issues on Sep 28, 2021
@kunalspathak
Member

Introduced by #59338

@kunalspathak
Member

The regression in System.Text.Json.Tests.Perf_Booleans also points to #59338, so please double-check whether that was the cause.

@kunalspathak added the arch-x64, os-linux, tenet-performance, and tenet-performance-benchmarks labels on Sep 28, 2021
@kunalspathak
Member

alpine regression - dotnet/perf-autofiling-issues#1590

@stephentoub
Member

cc: @tmds

@tmds
Member

tmds commented Sep 29, 2021

> Introduced by #59338

The PR made a functional change in preallocationSize and that impacts the Write_NoBuffering_PreallocationSize benchmark.
The functional change has the desired behavior.

> The regression in System.Text.Json.Tests.Perf_Booleans also points to #59338, so please double-check whether that was the cause.

Perf_Booleans doesn't seem to use preallocationSize, so I don't think it has the same root cause.

@stephentoub
Member

> The PR made a functional change in preallocationSize and that impacts Write_NoBuffering_PreallocationSize benchmark.

Do we know why it had such an impact? Is the syscall being used now 20% slower?

@tmds
Member

tmds commented Sep 29, 2021

> Do we know why it had such an impact? Is the syscall being used now 20% slower?

The posix_fallocate and fallocate functions both use the fallocate syscall. The difference is that we're now calling it with the FALLOC_FL_KEEP_SIZE flag to preserve the length. In theory, this means the kernel has less work to do, since it doesn't need to care about 'filling the space', but that is not what the benchmark is telling us.

I don't think there is anything we can do about improving the runtime implementation.

For the benchmark, it uses tmpfs, though for preallocation it would be interesting to benchmark on an actual disk. I'm not sure how stable the results would be, but maybe the benchmarking framework accounts for some variance and outliers.

I'll run the benchmark locally and see if I learn something from that. I may not get to it immediately.
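
The difference between the two calls described above can be observed from user space. The following is a minimal sketch, assuming 64-bit Linux, a libc that exports `fallocate`, and a file system that supports it; the helper names are hypothetical, and `ctypes` is used only to reach the raw call. It is not the runtime's implementation.

```python
import ctypes
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01  # value defined in <linux/falloc.h>

# Assumption: 64-bit Linux, so off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def size_after_posix_fallocate(directory, length):
    """Preallocate with posix_fallocate and return the reported file length."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.posix_fallocate(fd, 0, length)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)


def size_after_keep_size_fallocate(directory, length):
    """Preallocate with fallocate(FALLOC_FL_KEEP_SIZE) and return the reported file length."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        if _libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, length) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
```

With posix_fallocate the file length becomes the requested size; with FALLOC_FL_KEEP_SIZE the space is reserved but the length stays at 0 until data is written.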

@tmds
Member

tmds commented Sep 29, 2021

> The regression in System.Text.Json.Tests.Perf_Booleans also points to #59338, so please double-check whether that was the cause.

@kunalspathak as I said in a previous comment, I don't think they are related. I wonder how the regression in Perf_Booleans points to #59338?

@ghost

ghost commented Sep 29, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.


@kunalspathak
Member

> @kunalspathak as I said in a previous comment, I don't think they are related. I wonder how the regression in Perf_Booleans points to #59338?

I agree it is hard to believe that, but when I saw the changes that went between the regressions, they were these:

ba43e08...44b4450

The benchmark does memory write using Utf8JsonWriter and ArrayBufferWriter which made us believe that it could be related. But feel free to ignore it for now.

@jozkee removed the untriaged label on Sep 29, 2021
@jozkee added this to the 6.0.0 milestone on Sep 29, 2021
@jozkee
Member

jozkee commented Sep 29, 2021

> I'll run the benchmark locally and see if I learn something from that. I may not get to it immediately.

@tmds thanks for checking this, let us know if you are unable to solve this in a couple of days as the .NET 6 deadline is close.

@tmds
Member

tmds commented Sep 30, 2021

I ran this benchmark on my machine and I could not reproduce the regression.

I ran it using the tmpfs at /tmp, and using a folder that is mounted with btrfs.

FALLOC_FL_KEEP_SIZE, tmpfs

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    283.7 us |   6.72 us |   7.74 us |    281.3 us |    273.5 us |    297.8 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 34,163.2 us | 656.63 us | 614.21 us | 34,141.0 us | 33,060.0 us | 35,250.9 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    282.7 us |   6.98 us |   7.75 us |    282.9 us |    273.3 us |    302.2 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 33,139.4 us | 675.28 us | 777.66 us | 32,829.2 us | 32,061.4 us | 34,963.5 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    288.0 us |   8.39 us |   9.67 us |    285.6 us |    273.0 us |    306.9 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 32,619.4 us | 587.96 us | 521.21 us | 32,711.9 us | 31,837.0 us | 33,677.7 us |

posix_fallocate, tmpfs

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    283.1 us |   5.96 us |   6.86 us |    280.3 us |    273.6 us |    296.7 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 32,940.5 us | 597.59 us | 529.74 us | 32,883.2 us | 32,078.2 us | 33,794.7 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    287.3 us |   4.00 us |   3.34 us |    288.1 us |    282.3 us |    292.2 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 33,009.4 us | 721.17 us | 830.49 us | 33,017.1 us | 31,802.2 us | 35,040.9 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    291.8 us |   9.20 us |  10.59 us |    287.0 us |    278.2 us |    313.1 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 33,213.2 us | 717.33 us | 826.08 us | 33,121.0 us | 31,923.6 us | 34,900.7 us |

FALLOC_FL_KEEP_SIZE, btrfs

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    391.2 us |   7.81 us |   7.31 us |    392.6 us |    380.1 us |    407.2 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 39,761.9 us | 788.62 us | 908.18 us | 39,622.3 us | 38,336.2 us | 41,728.7 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    389.1 us |   7.55 us |   6.70 us |    387.7 us |    381.0 us |    406.6 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 39,334.7 us | 767.07 us | 883.36 us | 39,216.6 us | 38,044.2 us | 41,221.2 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    398.1 us |   7.61 us |   8.15 us |    396.1 us |    388.9 us |    417.1 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 39,456.1 us | 536.55 us | 475.64 us | 39,562.6 us | 38,193.3 us | 40,012.5 us |

posix_fallocate, btrfs

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    404.0 us |   8.33 us |   9.26 us |    402.1 us |    389.6 us |    422.6 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 41,174.2 us | 815.69 us | 872.78 us | 41,158.3 us | 39,683.1 us | 42,629.2 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    422.7 us |   9.31 us |  10.72 us |    420.7 us |    407.0 us |    446.7 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 40,701.5 us | 779.55 us | 866.47 us | 40,710.0 us | 39,267.7 us | 42,277.2 us |

|                              Method |  fileSize | userBufferSize | options |        Mean |     Error |    StdDev |      Median |         Min |         Max |
|------------------------------------ |---------- |--------------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|
| Write_NoBuffering_PreallocationSize |   1048576 |          16384 |    None |    415.2 us |   9.12 us |  10.50 us |    413.8 us |    397.6 us |    438.0 us |
| Write_NoBuffering_PreallocationSize | 104857600 |          16384 |    None | 40,672.5 us | 762.44 us | 815.80 us | 40,560.3 us | 39,305.6 us | 42,116.1 us |
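
To make the comparison across the three runs per configuration easier to read, the means for the 100 MB case can be averaged and compared. This is a small sketch that only does arithmetic on the numbers copied from the tables above; the function name is hypothetical.

```python
from statistics import mean

# Mean durations in microseconds for fileSize = 104857600, copied from the tables above.
runs_us = {
    ("tmpfs", "FALLOC_FL_KEEP_SIZE"): [34163.2, 33139.4, 32619.4],
    ("tmpfs", "posix_fallocate"): [32940.5, 33009.4, 33213.2],
    ("btrfs", "FALLOC_FL_KEEP_SIZE"): [39761.9, 39334.7, 39456.1],
    ("btrfs", "posix_fallocate"): [41174.2, 40701.5, 40672.5],
}


def keep_size_vs_posix(fs):
    """Ratio of averaged means; a value near 1.0 means no visible regression."""
    keep = mean(runs_us[(fs, "FALLOC_FL_KEEP_SIZE")])
    posix = mean(runs_us[(fs, "posix_fallocate")])
    return keep / posix


for fs in ("tmpfs", "btrfs"):
    print(f"{fs}: KEEP_SIZE / posix_fallocate = {keep_size_vs_posix(fs):.3f}")
```

The ratios come out at roughly 1.01 on tmpfs and 0.97 on btrfs, compared with the 1.19 reported by the lab run.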

@jeffhandley added the needs-further-triage label on Oct 4, 2021
@tmds
Member

tmds commented Oct 5, 2021

I don't have any further action planned here.

If I had to take a wild guess: maybe tmpfs is suffering from some fragmentation issues?

I don't think there is much to improve about the implementation. We're making a single syscall with the appropriate flags.

I think we can consider accepting this as the new baseline, and maybe add an additional test that uses a persistent file system.

@stephentoub
Member

@tmds, so just to confirm:

  • There's still measurable benefit to specifying a preallocation size (i.e. we haven't somehow negated the benefit of the feature)
  • You can't repro the regression showing in the lab in this one specific case

Yes? Assuming that's the case, sounds like we can close this.

(And if there are more tests to be added, great.)

@tmds
Member

tmds commented Oct 5, 2021

> You can't repro the regression showing in the lab in this one specific case

Yes.

> There's still measurable benefit to specifying a preallocation size (i.e. we haven't somehow negated the benefit of the feature)

No. Actually I can't measure the benefit of the posix_fallocate implementation either.

Maybe I'm not properly running these benchmarks.
@adamsitnik, can you run these benchmarks and share your findings?

@jozkee
Member

jozkee commented Oct 5, 2021

> > There's still measurable benefit to specifying a preallocation size (i.e. we haven't somehow negated the benefit of the feature)
>
> No. Actually I can't measure the benefit of the posix_fallocate implementation either.

This is the benchmark where we claim that PreallocationSize has a perf impact:
https://devblogs.microsoft.com/dotnet/file-io-improvements-in-dotnet-6/#preallocation-size

I don't see that there is something completely similar in https://github.com/dotnet/performance/blob/main/src/benchmarks/micro/libraries/System.IO.FileSystem/Perf.FileStream.cs as that one directly uses RandomAccess.
Perhaps it may be worth giving it a try.

@carlossanlop
Member

@jozkee The blog post benchmark you mentioned, which verifies PreallocationSize, was extracted from Perf.RandomAccess.cs, not from Perf.FileStream.cs. The method name is Write:
https://github.com/dotnet/performance/blob/34d64a5b905f35346e5f82cff1af88d502387ad0/src/benchmarks/micro/libraries/System.IO.FileSystem/Perf.RandomAccess.cs#L96-L108

@tmds @stephentoub In my Ubuntu WSL, I created two clones of the runtime repo and synced each one to one of the commits mentioned in the description of this issue. Then I built them, ran the benchmarks against each clone, and compared the results (Perf_FileStream.Write_NoBuffering_PreallocationSize and Perf_RandomAccess.Write).

  • There is a 27-28% regression in Perf_FileStream.Write_NoBuffering_PreallocationSize for both test cases.
  • There is a 9% regression in Perf_RandomAccess.Write when fileSize is 100 MB and bufferSize is 16 KB.
  • The other Perf_RandomAccess.Write test case (fileSize 1 MB, bufferSize 4 KB) was faster by 5%.

I assume the bot did not report a perf regression for Perf_RandomAccess.Write because the results fell under a predefined threshold? Is it ~10%?

root@calopepc:/home/carlos/performance/src/tools/ResultsComparer# /usr/bin/dotnet run --base /home/carlos/perf_before/ --diff /home/carlos/perf_after/ --threshold 0.01%
summary:
better: 4, geomean: 1.063
worse: 3, geomean: 1.210
total diff: 7

| Slower                                                                           | diff/base | Base Median (ns) | Diff Median (ns) | Modality|
| -------------------------------------------------------------------------------- | ---------:| ----------------:| ----------------:| -------- |
| System.IO.Tests.Perf_FileStream.Write_NoBuffering_PreallocationSize(fileSize: 10 |      1.28 |      33281000.00 |      42553016.67 |         |
| System.IO.Tests.Perf_FileStream.Write_NoBuffering_PreallocationSize(fileSize: 10 |      1.27 |        345208.19 |        437814.06 |         |
| System.IO.Tests.Perf_RandomAccess.Write(fileSize: 104857600, bufferSize: 16384,  |      1.09 |     132527975.00 |     144723425.00 | bimodal |

| Faster                                                                           | base/diff | Base Median (ns) | Diff Median (ns) | Modality|
| -------------------------------------------------------------------------------- | ---------:| ----------------:| ----------------:| -------- |
| System.IO.Tests.Perf_RandomAccess.Write(fileSize: 1048576, bufferSize: 4096, opt |      1.05 |       2018950.00 |       1923881.60 | several?|

@adamsitnik
Member

I was able to collect two trace files using perfcollect and identify the source of the regression on the ext4 file system:

In the case of posix_fallocate, the ext4_da_write_end method takes 12% of the total inclusive CPU time:

[two screenshots omitted]

In the case of fallocate, it's 26%, and most of the extra time is spent in __mark_inode_dirty:

[two screenshots omitted]

@tmds I've sent you the trace files. If you don't want to use PerfView (and Windows), you can unzip them, go to https://www.speedscope.app/, hit "Browse", select the "perf.data.txt" file and then at the top select the thread name, type "corerun" and select the first thread from corerun process:

[screenshot omitted]

@adamsitnik
Member

When we were using posix_fallocate, we were gaining about 21% on average when writing to a file:

| Method | fileSize | userBufferSize | options | Mean | Ratio |
|---|---:|---:|---|---:|---:|
| Write_NoBuffering_PreallocationSize | 1048576 | 16384 | None | 389.5 us | 0.79 |
| Write_NoBuffering_No_PreallocationSize | 1048576 | 16384 | None | 491.8 us | 1.00 |
| Write_NoBuffering_PreallocationSize | 104857600 | 16384 | None | 42,161.5 us | 0.79 |
| Write_NoBuffering_No_PreallocationSize | 104857600 | 16384 | None | 53,329.2 us | 1.00 |

After switching to fallocate, we get 3-5%:

| Method | fileSize | userBufferSize | options | Mean | Ratio |
|---|---:|---:|---|---:|---:|
| Write_NoBuffering_PreallocationSize | 1048576 | 16384 | None | 466.6 us | 0.97 |
| Write_NoBuffering_No_PreallocationSize | 1048576 | 16384 | None | 480.6 us | 1.00 |
| Write_NoBuffering_PreallocationSize | 104857600 | 16384 | None | 50,289.0 us | 0.95 |
| Write_NoBuffering_No_PreallocationSize | 104857600 | 16384 | None | 52,824.2 us | 1.00 |
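
The quoted gains follow directly from the Mean columns. The sketch below recomputes them from the numbers in the two tables above; `gain_percent` is a hypothetical helper name, and the gain is defined here as the relative reduction in mean duration.

```python
# (with PreallocationSize, without PreallocationSize) mean durations in microseconds,
# copied from the two tables above and keyed by fileSize.
measurements = {
    "posix_fallocate": {1048576: (389.5, 491.8), 104857600: (42161.5, 53329.2)},
    "fallocate": {1048576: (466.6, 480.6), 104857600: (50289.0, 52824.2)},
}


def gain_percent(with_prealloc, without_prealloc):
    """Relative time saved by preallocating, in percent."""
    return (1.0 - with_prealloc / without_prealloc) * 100.0


for implementation, cases in measurements.items():
    for file_size, (with_p, without_p) in cases.items():
        print(f"{implementation:16} fileSize={file_size:>9}: {gain_percent(with_p, without_p):.1f}% faster")
```

This gives about 21% for both posix_fallocate cases and about 3% and 5% for the fallocate cases.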

@tmds
Member

tmds commented Oct 11, 2021

> In case of posix_fallocate, the ext4_da_write_end method takes 12% of the total inclusive CPU time:

Does this confirm your hypothesis that the additional time is spent updating the length after each write?

And when you add a SetLength, you're undoing the penalty because the length no longer needs updating.

When I run the benchmark on my system using persistent storage, I don't see a regression. I use btrfs, so it would have a much lower cost of updating the length than ext4.

Do you think we should change back to update the length when PreallocationSize is set?

@stephentoub
Member

stephentoub commented Oct 11, 2021

> Do you think we should change back to update the length when PreallocationSize is set?

I don't think so. From my perspective, the previous behavior is unintuitive and likely to be a bug farm, especially when it's inconsistent across Windows and Linux.

Is there still a meaningful benefit (e.g. reliability, some perf, etc.) to having the preallocationSize feature? If yes, we can just keep it as-is. If no, I think the change we should be considering is ripping it out entirely.

It seems like the full perf benefits can be restored simply by using SetLength after construction. This brings us back to my original questions around this feature, of why it's needed at all rather than someone just calling SetLength after construction. I think the answer I was given then was reliability, which suggests this feature really isn't about perf at all, it was just a side-benefit.
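
The combination discussed above (reserve space without changing the length, then set the length explicitly) can be sketched at the system-call level. This is an illustration only, assuming 64-bit Linux and a file system that supports `fallocate`; `ftruncate` stands in for SetLength here, and the function name is hypothetical.

```python
import ctypes
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01  # value defined in <linux/falloc.h>

# Assumption: 64-bit Linux, so off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def preallocate_then_set_length(directory, length, chunk=16384):
    """Return the file length after fallocate, after ftruncate, and after writing."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Reserve the blocks; the reported length stays unchanged.
        if _libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, length) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        size_after_fallocate = os.fstat(fd).st_size

        # Set the length up front, so later writes do not have to extend the file.
        os.ftruncate(fd, length)
        size_after_truncate = os.fstat(fd).st_size

        buf = b"\x01" * chunk
        written = 0
        while written < length:
            written += os.write(fd, buf[: length - written])
        return size_after_fallocate, size_after_truncate, os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
```

Whether this restores the full benefit on a given file system is exactly what the benchmarks in this thread are trying to establish; the sketch only shows the sequence of calls.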

@adamsitnik
Member

> especially when it's inconsistent across Windows and Linux

After #58726 it was consistent and fast (I've verified that) for every OS. Now it's not consistent for the platforms that don't implement fallocate (like FreeBSD and WASM). That is a fact.

We don't know whether setting EOF would be confusing for the end-users. Some of us believe that it would be, some of us don't. Who should decide?

> Is there still a meaningful benefit (e.g. reliability, some perf, etc.) to having the preallocationSize feature?

This feature was supposed to provide both reliability and perf. After recent changes (which got accepted and backported without performance validation) it depends on the target OS and File System.

> we should be considering is ripping it out entirely.

I disagree.

> It seems like the full perf benefits can be restored simply by using SetLength after construction. This brings us back to my original questions around this feature, of why it's needed at all rather than someone just calling SetLength after construction. I think the answer I was given then was reliability, which suggests this feature really isn't about perf at all, it was just a side-benefit.

The purpose of this feature was simple: create a file of a certain size when the size is known up-front. Throw an exception when there is not enough space available or if the given file system does not support files of such size. Why? To guarantee that subsequent writes will not fail because of a lack of disk space. The performance benefit was also important, as it was proven that it could be used in multiple places in the BCL to provide some really nice perf wins. Example: #58167
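
The "reserve the space or fail up front" contract described above can be sketched with posix_fallocate, which reports ENOSPC when there is not enough space and EFBIG when the size exceeds what the file system allows. This is a minimal illustration with a hypothetical function name, not the .NET implementation.

```python
import os
import tempfile


def create_with_reserved_space(path, length):
    """Create `path` with `length` bytes reserved, or remove it and re-raise on failure."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        # Raises OSError (for example ENOSPC or EFBIG) if the space cannot be reserved.
        os.posix_fallocate(fd, 0, length)
    except OSError:
        os.close(fd)
        os.unlink(path)  # do not leave a partially created file behind
        raise
    return fd
```

A caller that gets a descriptor back can write up to `length` bytes without running out of disk space part-way through; a caller that gets an exception has no file to clean up.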

@stephentoub
Member

stephentoub commented Oct 11, 2021

> We don't know whether setting EOF would be confusing for the end-users. Some of us believe that it would be, some of us don't. Who should decide?

Confusion isn't the issue. Files filled with garbage zeros is. These are the kinds of design decisions made in the name of performance that lead to long bug tails.

> The purpose of this feature was simple: create a file of certain size when the size is known up-front.

How is SetLength insufficient for this? Don't your comments above suggest using SetLength makes the perf difference evaporate?

> After recent changes (which got accepted and backported without performance validation) it depends on the target OS and File System.

Perf "validation" isn't the issue here, because the functionality was incorrect. When it comes to something being high performance and buggy, perf is irrelevant.

(I will also point out that Tom, a bona fide expert in Linux, has been unable to reproduce what you validated, so obviously the validation covered only a limited subset of systems.)
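
The concern about files filled with zeros can be reproduced in a few lines: when the length is set to the preallocated size and the writer produces less data than that, the remainder of the file reads back as zero bytes. The following is a minimal sketch using posix_fallocate; the function name is hypothetical.

```python
import os
import tempfile


def write_less_than_preallocated(directory, preallocated, payload):
    """Preallocate, write a shorter payload, and return the file's full contents."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.posix_fallocate(fd, 0, preallocated)  # file length becomes `preallocated`
        os.write(fd, payload)                    # only the first len(payload) bytes are data
        with open(path, "rb") as reader:
            return reader.read()
    finally:
        os.close(fd)
        os.unlink(path)


contents = write_less_than_preallocated(tempfile.gettempdir(), 4096, b"hello")
print(len(contents), contents[:5], contents[5:] == bytes(4091))
```

A reader of such a file cannot tell the trailing zeros from real data unless the writer truncates the file afterwards or records the payload length elsewhere.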

@adamsitnik
Member

> Files filled with garbage zeros is.

When you ask for a file of certain size to be created, what is the expected content?

> Perf "validation" isn't the issue here, because the functionality was incorrect.

It behaved exactly as posix_fallocate which is part of the POSIX standard.

@stephentoub
Member

stephentoub commented Oct 11, 2021

> When you ask for a file of certain size to be created, what is the expected content?

This isn't calling SetLength. If it were, that is in fact asking for a file of a certain size, and it should be filled with zeros. The original issue that motivated adding this in the first place described it as a "hint"; behavior that forces the file to that length is no longer a hint, and "preallocationSize" is a very poor name. If this were instead designed as "initialLength" and a requirement of the implementation rather than a hint, then sure, we would do exactly that, filled with zeros.

> It behaved exactly as posix_fallocate which is part of the POSIX standard.

Ok. So I can change FileStream.Write to perform a read(...), and that's correct because it's part of the POSIX standard.

There's nothing in the POSIX standard that dictates which specific functions .NET APIs should or should not use. We choose what functionality to invoke based on what makes sense for the APIs we're exposing and implementing.

@adamsitnik
Member

> > When you ask for a file of certain size to be created, what is the expected content?
>
> This isn't calling SetLength.

I've asked about the expected behavior, not how it's supposed to be implemented or how it relates to SetLength or any other existing methods.

> "preallocationSize" is a very poor name

I agree. If I could turn back time I would call it "initialFileSize" or "guaranteedFileSize".

> Ok. So I can change FileStream.Write to perform a read(...), and that's correct because it's part of the POSIX standard.

I prefer to follow existing standards and conventions rather than reinventing the wheel on my own. Is that a bad thing?

@stephentoub
Member

stephentoub commented Oct 11, 2021

> I've asked about the expected behavior, not how it's supposed to be implemented or how it relates to SetLength or any other existing methods.

You asked about the expected behavior of "asking for a file of a certain size". And my answer is highlighting that "asking for a file of a certain size" is different from "preallocationSize"; if you actually asked for a file of a certain size, as you do with SetLength, then yes, it should be that length and be zero-filled. But that's not the behavior being named or requested.

> I prefer to follow existing standards and conventions rather than reinventing the wheel on my own. Is that a bad thing?

It's not a bad thing, but it's cherry-picking. Windows exposes both options. Linux exposes both options. How are those any less existing? There are lots of things far from ideal about POSIX, which was standardized decades ago; we don't need to propagate mistakes or inadequacies of the past. There are valid reasons for posix_fallocate, and there are valid reasons other functionality exists in all of these operating systems. Just because posix_fallocate exists doesn't mean it's the right thing for this API.

> I agree. If I could turn back time I would call it "initialFileSize" or "guaranteedFileSize"

We haven't shipped yet. We're late in the cycle, but if something new is broken, we should fix it. As it stands, it's called "preallocationSize", and that is not the same as behavior that forces a specific length and fills it with zeros. So we either ship the behavior currently in release/6.0 that matches the current release/6.0 "preallocationSize" naming / hint, we rename it and change the functionality accordingly, or we rip it out and revisit it in another release; those are the available options.

(I'm also still unclear as to why SetLength is insufficient.)

@tmds
Member

tmds commented Oct 12, 2021

> (I'm also still unclear as to why SetLength is insufficient.)

I have the same question. Based on the perf investigation, using SetLength is expected to have the same performance as using PreallocationSize from #58726 on Linux and Windows. @adamsitnik can you confirm this?

> So we either ship the behavior currently in release/6.0 that matches the current release/6.0 "preallocationSize" naming / hint, we rename it and change the functionality accordingly, or we rip it out and revisit it in another release; those are the available options.

This started as a request to improve performance when the file length is known up-front (#45946). I think that means we should rule out the first option because it doesn't improve performance on Linux.

Personally, I prefer to push this out of .NET 6 (option 3).

@jeffhandley
Member

Personally, I prefer to push this out of .NET 6 (option 3 ["rip it out and revisit it in another release"]).

I'm on board with that option as well. @tmds / @adamsitnik -- who between you two can create the PR for main to do this?

@adamsitnik
Member

who between you two can create the PR for main to do this?

I am not going to do that because I don't agree with that.

@iSazonov
Contributor

Conceptually, this feature (PreallocationSize) is more attractive than SetLength for scenarios like rotated logs or transaction files, which have a fixed size and where the application should not have to worry about disk space on every I/O operation.
On the other hand, if a concept is controversial, it is better not to rush to ship it.

@carlossanlop
Member

Let's summarize the status of this issue. We have 3 options:

  • Revert the latest change. The last state in which everything worked exactly the same on all operating systems was when Align preallocationSize behavior #58726 was merged. We ensured that the whole file was preallocated, and we had a perf gain of ~20% when writing to a file of preallocated size when compared to also writing to a file but without setting the preallocation size up front.

  • Keep what we have. After we merged File preallocationSize: align Windows and Unix behavior. #59338, we no longer were offering the same behavior in all operating systems, and in fact, preallocation is not working at all in WASM and FreeBSD. We still had a perf gain, but of only ~3-5% when compared to writing to a file without setting the preallocation size up front. Note that this has already been backported to 6.0, and there were no benchmarks run against this change.

  • Remove the whole preallocation feature and revisit it in 7.0. Keep in mind that this feature has already been announced in the blog post, and removing it might break early adopters and cause confusion.

I am inclined toward the first option: revert the latest change, ensure all operating systems behave the same way, keep the big perf gains, and be very explicit in our documentation about using preallocationSize, warning users that EOF == preallocationSize.

I am very much against removing the whole feature.

@stephentoub
Member

stephentoub commented Oct 12, 2021

Revert the latest change.

#58726 was in main; it never made it to release/6.0.

we had a perf gain of ~20% when writing to a file of preallocated size when compared to also writing to a file but without setting the preallocation size up front.

Performance is not the end-all-be-all here. If you wrote less than the "preallocated size", a term used to describe a hint about how the OS should structure things rather than about the actual file length, you'd end up erroneously with zeros at the end of the file.
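The trailing-zeros concern can be reproduced at the POSIX level. The following is a minimal sketch, not the runtime's code: it assumes a POSIX system with a writable /tmp, and the helper name `trailing_zeros_after_short_write` is hypothetical. It forces the file length up front with `ftruncate`, writes a shorter payload, and counts the zero bytes left between the end of the payload and EOF.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (illustration only, not part of dotnet/runtime).
 * Forces the file length to `forced_length`, writes `payload` at offset 0,
 * and returns how many zero bytes remain after the payload, or -1 on error. */
long trailing_zeros_after_short_write(off_t forced_length, const char *payload)
{
    assert(payload != NULL);

    char path[] = "/tmp/prealloc_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    long zeros = -1;
    size_t n = strlen(payload);
    struct stat st;

    /* Setting the length first is what moves EOF past the data written later. */
    if (ftruncate(fd, forced_length) == 0 &&
        pwrite(fd, payload, n, 0) == (ssize_t)n &&
        fstat(fd, &st) == 0) {
        zeros = 0;
        /* Scan everything between the end of the payload and EOF. */
        for (off_t off = (off_t)n; off < st.st_size; off++) {
            char c;
            if (pread(fd, &c, 1, off) != 1) {
                zeros = -1;
                break;
            }
            if (c == '\0')
                zeros++;
        }
    }

    close(fd);
    return zeros;
}
```

Calling `trailing_zeros_after_short_write(16, "hello")` returns 11: the file is 16 bytes long, but only the first 5 carry data. A pure hint that leaves EOF alone would leave a 5-byte file.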

Additionally, no one has answered the question about if/why SetLength is insufficient to reap similar perf benefits.

we no longer were offering the same behavior in all operating systems, and in fact, preallocation is not working at all in WASM and FreeBSD

PreallocationSize is a hint. It doesn't require the length of the file actually be what was specified. It's like FileOptions.RandomAccess or FileOptions.SequentialScan: it gives the OS information about how you intend to use the file but doesn't otherwise change functionality.

Keep in mind that this feature has already been announced in the blog post, and removing it might break early adopters and cause confusion.

And keep in mind that whatever we ship now will need to be maintained and supported forever. A few months of a preview for a new feature is nothing compared to having to live with the repercussions of it long-term if we ship the wrong thing. I'm speaking from experience here.

@adamsitnik
Member

no one has answered the question about if/why SetLength is insufficient to reap similar perf benefits

SetLength is mapped to ftruncate:

internal static unsafe void SetFileLength(SafeFileHandle handle, long length) =>
    CheckFileCall(Interop.Sys.FTruncate(handle, length), handle.Path);

It gives similar perf benefits, but it's 3-5% slower than posix_fallocate (for ext4)

| Method                | fileSize  | userBufferSize | Ratio |
|-----------------------|----------:|---------------:|------:|
| Write                 |   1048576 |          16384 |  1.00 |
| Write_fallocate       |   1048576 |          16384 |  0.98 |
| Write_ftruncate       |   1048576 |          16384 |  0.84 |
| Write_posix_fallocate |   1048576 |          16384 |  0.81 |
| Write                 | 104857600 |          16384 |  1.00 |
| Write_fallocate       | 104857600 |          16384 |  0.97 |
| Write_ftruncate       | 104857600 |          16384 |  0.85 |
| Write_posix_fallocate | 104857600 |          16384 |  0.80 |

I don't know the implementation details, but my guess is that posix_fallocate has one job (it does not allow for shrinking files like ftruncate) and it's just optimized for doing it.

It seems like the full perf benefits can be restored simply by using SetLength after construction. This brings us back to my original questions around this feature, of why it's needed at all rather than someone just calling SetLength after construction. I think the answer I was given then was reliability, which suggests this feature really isn't about perf at all, it was just a side-benefit.

We wanted to have:

  • reliability: guarantee file size on every platform, make sure empty|incomplete files are not created and persisted when not enough space is available, ensure write operations will no longer fail due to running out of space since the space has already been reserved
  • best perf: because it's a frequent pattern that we could optimize for in multiple places like I did in File.*AllText* optimizations #58167 (up to 20% for File.*AllText* methods, we could get up to +20% in decompression as well)
  • good user experience: ignore the setting if the path points to a file that does not support it, make sure users don't need to implement it manually using SetLength and handle the edge cases on their own

a term used to describe a hint about how the OS should structure things rather than about the actual file length, you'd end up erroneously with zeros at the end of the file

It depends on how we understand preallocationSize. Since it's a new feature and we own the docs, I don't see why we can't just describe what it does in the XML docs and on the doc page, similarly to what we did in the blog post:

https://devblogs.microsoft.com/dotnet/file-io-improvements-in-dotnet-6/#preallocation-size

When PreallocationSize is specified, .NET requests the OS to ensure the disk space of a given size is allocated in advance. From a performance perspective, the write operations don’t need to extend the file and it’s less likely that the file is going to be fragmented. From a reliability perspective, write operations will no longer fail due to running out of space since the space has already been reserved.

Another option better than removing the feature is just renaming it from preallocationSize to initialFileSize or just fileSize (or any other name that we like)

@tmds
Member

tmds commented Oct 12, 2021

It gives similar perf benefits, but it's 3-5% slower than posix_fallocate (for ext4)

It's good to know fallocate gives an additional performance gain compared to ftruncate. This tells us something additional can be gained from this API compared to SetLength.

I don't know the implementation details, but my guess is that posix_fallocate has one job (it does not allow for shrinking files like ftruncate) and it's just optimized for doing it.

The difference is that fallocate allocates, while ftruncate sets the length but is not guaranteed to allocate (ftruncate does not have the 'reliability' feature).
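The distinction can be observed by comparing a file's logical size with the space actually allocated to it. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the names `size_report`, `grow_method` and `grow_and_report` are hypothetical, and the 512-byte unit for `st_blocks` is the Linux convention rather than something stated in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical types and helper, for illustration only. */
struct size_report {
    int       rc;              /* 0 on success, otherwise an errno-style code */
    off_t     logical_size;    /* st_size: where EOF is */
    long long allocated_bytes; /* st_blocks * 512: space actually reserved */
};

enum grow_method { GROW_FTRUNCATE, GROW_POSIX_FALLOCATE };

/* Grows a fresh temporary file to `length` bytes with the chosen call and
 * reports both its logical size and its allocated size. */
struct size_report grow_and_report(enum grow_method method, off_t length)
{
    assert(length > 0);

    struct size_report r = { -1, 0, 0 };
    char path[] = "/tmp/grow_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) {
        r.rc = errno;
        return r;
    }
    unlink(path); /* the file disappears once fd is closed */

    if (method == GROW_FTRUNCATE)
        r.rc = (ftruncate(fd, length) == 0) ? 0 : errno;
    else
        r.rc = posix_fallocate(fd, 0, length); /* returns the error code directly */

    struct stat st;
    if (r.rc == 0) {
        if (fstat(fd, &st) == 0) {
            r.logical_size = st.st_size;
            r.allocated_bytes = (long long)st.st_blocks * 512;
        } else {
            r.rc = errno;
        }
    }

    close(fd);
    return r;
}
```

Both methods report a `logical_size` equal to the requested length. On file systems that support sparse files, `allocated_bytes` typically stays near zero after `ftruncate`, while `posix_fallocate` reserves the space; the exact numbers depend on the file system.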

initialFileSize or just fileSize (or any other name that we like)

Maybe something with Length in it, FileLength?

Another option better than removing the feature

Both you and @carlossanlop reacted strongly against the idea of moving this feature out of .NET 6. To me, that is also a sensible, valid option.

@adamsitnik
Member

We had an offline discussion where we all agreed that currently the best thing we can do is to accept the regression as by design and keep the current implementation.

preallocationSize is a hint that does not modify EOF and is not a strong guarantee. Depending on the hardware, OS and file system, the gain from specifying it before writing to the file can be:

  • up to 5% on Linux
  • up to 15% on Windows.

With the current design we don't risk data loss. Users who want even better perf can additionally call .SetLength.

We intend to update the docs and be very clear about it.

Thank you all for your input!

@stephentoub
Member

Thanks, Adam.

@adamsitnik adamsitnik removed tenet-performance-benchmarks Issue from performance benchmark needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration arch-x64 labels Oct 13, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Nov 12, 2021
9 participants