Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Stream.HasLength property #43448

Closed
Sergio0694 opened this issue Oct 15, 2020 · 13 comments
Closed

Proposal: Stream.HasLength property #43448

Sergio0694 opened this issue Oct 15, 2020 · 13 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO
Milestone

Comments

@Sergio0694
Copy link
Contributor

Sergio0694 commented Oct 15, 2020

Description

There currently is no reliable way to know whether Stream.Length can be used, as checking CanSeek only guarantees that Length will work if it returns true, but gives no guarantees if it returns false: there can be cases where CanSeek is false but Length works just fine. This means that the only way to check the Length is to use a try/catch block, which is not ideal. Being able to access the length allows for a number of optimizations, such as renting a buffer the size of the whole stream, to read its contents in a single pass without wasting cycles expanding the buffer to write to, and making more copies and reads.

Proposed API:

namespace System.IO
{
    public abstract class Stream
    { 
        public abstract bool HasLength { get; }
    }
}

Original description

The Stream returned by HttpContent.ReadAsStreamAsync sometimes has CanSeek set to false, but accessing the Length property works just fine. This seems to directly contradict the docs (here):

If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException.

This remark implies that !CanSeek ==> Length throws, but this is not always true in practice. To work around this, I ended up using an ugly try/catch block in the System.Text.Json-based JSON serializer in the refit library: here. As mentioned above, not really sure how to classify this issue between a bug report and a feature proposal, as it depends on whether or not this is intended behavior. As in, I would say there are two possibilities here:

  • If the remarks in the docs are correct and !CanSeek ==> Length throws, then it seems like there is a bug in HttpContent.
  • If the remarks are not correct, then:
    • We should update them
    • I would ask to add a new Stream.HasLength property (this was actually @tannergooding's idea), that will allow developers to just check whether Length can be accessed, regardless of whether or not the whole stream is actually seekable. This is particularly useful whenever you'd like to just allocate a buffer to read the whole stream in memory in one go, without having to use a buffer writer which will likely result in less efficient code due to buffer expansions and copy operations.

Related issues/PRs

Configuration

  • OS: Windows 10
  • Version: .NET Core 3.1.2
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net.Http untriaged New issue has not been triaged by the area owner labels Oct 15, 2020
@ghost
Copy link

ghost commented Oct 15, 2020

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@stephentoub
Copy link
Member

Instead of calls to Length, SetLength, Position, and Seek throw a NotSupportedException, the docs should really say calls to Length, SetLength, Position, and Seek may throw a NotSupportedException. In other words, CanSeek is there to govern whether you can expect Length/SetLength/Position/Seek to work: if it returns true, they should work, but if it returns false, there aren't any guarantees made one way or the other.

@Sergio0694
Copy link
Contributor Author

Thank you for clarifying, I'll go open a PR in the docs repo then! 😄

Do you reckon a separate Stream.HasLength property would be a reasonable API to add?
Should I just edit this issue to turn it into a proposal for that?

@stephentoub
Copy link
Member

stephentoub commented Oct 15, 2020

Do you reckon a separate Stream.HasLength property would be a reasonable API to add?

It's certainly fine to have a discussion about, and this issue can be used for that. We do want to be very careful about any new virtuals we add to Stream and that they really add enough value to be worth their weight, and need to think through what the default implementations would be. For example, Stream.HasLength's base implementation would probably need to be HasLength => CanSeek, which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

@Sergio0694
Copy link
Contributor Author

For example, Stream.HasLength's base implementation would probably need to be HasLength => CanSeek, which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

Right, yeah that'd be inconvenient 🤔
I would imagine at least all the BCL streams would support it properly from the start, but yeah that's a very good point.
I'll update the title/description of the issue and leave it open for future discussion then, thank you for your time Stephen! 😊

@Sergio0694 Sergio0694 changed the title Inconsistent Stream.CanSeek/Length behavior in HttpContent Proposal: Stream.HasLength property Oct 15, 2020
@tannergooding
Copy link
Member

which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

It could also do a try/catch and return `true if the length is not negative, not zero, or if it succeeds at all. Which is more correct, just slow for the default implementation.

@jkotas
Copy link
Member

jkotas commented Oct 15, 2020

Related: Length == 0 means "unknown length" for some streams today.

If you run this on Linux today:

var fileStream = new FileStream("/proc/self/stat", FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 1, useAsync: false);
Console.WriteLine(fileStream.CanSeek);
Console.WriteLine(fileStream.Length);
Console.WriteLine(File.ReadAllBytes("/proc/self/stat").Length);

The output is:

True
0
324

Would this API account for this?

@carlossanlop
Copy link
Member

@jkotas the main description is accounting for the case you shared:

CanSeek only guarantees that Length will work if it returns true

Not only CanSeek returned true, but also Length didn't throw an exception, and returned a valid value, zero, because it's a special file. HasLength should return true for that case.

@jozkee jozkee added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed untriaged New issue has not been triaged by the area owner labels Nov 13, 2020
@jozkee jozkee added this to the Future milestone Nov 13, 2020
@jkotas
Copy link
Member

jkotas commented Nov 14, 2020

Right, it means that there is no good way to provide base implementation that works well. How are people calling this API going to figure out whether they are calling base implementation that may return bogus result; or actual implementation that will be correct?

@jkotas
Copy link
Member

jkotas commented Nov 14, 2020

FWIW, I do not think that the proposal in its current form is ready for review.

@stephentoub
Copy link
Member

I agree. As it stands, we shouldn't add this.

@stephentoub stephentoub added api-suggestion Early API idea and discussion, it is NOT ready for implementation and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Nov 14, 2020
@SimonCropp
Copy link
Contributor

SimonCropp commented Sep 13, 2023

i hit this today. so a http stream that is not seekable and length throws. went to work around it by copying it to a memory stream. and then CopyToAsync throws since internally it tries to use Length to calculate the buffer size

@adamsitnik
Copy link
Member

Right, it means that there is no good way to provide base implementation that works well.

I agree. Since we have not found a way to do that for a few years I am going to close the issue.

@adamsitnik adamsitnik closed this as not planned Won't fix, can't repro, duplicate, stale Oct 18, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Nov 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO
Projects
None yet
Development

No branches or pull requests

10 participants