Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StreamReader.Peek() on a DeflateStream returns -1 early #68983

Closed
Austin-Lamb opened this issue May 6, 2022 · 10 comments · Fixed by #89609
Closed

StreamReader.Peek() on a DeflateStream returns -1 early #68983

Austin-Lamb opened this issue May 6, 2022 · 10 comments · Fixed by #89609
Assignees
Milestone

Comments

@Austin-Lamb
Copy link

Description

I have code that has a DeflateStream and uses a StreamReader on it - this code works fine in .NET Framework, .NET Core 3.1, and .NET 5 - but it breaks in .NET 6.

When trying to Peek() on the stream, internally StreamReader reads a block out that ends up being smaller than it requested, I'm pretty sure this is due to this breaking change. After that happens, it sets a flag called "_isBlocked" and starts returning -1 from Peek() which is not expected.

This seems like a bug in StreamReader?

Reproduction Steps

Construct a StreamReader over a DeflateStream
Read for a while, peeking as you go

Expected behavior

Peek() returns -1 only at the end of the stream, as it did before

Actual behavior

Peek() returns -1 early in the middle of the file.

Regression?

Yes, regressed in .NET 6

Known Workarounds

Can use StreamReader.EndOfStream instead of looking for -1 from Peek(), but this was still a difficult to hunt down bug on upgrading the .NET version of this library.

Configuration

No response

Other information

No response

@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 6, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO and removed untriaged New issue has not been triaged by the area owner labels May 6, 2022
@ghost
Copy link

ghost commented May 6, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

I have code that has a DeflateStream and uses a StreamReader on it - this code works fine in .NET Framework, .NET Core 3.1, and .NET 5 - but it breaks in .NET 6.

When trying to Peek() on the stream, internally StreamReader reads a block out that ends up being smaller than it requested, I'm pretty sure this is due to this breaking change. After that happens, it sets a flag called "_isBlocked" and starts returning -1 from Peek() which is not expected.

This seems like a bug in StreamReader?

Reproduction Steps

Construct a StreamReader over a DeflateStream
Read for a while, peeking as you go

Expected behavior

Peek() returns -1 only at the end of the stream, as it did before

Actual behavior

Peek() returns -1 early in the middle of the file.

Regression?

Yes, regressed in .NET 6

Known Workarounds

Can use StreamReader.EndOfStream instead of looking for -1 from Peek(), but this was still a difficult to hunt down bug on upgrading the .NET version of this library.

Configuration

No response

Other information

No response

Author: Austin-Lamb
Assignees: -
Labels:

area-System.IO

Milestone: -

@stephentoub
Copy link
Member

Can you please share a small, isolated repro?

@aromaa
Copy link
Contributor

aromaa commented May 6, 2022

Sounds like this could be #61325 (comment).

@adamsitnik adamsitnik added the needs-author-action An issue or pull request that requires more info or actions from the author. label May 9, 2022
@ghost
Copy link

ghost commented May 9, 2022

This issue has been marked needs-author-action and may be missing some important information.

@Austin-Lamb
Copy link
Author

This comes from a large project, I will see if I can extract a minimal repro - it may take me a couple days.

@ghost ghost added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels May 9, 2022
@Austin-Lamb
Copy link
Author

@stephentoub, I have made a really small repro. Take this code:

using System;
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Runtime.InteropServices;

Console.WriteLine($"Running in {RuntimeInformation.FrameworkDescription}");

var localPath = Path.GetTempFileName();
new WebClient().DownloadFile(@"https://github.com/dotnet/runtime/archive/refs/tags/v6.0.5.zip", localPath);

var numChars = 0;

using var archive = new ZipArchive(File.OpenRead(localPath), ZipArchiveMode.Read);
foreach (var entry in archive.Entries)
{
    using var sr = new StreamReader(entry.Open());
    
    while (sr.Peek() != -1)
    {
        char next = (char)sr.Read();
        numChars++;
    }
}

Console.WriteLine($"Read {numChars} characters in {archive.Entries.Count} zip entries");

When run on .NET 5 vs. 6, I get different numbers of characters read because Peek() returns -1 "early" sometimes in .NET 6, probably due to the breaking change I mentioned in the original post in this issue. I get this output:

Running in .NET 5.0.17
Read 878322628 characters in 69097 zip entries

vs.

Running in .NET 6.0.5
Read 645700952 characters in 69097 zip entries

@svick
Copy link
Contributor

svick commented May 11, 2022

Here is a self-contained repro of this behavior:

using System;
using System.IO;

using var stream = new AStream();
using var reader = new StreamReader(stream);

// prints "42" once
while (reader.Peek() != -1)
{
    Console.WriteLine(reader.Read());
}

class AStream : Stream
{
    public override int Read(byte[] buffer, int offset, int count)
    {
        buffer[offset] = 42;
        return 1;
    }

    public override bool CanRead => true;

    public override bool CanSeek => true;

    public override bool CanWrite => throw new NotImplementedException();

    public override long Length => throw new NotImplementedException();

    public override long Position { get => throw new NotImplementedException(); set => throw new NotImplementedException(); }

    public override void Flush() => throw new NotImplementedException();

    public override long Seek(long offset, SeekOrigin origin) => throw new NotImplementedException();

    public override void SetLength(long value) => throw new NotImplementedException();

    public override void Write(byte[] buffer, int offset, int count) => throw new NotImplementedException();
}

Note that the documentation of Peek leaves me confused. It says that -1 can also be returned if the stream does not support seeking, but that would make Peek much less useful and it's not what the implementation does.

@stephentoub
Copy link
Member

Thanks for the repros. I'll take a look...

@stephentoub stephentoub self-assigned this May 11, 2022
@stephentoub
Copy link
Member

StreamReader has a private _isBlocked field that it's using to track whether the last read on the underlying Stream returned fewer bytes than requested:

// Whether the stream is most likely not going to give us back as much
// data as we want the next time we call it. We must do the computation
// before we do any byte order mark handling and save the result. Note
// that we need this to allow users to handle streams used for an
// interactive protocol, where they block waiting for the remote end
// to send a response, like logging in on a Unix machine.
private bool _isBlocked;

It then uses this field in various ReadXx methods to determine whether to try to read more data from the underlying Stream or to just give back to the caller just the non-zero amount it gathered thus far. I think that's a misguided design but at least defensible. However, it's then also using that _isBlocked in Peek():

which while potentially "by design" is to me a flawed design.

I think the fix here at a minimum is to change the semantics of Peek to stop looking at _isBlocked at all.

@adamsitnik?

@stephentoub stephentoub removed their assignment May 12, 2022
@stephentoub stephentoub added this to the 7.0.0 milestone May 12, 2022
@mpervovskiy
Copy link

This becomes even worse with Http streams as Peek() is unpredictable there... and looks like this issue dwells in StreamReader for ages.
Repo:

using System;
using System.IO;
using System.Net.Http;

namespace ConsoleApp
{
    class Program
    {
        public static void Main(string[] args)
        {
            var client = new HttpClient();
            var stream = client.GetStreamAsync("https://en.wikipedia.org/wiki/Software_bug").Result;
            var reader = new StreamReader(stream);
            var r = 0;
            var i = -1;
            while (r >= 0)
            {
                i++;
                var b = reader.Peek();
                r = reader.Read();
                if (r != b)
                {
                    Console.WriteLine($"difference {i,6}: {r} {b}");
                }
            }

            Console.WriteLine("done");
        }
    }
}

Differences are happening at different places in the stream

FerdiMeijer pushed a commit to FerdiMeijer/speckle-sharp that referenced this issue Jul 5, 2022
Replaced Peek() function because of issue that occurs only in .net6.0 dotnet/runtime#68983 when downloading large object (e.g. 8000+ child objects)
@jeffhandley jeffhandley modified the milestones: 7.0.0, 8.0.0 Jul 9, 2022
teocomi pushed a commit to specklesystems/speckle-sharp that referenced this issue Jul 12, 2022
Replaced Peek() function because of issue that occurs only in .net6.0 dotnet/runtime#68983 when downloading large object (e.g. 8000+ child objects)

Co-authored-by: Jedd Morgan <45512892+JR-Morgan@users.noreply.github.com>
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 27, 2023
@jozkee jozkee removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jul 28, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 29, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants