Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add static compression helper methods #44793

Open
Tracked by #62658
GSPP opened this issue Nov 17, 2020 · 6 comments
Open
Tracked by #62658

Add static compression helper methods #44793

GSPP opened this issue Nov 17, 2020 · 6 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO.Compression
Milestone

Comments

@GSPP
Copy link

GSPP commented Nov 17, 2020

Problem:

Compressing data currently requires the use of streams. For example:

static byte[] Compress(byte[] data)
{
    using (var destinationStream = new MemoryStream())
    {
        using (var deflateStream = new DeflateStream(destinationStream, CompressionMode.Compress, CompressionLevel.BestCompression, leaveOpen: true))
        {
            deflateStream.Write(data);
        }

        return destinationStream.ToArray();
    }
}

byte[] compressed = Compress(data);

This code is alright. It could be more convenient, though:

byte[] compressed = DeflateStream.CompressData(data, CompressionLevel.BestCompression);

Benefits:

  1. This is more terse.
  2. It's an expression as opposed to a statement. That makes it more composable. It can be a subexpression, for example, in a LINQ query.
  3. No way to forget resource cleanup or mess up code quality otherwise.
  4. Optimizations can be done under the covers.
  5. It is more beginner-friendly than first learning the mechanics of streams.

There is a related proposal that addresses this in a different way: #39327 That proposal seems to be about avoiding allocations while this proposal is more about improving convenience.


Proposal:

Use this pattern for all built-in compression types:

public partial class DeflateStream
{
    public static byte[] CompressData(ReadOnlySpan<byte> source, CompressionLevel compressionLevel = CompressionLevel.Default) => throw null;
    public static byte[] DecompressData(ReadOnlySpan<byte> source) => throw null;
}

It does not get more convenient than this. That is the primary goal.

@GSPP GSPP added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Nov 17, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.IO.Compression untriaged New issue has not been triaged by the area owner labels Nov 17, 2020
@huoyaoyuan
Copy link
Member

ReadOnlySpan could face with lifetime issues when sharing internal implementation. ReadOnlyMemory should be better.

@carlossanlop
Copy link
Member

Things to consider:

  • The advantage of adding such a helper is that we could ensure the compression task is done properly.
  • This suggested API is limited to in-memory compression. Besides a method that takes a byte buffer, Should we consider also APIs that involve compressing files? Should we extend such helpers to other compression classes?
  • I wonder if adding an API to a stream-based class is the right approach. Maybe we could think of an extensions class instead, like ZipFileExtensions.

@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Jan 19, 2021
@carlossanlop carlossanlop added this to the Future milestone Jan 19, 2021
@xoofx
Copy link
Member

xoofx commented Sep 25, 2021

There is also some past discussions at #2236 for reference.

Though, I'm not in favor of the proposal above, for the reason that the API public static byte[] CompressData(ReadOnlySpan<byte> source, ... is not efficient (allocates always) and doesn't reflect how most native API work where the source/dest buffer and sizes are given by the client, so that you have a lot more control when you want to allocate, how you want to maintain this pool for the buffers... (See for example LZ4_decompress_safe or LZ4_compress_fast)

I would prefer that we follow what was suggested by @stephentoub in this issue comment with low level encoders taking a Span-in/Span-out but I'm also more interested of having raw level wrapper of compression libraries that can expose the specificities of each (e.g compression level between libraries can be different, level vs time can be an important factor as well) so that we can build higher level richer API (like the encoders or specific scenarios).

@FiniteReality
Copy link

Is this not a dupe of #39327 ?

@GSPP
Copy link
Author

GSPP commented Sep 26, 2021

@FiniteReality #39327 does not have static methods. I think that's important to reach the best convenience possible.

@ycrumeyrolle
Copy link

There is also some past discussions at #2236 for reference.

Though, I'm not in favor of the proposal above, for the reason that the API public static byte[] CompressData(ReadOnlySpan<byte> source, ... is not efficient (allocates always) and doesn't reflect how most native API work where the source/dest buffer and sizes are given by the client, so that you have a lot more control when you want to allocate, how you want to maintain this pool for the buffers... (See for example LZ4_decompress_safe or LZ4_compress_fast)

I would prefer that we follow what was suggested by @stephentoub in this issue comment with low level encoders taking a Span-in/Span-out but I'm also more interested of having raw level wrapper of compression libraries that can expose the specificities of each (e.g compression level between libraries can be different, level vs time can be an important factor as well) so that we can build higher level richer API (like the encoders or specific scenarios).

I do prefer also a low level API, something like that:

public partial class Deflate
{
    public static OperationStatus CompressData(ReadOnlySpan<byte> data, Span<byte> compressedData, CompressionLevel compressionLevel = CompressionLevel.Default) => throw null;
    public static void CompressData(ReadOnlySpan<byte> data, IBufferWriter<byte> bufferWritter, CompressionLevel compressionLevel = CompressionLevel.Default) => throw null;

    public static void DecompressData(ReadOnlySpan<byte>compressedData, IBufferWriter<byte> bufferWritter) => throw null;
    public static OperationStatus DecompressData(ReadOnlySpan<byte>compressedData, Span<byte> destination, out int bytesConsumed, out int bytesWritten) => throw null;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO.Compression
Projects
None yet
Development

No branches or pull requests

8 participants