Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebP - Reduce the allocations in lossless encoding #2546

Merged
merged 10 commits into from
Nov 6, 2023

Conversation

JimBobSquarePants
Copy link
Member

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR refactors the Vp8LHistogram type to use pooled allocations. It also introduces the concept of a Vp8LHistogramSet type to allow allocating a single buffer for a collection.

Benchmark indicates a 20ms speedup for lossless encoding while using less than half the original allocated memory.

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.22621
11th Gen Intel Core i7-11370H 3.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=8.0.100-rc.1.23455.8
  [Host]     : .NET 6.0.22 (6.0.2223.42425), X64 RyuJIT
  Job-ZADQXC : .NET 6.0.22 (6.0.2223.42425), X64 RyuJIT

Runtime=.NET 6.0  Arguments=/p:DebugType=portable  IterationCount=3
LaunchCount=1  WarmupCount=3

Main

Method TestImage Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
'Magick Webp Lossy' Png/Bike.png 22.30 ms 5.533 ms 0.303 ms 0.14 0.00 - - - 68 KB
'ImageSharp Webp Lossy' Png/Bike.png 74.82 ms 2.291 ms 0.126 ms 0.46 0.00 2428.5714 142.8571 - 15,187 KB
'Magick Webp Lossless' Png/Bike.png 161.16 ms 22.093 ms 1.211 ms 1.00 0.00 - - - 518 KB
'ImageSharp Webp Lossless' Png/Bike.png 249.44 ms 58.373 ms 3.200 ms 1.55 0.02 8000.0000 4000.0000 2000.0000 46,532 KB

PR

Method TestImage Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
'Magick Webp Lossy' Png/Bike.png 22.20 ms 5.392 ms 0.296 ms 0.13 0.00 - - - 68 KB
'ImageSharp Webp Lossy' Png/Bike.png 76.90 ms 61.754 ms 3.385 ms 0.46 0.01 2428.5714 142.8571 - 15,187 KB
'Magick Webp Lossless' Png/Bike.png 165.44 ms 85.798 ms 4.703 ms 1.00 0.00 - - - 518 KB
'ImageSharp Webp Lossless' Png/Bike.png 228.72 ms 40.127 ms 2.199 ms 1.38 0.05 3000.0000 2000.0000 2000.0000 20,623 KB


OptimizeHistogramSymbols(clusterMappings, numClusters, mapTmp, histogramSymbols);
}

float x = quality / 100.0f;
float x = quality / 100F;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
float x = quality / 100F;
float x = quality * 0.01F;

Division is more expensive than multiplication.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd need to check again if this is on a hot path.

if (p.CostDiff >= 0.0d)
HistoListUpdatePair(histograms[p.Idx1], histograms[p.Idx2], stats, bitsEntropy, 0D, p);

if (p.CostDiff >= 0D)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: 0D before the literals were lowercase which is IMO more readable (0d <-> 0D).
Anymway it should be consistent throughout the code-base.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an irrational dislike for lowercase suffix type identifiers for F, D, L, and M. I cannot explain why as I use u at times.

{
this.bufferOwner = bufferOwner;
this.buffer = buffer;
this.bufferHandle = this.buffer.Pin();
Copy link
Member

@antonfirsov antonfirsov Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I'm the author of the code, but I think this needs further restructuring. The problem is that individual pinning of the sub-buffers forces the original big IMemoryOwner<uint> (which is in fact an UnmanagedBuffer<T> : MemoryManager<T>) to increase the big buffers ref count Vp8LHistogram.Count times, then decrease it on each dispose. This is unnecessary fake bookeeping that only adds a risk of introducing memory leaks. We should instead let Vp8HistogramSet do the pinning, so we take an uint* in the constructor.

For Vp8HistogramSet members this would turn Dispose into a NOP and DisposeAt(idx) to an unnecessary method. For better clarity, I would split Vp8LHistogram into multiple types so Dispose is only defined on instances which are actually disposable.

abstract class Vp8LHistogram
{
    // Hide the ctr
    protected Vp8LHistogram(uint* basePointer);
}
sealed class OwnedVp8LHistogram : Vp8LHistogram, IDisposable
{
    // Static Create methods
}
class Vp8LHistogramSet
{
    // Use the hidden ctr within Vp8LHistogramSet code only.
    private class MemberHistogram : Vp8LHistogram { }
    
    // !!! Delete DisposeAt()
}

I can push this change later, after finishing higher priority contributions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually trickier than it looks. Pinning the buffer to pass the pointer is proving troublesome.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figured it out.

// Copyright (c) Six Labors.
// Licensed under the Six Labors Split License.

#nullable disable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that List<Vp8LHistogram> items is in fact a List<Vp8LHistogram?> items, it would be nice to enable this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I only disabled it and make the items generic type nullable as I didn't want to introduce additional checks in callers.

Copy link
Member

@antonfirsov antonfirsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I would prefer to wait with merging with this, in case we need to backport some webp enhancements.

src/ImageSharp/Formats/Webp/Lossless/HistogramEncoder.cs Outdated Show resolved Hide resolved
bool[] bufRle = new bool[maxNumSymbols];
Span<HuffmanTree> huffTree = stackalloc HuffmanTree[3 * maxNumSymbols];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Co-authored-by: Anton Firszov <antonfir@gmail.com>
@JimBobSquarePants
Copy link
Member Author

@antonfirsov I want to get this merged so it's not blocking #2569

None of the changes here represent security enhancements so there's nothing to backport. Any future ones can be manually handled.

@JimBobSquarePants JimBobSquarePants merged commit 7e1ee9e into main Nov 6, 2023
8 checks passed
@JimBobSquarePants JimBobSquarePants deleted the js/webp-allocations branch November 6, 2023 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants