Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #2518 #2519

Merged
merged 6 commits into from
Sep 2, 2023
Merged

Fix #2518 #2519

merged 6 commits into from
Sep 2, 2023

Conversation

antonfirsov
Copy link
Member

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

Applying a (hopefully correct) bin buffer size, and eliminating Unsafe usage in a less critical processor we did not attempt to optimize with other safer techniques.

Benchmarks

main

|     Method |     Mean |   Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |---------:|--------:|--------:|------:|------:|------:|----------:|
| DoOilPaint | 201.7 ms | 4.01 ms | 5.07 ms |     - |     - |     - |     18 KB |

PR

|     Method |     Mean |   Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |---------:|--------:|--------:|------:|------:|------:|----------:|
| DoOilPaint | 214.5 ms | 3.61 ms | 3.20 ms |     - |     - |     - |     18 KB |

I believe the 6% cost is worth the readability and memory safety benefits. We should keep Unsafe for the hot path in critical stuff (codecs, resize etc.). If we want to optimize this processor, we can try other techniques first, eg. vectorization.

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this. Agreed regarding the indexers.

How did you derive 256 as a multiplier?

@@ -97,19 +98,19 @@ public void Invoke(in RowInterval rows)
* are cleared for each loop iteration, to avoid the repeated allocation for each processed pixel. */
using IMemoryOwner<Vector4> sourceRowBuffer = this.configuration.MemoryAllocator.Allocate<Vector4>(this.source.Width);
using IMemoryOwner<Vector4> targetRowBuffer = this.configuration.MemoryAllocator.Allocate<Vector4>(this.source.Width);
using IMemoryOwner<float> bins = this.configuration.MemoryAllocator.Allocate<float>(this.levels * 4);
using IMemoryOwner<float> bins = this.configuration.MemoryAllocator.Allocate<float>(this.levels * 256 * 4);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be clamping this in the extension method? I'm trying to wrap my head around how this actually works again.

@JimBobSquarePants
Copy link
Member

Here's the article it was based upon.

https://softwarebydefault.com/2013/06/29/oil-painting-cartoon-filter/

@antonfirsov
Copy link
Member Author

antonfirsov commented Aug 25, 2023

Yeah, according to the article we should divide by 255 to get the intensity bin index instead of defining 255x more buckets.

int currentIntensity = (int)MathF.Round((sourceBlue + sourceGreen + sourceRed) / 3F * this.levels / 255f);

Originally:

currentIntensity = (int )Math.Round(((double) 
                               (pixelBuffer[calcOffset] + 
                               pixelBuffer[calcOffset + 1] + 
                               pixelBuffer[calcOffset + 2]) / 3.0 * 
                               (levels)) / 255.0); 

But with this tests are failing and the output looks like a simple blur. Unfortunately I cannot go deeper understanding right now.

@JimBobSquarePants
Copy link
Member

I’ll have a look tonight

@@ -140,29 +142,29 @@ public void Invoke(in RowInterval rows)
int offsetX = x + fxr;
offsetX = Numerics.Clamp(offsetX, 0, maxX);

Vector4 vector = sourceOffsetRow[offsetX].ToVector4();
Vector4 vector = Vector4.Clamp(sourceOffsetRow[offsetX].ToScaledVector4(), Vector4.Zero, Vector4.One);
Copy link
Member Author

@antonfirsov antonfirsov Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, ToScaledVector4 doesn't guarantee that the vector will be clamped to 0..1 since a pixel implementation is free to return anything. (In fact RgbaVector doesn't clamp!)

With clamping I was able to undo the buffer overallocation in 99771d6. (Which was in fact ad-hoc since RgbaVector can hold an arbitrary high value, way above 255.)

I think the PR should be good now if you are also fine with the changes @JimBobSquarePants .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t we fix RgbaVector? ScaledVector is supposed to guarantee 0…1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a look at several of our pixel formats. None of the sanitize the range in construction. We should fix that across the board.

Copy link
Member Author

@antonfirsov antonfirsov Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the sanitize the range in construction

I wonder which are the pixel types where ToScaledVector4() could go out of range? To me it looks like it can happen only with floating point ones (RgbaVector, HalfVector2|4, HalfSingle ...) Do you see any other ones?

Sanitizing in the constructor is insufficient since (1) there are public fields/properties to manipulate the values (2) low-level MemoryMarshal.Cast-like manipulation is completely normal when working with pixel buffers and it can also push values out of range. If we want a guarantee that ToScaledVector4() implementations always return values in 0..1, clipping should happen in ToScaledVector4().

However, even if guarantee this in our own pixel types, arbitrary user pixel types can still potentially violate the contract. If it only leads to an IndexOutOfRangeException, we can blame it on the user. However, if we have unsafe indexing depending on the the values returned from ToScaledVector4() (like we used to do in OilPaint) we have a serious bug in the processor. We may have a similar problem here:

var vector = Unsafe.Add(ref vectorRef, (uint)x);
int luminance = ColorNumerics.GetBT709Luminance(ref vector, levels);
float scaledLuminance = Unsafe.Add(ref cdfBase, (uint)luminance) / noOfPixelsMinusCdfMin;

My recommendation is to revisit these processors before doing anything else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the pixel formats accepting a float lack sanitation so Short2, Short4, and the NormalizedXX variants would be on the list also.

Scaled vector shouldn't clip. It should normalize the scale of the min max values of the pixel format. The mutability of the pixel formats makes sanitation difficult but as you say, avoiding unnecessary unsafe code turns a trivial exception to a serious one so we should definitely review and fix any processors that could be affected.

Regarding this PR. That clamp is going to hurt performance and should be unnecessary given the correct Vector4 range. I'd say we revert it and merge before continuing to sanitize the processors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See ac0c1f9. With aéé the confusion this PR started with, it would make sense to squash on merge.

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this will do for now until we figure out a way to clamp pixel ranges in an efficient manner.

@JimBobSquarePants JimBobSquarePants merged commit 54b7e04 into main Sep 2, 2023
7 checks passed
@JimBobSquarePants JimBobSquarePants deleted the fix-2518 branch September 2, 2023 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants