Refactor hflip, vflip functions to allow preallocation #117

emilmgeorge · 2024-09-01T18:33:02Z

For #103

crates/kornia-imgproc/src/flip.rs

emilmgeorge · 2024-09-02T05:08:05Z

Fixed formatting.

crates/kornia-imgproc/src/flip.rs

emilmgeorge · 2024-09-10T16:34:21Z

I benchmarked horizontal_flip with the following code variations:

(1) par_par_slicecopy => Row: Parallel Iter -- Column: Parallel Iter -- Channel: SliceCopy
(2) par_loop_loop => Row: Parallel Iter -- Column: HalfLoop -- Channel: Loop
(3) par_loop_slicecopy => Row: Parallel Iter -- Column: HalfLoop -- Channel: SliceCopy
(4) par_seq_slicecopy => Row: Parallel Iter -- Column: Normal (Sequential) Iter -- Channel: SliceCopy
(5) kornia => (The first code in this PR, with the full loop) Row: Parallel Iter -- Column: FullLoop -- Channel: Loop

The general trend is that (4) usually comes out on top, followed closely by (3), then (2) and (5), with (1) last. For larger images (1024x896), the differences between (4), (3) and (2) are usually small and inconsistent between runs.
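For readers who want to see the shape of the par_seq_slicecopy variant, here is a minimal std-only sketch. It is not the code from this PR: the function name `hflip_into`, the flat interleaved (HWC) buffer layout, and the `String` error type are assumptions made for illustration, and the rows are walked sequentially here so the example has no external dependencies, whereas the benchmarked variant iterates rows in parallel.

```rust
/// Horizontally flip an interleaved (HWC) image from `src` into a
/// preallocated `dst`. Hypothetical helper, not the kornia-imgproc API.
pub fn hflip_into<T: Copy>(
    src: &[T],
    dst: &mut [T],
    width: usize,
    height: usize,
    channels: usize,
) -> Result<(), String> {
    // Both buffers must already have the full image size; nothing is allocated here.
    let expected = width * height * channels;
    if src.len() != expected || dst.len() != expected {
        return Err(format!(
            "size mismatch: expected {} elements, got src={} dst={}",
            expected,
            src.len(),
            dst.len()
        ));
    }

    let row_len = width * channels;
    if row_len == 0 {
        // Empty image: nothing to copy (and chunks_exact(0) would panic).
        return Ok(());
    }

    // Rows: sequential in this sketch; the benchmarked variant uses a parallel iterator.
    for (src_row, dst_row) in src
        .chunks_exact(row_len)
        .zip(dst.chunks_exact_mut(row_len))
    {
        // Columns: plain sequential iterator, reversed on the source side.
        for (src_px, dst_px) in src_row
            .chunks_exact(channels)
            .rev()
            .zip(dst_row.chunks_exact_mut(channels))
        {
            // Channels: one slice copy per pixel instead of a per-channel loop.
            dst_px.copy_from_slice(src_px);
        }
    }
    Ok(())
}
```

Because the caller owns `dst`, the same buffer can be reused across calls, which is the point of the preallocation refactor. The channel slice copy keeps the inner loop free of per-element indexing.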

Criterion.rs Line chart (open in new tab if in dark mode):

The average times from the last 1024x896 run are:
(1) 1.3298 ms
(2) 1.2114 ms
(3) 1.2118 ms
(4) 1.2062 ms
(5) 1.2484 ms

Here's the full report generated by criterion:
criterion.zip

Please let me know your thoughts on this or if you have any other ideas to compare.

edgarriba · 2024-09-10T20:03:47Z

Let's keep par_seq_slicecopy, as it uses iterators and its performance seems reasonable?

emilmgeorge · 2024-09-11T06:56:17Z

Done.

edgarriba · 2024-09-11T07:14:36Z

Merging this. As a follow-up PR, it would be great to verify against known libraries like image-rs.

emilmgeorge · 2024-09-11T15:23:28Z

I tried including image_rs in the benchmark (similar to how it is done in bench_resize.rs), but my PC (16 GB RAM) does not have enough memory for the full test. The process gets terminated by the Linux OOM killer before the 1024x896 test. The resize benchmark also crashes midway on my PC.

For 256x224 and 512x448, image-rs performs much worse (e.g. 1.0548 ms for image_rs vs 161.53 µs for par_seq_slicecopy).

For reference, here's the patch (let me know if you want me to include it in this PR):
0001-Add-image_rs-to-the-flip-benchmark.patch.txt

emilmgeorge force-pushed the main branch from 9fcf490 to fac637c Compare September 2, 2024 04:40

edgarriba requested changes Sep 2, 2024

emilmgeorge force-pushed the main branch from fac637c to f5eda2d Compare September 10, 2024 16:27

emilmgeorge requested a review from edgarriba September 10, 2024 16:35

emilmgeorge added 6 commits September 11, 2024 12:29

Refactor horizontal_flip fn to allow preallocation

3d9beb6

Refactor vertical_flip fn to allow preallocation

514f537

Make hflip, vflip tests 3 channel

f170be2

Review comment 1: image.size()

ab47458

Add benchmarking for imgproc::flip

d1cf48d

Update hflip, vflip based on benchmarks

ecb3e4f

emilmgeorge force-pushed the main branch from 64a4d82 to ecb3e4f Compare September 11, 2024 07:03

edgarriba approved these changes Sep 11, 2024

edgarriba merged commit c7fafef into kornia:main Sep 11, 2024
9 checks passed

edgarriba mentioned this pull request Sep 19, 2024

Update hflip and vflip with mutable signatures #103

Closed

emilmgeorge mentioned this pull request Sep 23, 2024

Fix TensorStorage memory deallocation #145

Merged
