examples/kdigest: add AF_ALG hash example #336

Merged
axboe merged 1 commit into axboe:master from af_alg_hash_example_v2 on Oct 1, 2024

Conversation

ddiss
Contributor

@ddiss ddiss commented Apr 29, 2021

When built with CONFIG_CRYPTO_USER_API_HASH enabled, Linux exposes a
socket based API for hashing data. When coupled with uring
IOSQE_IO_LINK, file hashing can be done in quite an efficient manner,
as demonstrated in this link-cp.c based example.

Signed-off-by: David Disseldorp <ddiss@suse.de>
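
For readers unfamiliar with the socket-based hash API mentioned above, here is a minimal sketch of the AF_ALG flow using plain blocking syscalls. It is not the kdigest.c code and deliberately leaves out io_uring and IOSQE_IO_LINK. The helper name `af_alg_hash` is hypothetical, and the sketch assumes a small in-memory buffer that fits in a single send() and a kernel built with CONFIG_CRYPTO_USER_API_HASH.

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of kdigest.c): hash `len` bytes of `data`
 * with the kernel algorithm `alg` (e.g. "sha256"). Returns the number of
 * digest bytes written to `digest`, or -1 on any error, including a kernel
 * without AF_ALG hash support.
 */
static ssize_t af_alg_hash(const char *alg, const void *data, size_t len,
                           unsigned char *digest, size_t dlen)
{
    struct sockaddr_alg sa = { .salg_family = AF_ALG, .salg_type = "hash" };
    ssize_t ret = -1;
    int tfd, opfd;

    assert(alg && data && digest && len > 0);
    if (strlen(alg) >= sizeof(sa.salg_name))
        return -1;
    strcpy((char *)sa.salg_name, alg);

    /* The "transform" socket selects the algorithm via bind(). */
    tfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfd < 0)
        return -1;
    if (bind(tfd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        goto out_tfd;

    /* accept() yields the operation socket that data is written to. */
    opfd = accept(tfd, NULL, 0);
    if (opfd < 0)
        goto out_tfd;

    /* A send without MSG_MORE marks the end of the input data. */
    if (send(opfd, data, len, 0) != (ssize_t)len)
        goto out_opfd;

    /* Reading from the operation socket returns the digest. */
    ret = recv(opfd, digest, dlen, 0);
out_opfd:
    close(opfd);
out_tfd:
    close(tfd);
    return ret;
}
```

Calling `af_alg_hash("sha256", "abc", 3, buf, sizeof(buf))` should return 32 where the kernel provides sha256 via AF_ALG, and -1 where the interface is unavailable. Larger inputs would be streamed with repeated sends flagged MSG_MORE, which is the part the io_uring example offloads.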

@ddiss
Contributor Author

ddiss commented Apr 29, 2021

v1 was submitted via #335.

@ddiss
Contributor Author

ddiss commented Apr 30, 2021

Changes since previous version:

  • add some more crude error checks
  • print newline after hash

@ddiss
Contributor Author

ddiss commented May 6, 2021

@isilence any further thoughts on this patch, or should I drop it?
FWIW, basic benchmark results can be found at #336 (comment)

@ddiss
Contributor Author

ddiss commented May 11, 2021

@axboe feel free to close if there's no interest in this change.

@isilence
Collaborator

I don't really mind it getting merged if it's sane enough, but apparently nobody has had time to take a look. I'd rather it stay open until then.

@axboe
Owner

axboe commented Sep 18, 2021

@ddiss Were you planning on respinning this change?

@ddiss
Contributor Author

ddiss commented Sep 18, 2021

> @ddiss Were you planning on respinning this change?

Thanks for the ping. Yes, the respin is still on my todo list, but some other things got in the way. I'm hoping to return to this in a few weeks. I'll close here for now and reopen when ready.

@ddiss ddiss closed this Sep 18, 2021
@ddiss
Contributor Author

ddiss commented Oct 1, 2024

I finally got a chance to revisit this, so will post an update here, hopefully addressing Pavel's feedback.

@ddiss ddiss reopened this Oct 1, 2024
@ddiss ddiss force-pushed the af_alg_hash_example_v2 branch 2 times, most recently from 3f9e67c to 017801f, on October 1, 2024 14:10
@ddiss
Contributor Author

ddiss commented Oct 1, 2024

changes since 2021:

  • rewrite dispatch logic to submit reads in parallel alongside linked write I/Os
  • use a simple read/write index + state for tracking sequential submission
  • preallocate aligned I/O buffers
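
As a rough illustration of the read/write index idea in the list above, the following sketch tracks a fixed ring of preallocated, aligned buffers. Reads may finish in any order, but a buffer is only handed to the send side when it is the oldest outstanding one. This is not the kdigest.c implementation: every name here (`struct pipeline`, `next_read_slot`, `NBUFS`, `BUF_SIZE` and so on) is hypothetical, and the actual io_uring submission and completion handling is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define NBUFS    4       /* hypothetical number of in-flight buffers */
#define BUF_SIZE 65536   /* hypothetical per-buffer size, multiple of 4096 */

enum buf_state { BUF_FREE, BUF_READING, BUF_READY, BUF_SENDING };

struct buf {
    void *data;
    enum buf_state state;
};

struct pipeline {
    struct buf bufs[NBUFS];
    unsigned read_idx;   /* counts reads issued; next slot is read_idx % NBUFS */
    unsigned send_idx;   /* counts sends issued; next slot is send_idx % NBUFS */
};

static void pipeline_free(struct pipeline *p)
{
    for (int i = 0; i < NBUFS; i++) {
        free(p->bufs[i].data);
        p->bufs[i].data = NULL;
    }
}

/* Preallocate page-aligned buffers up front so the I/O path never allocates. */
static int pipeline_init(struct pipeline *p)
{
    for (int i = 0; i < NBUFS; i++)
        p->bufs[i].data = NULL;
    for (int i = 0; i < NBUFS; i++) {
        p->bufs[i].data = aligned_alloc(4096, BUF_SIZE);
        if (!p->bufs[i].data) {
            pipeline_free(p);
            return -1;
        }
        p->bufs[i].state = BUF_FREE;
    }
    p->read_idx = p->send_idx = 0;
    return 0;
}

/* Claim the next slot for a read, or return -1 if it is still in use. */
static int next_read_slot(struct pipeline *p)
{
    unsigned i = p->read_idx % NBUFS;

    if (p->bufs[i].state != BUF_FREE)
        return -1;
    p->bufs[i].state = BUF_READING;
    p->read_idx++;
    return (int)i;
}

/* Record a read completion; completions may arrive in any order. */
static void read_done(struct pipeline *p, int slot)
{
    assert(slot >= 0 && slot < NBUFS);
    p->bufs[slot].state = BUF_READY;
}

/* Claim the oldest unsent slot, but only once its read has completed. */
static int next_send_slot(struct pipeline *p)
{
    unsigned i = p->send_idx % NBUFS;

    if (p->bufs[i].state != BUF_READY)
        return -1;
    p->bufs[i].state = BUF_SENDING;
    p->send_idx++;
    return (int)i;
}

/* Record a send completion, making the buffer available for reads again. */
static void send_done(struct pipeline *p, int slot)
{
    assert(slot >= 0 && slot < NBUFS);
    p->bufs[slot].state = BUF_FREE;
}
```

The point of the two indices is that hash input has to reach the socket in file order, so a read that completes early simply waits in BUF_READY until every earlier buffer has been sent.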

@axboe
Owner

axboe commented Oct 1, 2024

Thanks for picking this up again! Can you address the CI failures? Looks like it's just basic complaints on things that should be static. I think this would be a good example to add.

When built with CONFIG_CRYPTO_USER_API_HASH enabled, Linux exposes a
socket based API for hashing data. When coupled with uring
IOSQE_IO_LINK, file hashing can be done in quite an efficient manner,
as demonstrated in this example.

Signed-off-by: David Disseldorp <ddiss@suse.de>
@ddiss
Contributor Author

ddiss commented Oct 1, 2024

changes to address CI failures:

  • drop unused debug macro / statements
  • cast off_t for fprintf()
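
For context on the off_t cast: the width of off_t differs between ABIs, so no single printf length modifier matches it everywhere, and compilers warn when the format and argument disagree. A common fix is to cast to long long and print with %lld. The sketch below uses snprintf() rather than fprintf() so the result can be inspected. The helper name `format_offset` is hypothetical and not taken from kdigest.c.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format a file offset portably. off_t may be 32 or
 * 64 bits wide, so cast it to long long to match the %lld conversion.
 * Returns the snprintf() result (length of the formatted string).
 */
static int format_offset(char *out, size_t outlen, off_t off)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen, "offset %lld", (long long)off);
}
```

The same cast works unchanged in an fprintf(stderr, ...) call.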

@axboe
Owner

axboe commented Oct 1, 2024

Thanks!

@axboe axboe merged commit 189ea06 into axboe:master Oct 1, 2024
15 checks passed
@axboe
Owner

axboe commented Oct 1, 2024

Rather than use IOSQE_IO_LINK, an improvement here may be to use ring provided buffers and bundle send to send data. That should be more efficient, at least on kernels that support both of those features.

@axboe
Owner

axboe commented Oct 1, 2024

73a7003

That adds bundled sends. Didn't test it much, but seems to work.

@axboe
Owner

axboe commented Oct 1, 2024

Ran a quick test on an AMD box, and it ranges from ~5x faster at the lower sizes to being even for 16MB/32MB. Ran your test on it. So whether that helped or not, not sure, but at least it's no longer slower for bigger sizes and it's still considerably faster for sizes that people would actually use.

@ddiss
Contributor Author

ddiss commented Oct 2, 2024

Thanks for the follow up changes. FWIW, I did play around with merging buffers for the send, but didn't see much perf improvement for the added complexity. Bundling looks good.

@axboe
Owner

axboe commented Oct 2, 2024

Right, with bundles it's just as easy as using links, at least. But I like having both in the example, as it shows how to use either one.

@ddiss ddiss deleted the af_alg_hash_example_v2 branch October 3, 2024 03:00
@ddiss
Contributor Author

ddiss commented Oct 4, 2024

In case anyone is interested, below are some rough kdigest (c90e5a6) perf measurements:

kdigest-c90e5a6 (seconds elapsed, mean of 5 runs as reported by perf stat -r 5)

  \ FILE
   \SIZE    512 bytes     |       4096       |      65536       |     1048576      |     16777216     |     33554432     |
HASH\__|__________________|__________________|__________________|__________________|__________________|__________________|
   md5 | 0.0010984+-1.44% | 0.0008265+-2.65% | 0.0009556+-1.52% | 0.0024098+-0.35% | 0.0266533+-0.06% | 0.0521402+-0.19% |
  sha1 | 0.0011012+-1.59% | 0.0009430+-2.21% | 0.0009173+-1.89% | 0.0019097+-1.29% | 0.0186466+-0.18% | 0.0361400+-0.13% |
sha224 | 0.0010983+-1.29% | 0.0010970+-1.72% | 0.0010350+-1.32% | 0.0036209+-0.42% | 0.0425834+-0.06% | 0.0841893+-0.06% |
sha256 | 0.0010996+-1.34% | 0.0011159+-1.74% | 0.0010299+-1.09% | 0.0036085+-0.40% | 0.0426466+-0.12% | 0.0840805+-0.03% |
sha384 | 0.0011094+-1.58% | 0.0011204+-1.48% | 0.0009555+-2.96% | 0.0027775+-0.58% | 0.0296736+-0.27% |  0.058271+-0.27% |
sha512 | 0.0010909+-1.71% | 0.0010763+-3.21% | 0.0009746+-1.77% | 0.0027719+-0.74% | 0.0297744+-0.26% |  0.058239+-0.25% |

openssl-3.1.4-3.2 (seconds elapsed, mean of 5 runs as reported by perf stat -r 5)

  \ FILE
   \SIZE    512 bytes     |       4096       |      65536       |     1048576      |     16777216     |     33554432     |
HASH\__|__________________|__________________|__________________|__________________|__________________|__________________|
   md5 | 0.0039263+-0.81% | 0.0029834+-1.69% | 0.0030833+-1.28% | 0.0044167+-0.87% | 0.0286969+-0.20% | 0.0536009+-0.16% |
  sha1 | 0.0039302+-1.16% | 0.0029809+-1.62% | 0.0030169+-0.99% | 0.0040051+-1.04% | 0.0220672+-0.29% | 0.0414711+-0.19% |
sha224 | 0.0039211+-0.99% | 0.0039417+-0.95% | 0.0031360+-1.43% | 0.0055392+-0.69% | 0.0408525+-0.18% |  0.078564+-0.20% |
sha256 | 0.0039277+-0.60% | 0.0039653+-0.97% | 0.0031659+-1.26% | 0.0055284+-0.76% | 0.0408774+-0.11% | 0.0788206+-0.10% |
sha384 | 0.0039370+-1.13% | 0.0039494+-0.87% | 0.0030840+-1.26% | 0.0047742+-0.81% |  0.029442+-0.35% |  0.056091+-0.24% |
sha512 | 0.0039456+-1.02% | 0.0039779+-1.09% | 0.0030739+-1.26% | 0.0047586+-0.55% | 0.0294435+-0.20% |  0.056350+-0.31% |
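
To make the two tables easier to compare, here is a small sketch that computes the openssl/kdigest elapsed-time ratio for the md5 row. The numbers are copied from the tables above. The array and function names are hypothetical. Note that perf stat times the whole command, so at small file sizes the ratio likely reflects process startup cost as much as hashing speed.

```c
#include <assert.h>
#include <stdio.h>

#define NSIZES 6

/* File sizes in bytes, in table column order. */
static const long sizes[NSIZES] = {
    512, 4096, 65536, 1048576, 16777216, 33554432
};
/* md5 rows copied from the kdigest and openssl tables above. */
static const double kdigest_md5[NSIZES] = {
    0.0010984, 0.0008265, 0.0009556, 0.0024098, 0.0266533, 0.0521402
};
static const double openssl_md5[NSIZES] = {
    0.0039263, 0.0029834, 0.0030833, 0.0044167, 0.0286969, 0.0536009
};

/* Ratio above 1.0 means kdigest finished faster than openssl. */
static double md5_speedup(int i)
{
    assert(i >= 0 && i < NSIZES);
    return openssl_md5[i] / kdigest_md5[i];
}

/* Print one line per file size with the openssl/kdigest ratio. */
static void print_md5_speedups(void)
{
    for (int i = 0; i < NSIZES; i++)
        printf("%9ld bytes: %.2fx\n", sizes[i], md5_speedup(i));
}
```

For md5 the ratio is a little over 3.5x at 512 bytes and close to 1x at 32MB, which matches the general shape described earlier in the thread.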

Benchmark script:

for size in $((32 * 1024 * 1024)) $((16 * 1024 * 1024)) $((1024 * 1024)) \
            $((64 * 1024)) $((4 * 1024)) 512; do
    dd if=/dev/urandom of="${size}.data" bs="$size" count=1 || break
    echo "==== hashing file of size $size ===="
    for i in md5 sha1 sha224 sha256 sha384 sha512; do
        # prime cache
        cat "${size}.data" > /dev/null
        perf stat --null -r 5 --table openssl "$i" "${size}.data" >/dev/null 2>openssl.${size}.${i}.perf
        perf stat --null -r 5 --table ~/liburing/examples/kdigest "$i" "${size}.data" >/dev/null 2>kdigest.${size}.${i}.perf
    done
done

Host details:

    Kernel: openSUSE Tumbleweed 6.11.0-1-default
    CPU: Intel(R) Xeon(R) CPU E3-1260L v5 @ 2.90GHz
    Thread(s) per core:   2
    Core(s) per socket:   4
    Socket(s):            1
    RAM: 64GB
