Skip to content

Commit

Permalink
Introduce BLAKE3 checksums as an OpenZFS feature
Browse files Browse the repository at this point in the history
This commit adds BLAKE3 checksums to OpenZFS, it has similar
performance to Edon-R, but without the caveats around the latter.

Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3
Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3

Short description of Wikipedia:

  BLAKE3 is a cryptographic hash function based on Bao and BLAKE2,
  created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and
  Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real
  World Crypto. BLAKE3 is a single algorithm with many desirable
  features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE
  and BLAKE2, which are algorithm families with multiple variants.
  BLAKE3 has a binary tree structure, so it supports a practically
  unlimited degree of parallelism (both SIMD and multithreading) given
  enough input. The official Rust and C implementations are
  dual-licensed as public domain (CC0) and the Apache License.

Along with adding the BLAKE3 hash into the OpenZFS infrastructure a
new benchmarking file called chksum_bench was introduced.  When read
it reports the speed of the available checksum functions.

On Linux: cat /proc/spl/kstat/zfs/chksum_bench
On FreeBSD: sysctl kstat.zfs.misc.chksum_bench

This is an example output of an i3-1005G1 test system with Debian 11:

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1196    1602    1761    1749    1762    1759    1751
skein-generic      546     591     608     615     619     612     616
sha256-generic     240     300     316     314     304     285     276
sha512-generic     353     441     467     476     472     467     426
blake3-generic     308     313     313     313     312     313     312
blake3-sse2        402    1289    1423    1446    1432    1458    1413
blake3-sse41       427    1470    1625    1704    1679    1607    1629
blake3-avx2        428    1920    3095    3343    3356    3318    3204
blake3-avx512      473    2687    4905    5836    5844    5643    5374

Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1840    2458    2665    2719    2711    2723    2693
skein-generic      870     966     996     992    1003    1005    1009
sha256-generic     415     442     453     455     457     457     457
sha512-generic     608     690     711     718     719     720     721
blake3-generic     301     313     311     309     309     310     310
blake3-sse2        343    1865    2124    2188    2180    2181    2186
blake3-sse41       364    2091    2396    2509    2463    2482    2488
blake3-avx2        365    2590    4399    4971    4915    4802    4764

Output on Debian 5.10.0-9-powerpc64le system: (POWER 9)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1213    1703    1889    1918    1957    1902    1907
skein-generic      434     492     520     522     511     525     525
sha256-generic     167     183     187     188     188     187     188
sha512-generic     186     216     222     221     225     224     224
blake3-generic     153     152     154     153     151     153     153
blake3-sse2        391    1170    1366    1406    1428    1426    1414
blake3-sse41       352    1049    1212    1174    1262    1258    1259

Output on Debian 5.10.0-11-arm64 system: (Pi400)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic      487     603     629     639     643     641     641
skein-generic      271     299     303     308     309     309     307
sha256-generic     117     127     128     130     130     129     130
sha512-generic     145     165     170     172     173     174     175
blake3-generic      81      29      71      89      89      89      89
blake3-sse2        112     323     368     379     380     371     374
blake3-sse41       101     315     357     368     369     364     360

Structurally, the new code is mainly split into these parts:
- 1x cross platform generic c variant: blake3_generic.c
- 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512)
- 2x assembly for ARMv8 (NEON converted from SSE2)
- 2x assembly for PPC64-LE (POWER8 converted from SSE2)
- one file for switching between the implementations

Note the PPC64 assembly requires the VSX instruction set and the
kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly.

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Closes openzfs#10058
Closes openzfs#12918
  • Loading branch information
mcmilk authored and andrewc12 committed Sep 23, 2022
1 parent e69880b commit bacd07a
Show file tree
Hide file tree
Showing 53 changed files with 22,804 additions and 52 deletions.
1 change: 1 addition & 0 deletions AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -285,6 +285,7 @@ CONTRIBUTORS:
Tim Connors <tconnors@rather.puzzling.org>
Tim Crawford <tcrawford@datto.com>
Tim Haley <Tim.Haley@Sun.COM>
Tino Reichardt <milky-zfs@mcmilk.de>
Tobin Harding <me@tobin.cc>
Tom Caputi <tcaputi@datto.com>
Tom Matthews <tom@axiom-partners.com>
Expand Down
89 changes: 89 additions & 0 deletions cmd/ztest.c
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,7 @@
#include <sys/zfeature.h>
#include <sys/dsl_userhold.h>
#include <sys/abd.h>
#include <sys/blake3.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
Expand Down Expand Up @@ -417,6 +418,7 @@ ztest_func_t ztest_device_removal;
ztest_func_t ztest_spa_checkpoint_create_discard;
ztest_func_t ztest_initialize;
ztest_func_t ztest_trim;
ztest_func_t ztest_blake3;
ztest_func_t ztest_fletcher;
ztest_func_t ztest_fletcher_incr;
ztest_func_t ztest_verify_dnode_bt;
Expand Down Expand Up @@ -470,6 +472,7 @@ ztest_info_t ztest_info[] = {
ZTI_INIT(ztest_spa_checkpoint_create_discard, 1, &zopt_rarely),
ZTI_INIT(ztest_initialize, 1, &zopt_sometimes),
ZTI_INIT(ztest_trim, 1, &zopt_sometimes),
ZTI_INIT(ztest_blake3, 1, &zopt_rarely),
ZTI_INIT(ztest_fletcher, 1, &zopt_rarely),
ZTI_INIT(ztest_fletcher_incr, 1, &zopt_rarely),
ZTI_INIT(ztest_verify_dnode_bt, 1, &zopt_sometimes),
Expand Down Expand Up @@ -6373,6 +6376,92 @@ ztest_reguid(ztest_ds_t *zd, uint64_t id)
VERIFY3U(load, ==, spa_load_guid(spa));
}

void
ztest_blake3(ztest_ds_t *zd, uint64_t id)
{
(void) zd, (void) id;
hrtime_t end = gethrtime() + NANOSEC;
zio_cksum_salt_t salt;
void *salt_ptr = &salt.zcs_bytes;
struct abd *abd_data, *abd_meta;
void *buf, *templ;
int i, *ptr;
uint32_t size;
BLAKE3_CTX ctx;

size = ztest_random_blocksize();
buf = umem_alloc(size, UMEM_NOFAIL);
abd_data = abd_alloc(size, B_FALSE);
abd_meta = abd_alloc(size, B_TRUE);

for (i = 0, ptr = buf; i < size / sizeof (*ptr); i++, ptr++)
*ptr = ztest_random(UINT_MAX);
memset(salt_ptr, 'A', 32);

abd_copy_from_buf_off(abd_data, buf, 0, size);
abd_copy_from_buf_off(abd_meta, buf, 0, size);

while (gethrtime() <= end) {
int run_count = 100;
zio_cksum_t zc_ref1, zc_ref2;
zio_cksum_t zc_res1, zc_res2;

void *ref1 = &zc_ref1;
void *ref2 = &zc_ref2;
void *res1 = &zc_res1;
void *res2 = &zc_res2;

/* BLAKE3_KEY_LEN = 32 */
VERIFY0(blake3_set_impl_name("generic"));
templ = abd_checksum_blake3_tmpl_init(&salt);
Blake3_InitKeyed(&ctx, salt_ptr);
Blake3_Update(&ctx, buf, size);
Blake3_Final(&ctx, ref1);
zc_ref2 = zc_ref1;
ZIO_CHECKSUM_BSWAP(&zc_ref2);
abd_checksum_blake3_tmpl_free(templ);

VERIFY0(blake3_set_impl_name("cycle"));
while (run_count-- > 0) {

/* Test current implementation */
Blake3_InitKeyed(&ctx, salt_ptr);
Blake3_Update(&ctx, buf, size);
Blake3_Final(&ctx, res1);
zc_res2 = zc_res1;
ZIO_CHECKSUM_BSWAP(&zc_res2);

VERIFY0(memcmp(ref1, res1, 32));
VERIFY0(memcmp(ref2, res2, 32));

/* Test ABD - data */
templ = abd_checksum_blake3_tmpl_init(&salt);
abd_checksum_blake3_native(abd_data, size,
templ, &zc_res1);
abd_checksum_blake3_byteswap(abd_data, size,
templ, &zc_res2);

VERIFY0(memcmp(ref1, res1, 32));
VERIFY0(memcmp(ref2, res2, 32));

/* Test ABD - metadata */
abd_checksum_blake3_native(abd_meta, size,
templ, &zc_res1);
abd_checksum_blake3_byteswap(abd_meta, size,
templ, &zc_res2);
abd_checksum_blake3_tmpl_free(templ);

VERIFY0(memcmp(ref1, res1, 32));
VERIFY0(memcmp(ref2, res2, 32));

}
}

abd_free(abd_data);
abd_free(abd_meta);
umem_free(buf, size);
}

void
ztest_fletcher(ztest_ds_t *zd, uint64_t id)
{
Expand Down
2 changes: 2 additions & 0 deletions config/always-arch.m4
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_ARCH], [
;;
esac
AM_CONDITIONAL([TARGET_CPU_AARCH64], test $TARGET_CPU = aarch64)
AM_CONDITIONAL([TARGET_CPU_X86_64], test $TARGET_CPU = x86_64)
AM_CONDITIONAL([TARGET_CPU_POWERPC], test $TARGET_CPU = powerpc)
AM_CONDITIONAL([TARGET_CPU_SPARC64], test $TARGET_CPU = sparc64)
])
2 changes: 2 additions & 0 deletions include/Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ COMMON_H = \
sys/avl.h \
sys/avl_impl.h \
sys/bitops.h \
sys/blake3.h \
sys/blkptr.h \
sys/bplist.h \
sys/bpobj.h \
Expand Down Expand Up @@ -117,6 +118,7 @@ COMMON_H = \
sys/zfeature.h \
sys/zfs_acl.h \
sys/zfs_bootenv.h \
sys/zfs_chksum.h \
sys/zfs_context.h \
sys/zfs_debug.h \
sys/zfs_delay.h \
Expand Down
2 changes: 2 additions & 0 deletions include/os/freebsd/spl/sys/ccompile.h
Original file line number Diff line number Diff line change
Expand Up @@ -74,10 +74,12 @@ extern "C" {

#ifndef LOCORE
#ifndef HAVE_RPC_TYPES
#ifndef _KERNEL
typedef int bool_t;
typedef int enum_t;
#endif
#endif
#endif

#ifndef __cplusplus
#define __init
Expand Down
34 changes: 27 additions & 7 deletions include/os/linux/kernel/linux/simd_powerpc.h
Original file line number Diff line number Diff line change
Expand Up @@ -57,25 +57,45 @@
#include <sys/types.h>
#include <linux/version.h>

#define kfpu_allowed() 1
#define kfpu_begin() \
{ \
preempt_disable(); \
enable_kernel_altivec(); \
}
#define kfpu_allowed() 1

#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 5, 0)
#define kfpu_end() \
{ \
disable_kernel_vsx(); \
disable_kernel_altivec(); \
preempt_enable(); \
}
#define kfpu_begin() \
{ \
preempt_disable(); \
enable_kernel_altivec(); \
enable_kernel_vsx(); \
}
#else
/* seems that before 4.5 no-one bothered disabling ... */
/* seems that before 4.5 no-one bothered */
#define kfpu_begin()
#define kfpu_end() preempt_enable()
#endif
#define kfpu_init() 0
#define kfpu_fini() ((void) 0)

static inline boolean_t
zfs_vsx_available(void)
{
boolean_t res;
#if defined(__powerpc64__)
u64 msr;
#else
u32 msr;
#endif
kfpu_begin();
__asm volatile("mfmsr %0" : "=r"(msr));
res = (msr & 0x800000) != 0;
kfpu_end();
return (res);
}

/*
* Check if AltiVec instruction set is available
*/
Expand Down
120 changes: 120 additions & 0 deletions include/sys/blake3.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/

/*
* Based on BLAKE3 v1.3.1, https://github.com/BLAKE3-team/BLAKE3
* Copyright (c) 2019-2020 Samuel Neves and Jack O'Connor
* Copyright (c) 2021 Tino Reichardt <milky-zfs@mcmilk.de>
*/

#ifndef BLAKE3_H
#define BLAKE3_H

#ifdef _KERNEL
#include <sys/types.h>
#else
#include <stdint.h>
#include <stdlib.h>
#endif

#ifdef __cplusplus
extern "C" {
#endif

#define BLAKE3_KEY_LEN 32
#define BLAKE3_OUT_LEN 32
#define BLAKE3_MAX_DEPTH 54
#define BLAKE3_BLOCK_LEN 64
#define BLAKE3_CHUNK_LEN 1024

/*
* This struct is a private implementation detail.
* It has to be here because it's part of BLAKE3_CTX below.
*/
typedef struct {
uint32_t cv[8];
uint64_t chunk_counter;
uint8_t buf[BLAKE3_BLOCK_LEN];
uint8_t buf_len;
uint8_t blocks_compressed;
uint8_t flags;
} blake3_chunk_state_t;

typedef struct {
uint32_t key[8];
blake3_chunk_state_t chunk;
uint8_t cv_stack_len;

/*
* The stack size is MAX_DEPTH + 1 because we do lazy merging. For
* example, with 7 chunks, we have 3 entries in the stack. Adding an
* 8th chunk requires a 4th entry, rather than merging everything down
* to 1, because we don't know whether more input is coming. This is
* different from how the reference implementation does things.
*/
uint8_t cv_stack[(BLAKE3_MAX_DEPTH + 1) * BLAKE3_OUT_LEN];

/* const blake3_impl_ops_t *ops */
const void *ops;
} BLAKE3_CTX;

/* init the context for hash operation */
void Blake3_Init(BLAKE3_CTX *ctx);

/* init the context for a MAC and/or tree hash operation */
void Blake3_InitKeyed(BLAKE3_CTX *ctx, const uint8_t key[BLAKE3_KEY_LEN]);

/* process the input bytes */
void Blake3_Update(BLAKE3_CTX *ctx, const void *input, size_t input_len);

/* finalize the hash computation and output the result */
void Blake3_Final(const BLAKE3_CTX *ctx, uint8_t *out);

/* finalize the hash computation and output the result */
void Blake3_FinalSeek(const BLAKE3_CTX *ctx, uint64_t seek, uint8_t *out,
size_t out_len);

/* return number of supported implementations */
extern int blake3_get_impl_count(void);

/* return id of selected implementation */
extern int blake3_get_impl_id(void);

/* return name of selected implementation */
extern const char *blake3_get_impl_name(void);

/* setup id as fastest implementation */
extern void blake3_set_impl_fastest(uint32_t id);

/* set implementation by id */
extern void blake3_set_impl_id(uint32_t id);

/* set implementation by name */
extern int blake3_set_impl_name(const char *name);

/* set startup implementation */
extern void blake3_setup_impl(void);

#ifdef __cplusplus
}
#endif

#endif /* BLAKE3_H */
48 changes: 48 additions & 0 deletions include/sys/zfs_chksum.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/

/*
* Copyright (c) 2021 Tino Reichardt <milky-zfs@mcmilk.de>
*/

#ifndef _ZFS_CHKSUM_H
#define _ZFS_CHKSUM_H

#ifdef _KERNEL
#include <sys/types.h>
#else
#include <stdint.h>
#include <stdlib.h>
#endif

#ifdef __cplusplus
extern "C" {
#endif

/* Benchmark the chksums of ZFS when the module is loading */
void chksum_init(void);
void chksum_fini(void);

#ifdef __cplusplus
}
#endif

#endif /* _ZFS_CHKSUM_H */
Loading

0 comments on commit bacd07a

Please sign in to comment.