Time-memory Trade-offs for Saber+ on Memory-constrained RISC-V

Abstract

Saber is a module-lattice-based key encapsulation scheme that has been selected as a finalist in the NIST Post-Quantum Cryptography Standardization project. As Saber computes on considerably large (polynomial) matrices and vectors, its efficient implementation on memory-constrained IoT devices is very challenging. In this paper, we present an implementation of Saber with a minor tweak to the original Saber protocol for achieving reduced memory consumption and better performance. We call this tweaked implementation `Saber+'. Highly-optimized software implementation of Saber+ on a memory-constrained RISC-V platform achieves 48% performance improvement compared with the best state-of-the-art memory-optimized implementation of original Saber.

Specifically, we present various memory and performance optimizations for Saber+ on a memory-constrained RISC-V microcontroller, with merely 16KB of memory available. We utilize the Number Theoretic Transform (NTT) to speed up the polynomial multiplication in Saber+. For optimizing cycle counts and memory consumption during NTT, we carefully compare the efficiency of the complete and incomplete-NTTs, with platform-specific optimization. We implement 4-layers merging in the complete-NTT and 3-layers merging in the 6-layer incomplete-NTT. An improved on-the-fly generation of the public matrix and secret vector in Saber+ results in low memory footprint. Furthermore, by combining different optimization strategies, various time-memory trade-offs are explored. Our software implementation for Saber+ takes just 3,809K, 3,594K, and 3,193K clock cycles for key generation, encapsulation, and decapsulation, respectively, while consuming only 4.8KB of stack at most.

Different Trade-offs and in-complete NTT

We use different MACROs to choose different time-memory trade-offs and different in-complete NTTs. See src/SABER_params.h for detailed information.

Main MACROs used in our source code are as follows:

SABER_L: used to choose different variants of Saber. L=2 for LightSaber, L=3 for Saber, and L=4 for FireSaber.
FASTGENA_SLOWMUL: strategy 1 in our paper
FASTGENA_FASTMUL: strategy 2 in our paper
SLOWGENA_FASTMUL: strategy 3 in our paper
FIVE_LAYER_NTT: 5-layer incomplete-NTT
SIX_LAYER_NTT: 6-layer incomplete-NTT
SEVEN_LAYER_NTT: 7-layer incomplete-NTT
COMPLETE_NTT: 8-layer complete-NTT
NTTASM: define this MACRO will enable our assembly-NTT, otherwise the C-NTT is enabled

Platforms

SiFive Platform

Setup

Development Board: SiFive FE310 RISC-V CPU Experimental Setup: SiFive Freedom-E-SDK

Project:

benchmark/: benchmark related files
bsp/: board support package
help/: ipython script for searching parameters and c files for generating tables used in NTT.
src/: source files
.clang-format: format script for vscode
jlink.sh: jlink script for upload and debug
listen.py: python script for listening serial port

How to Run?

Attention: Due to the limited memory resources of SiFive platform, you should slightly modify our Makefile when you want to test different metrics (speed or memory) for different variants of Saber (L=2, 3, or 4). The main parameter needed to be modified is "__stack_size", and we give detailed comments about how to choose the value.

# compile
make
# upload executable file to board:
./jlink.sh --hex out/kem.hex --jlink JLinkExe
# debug using GDBServer
./jlink.sh --elf out/kem.elf --jlink JLinkGDBServer --gdb riscv64-unknown-elf-gdb

For the usage of GDBServer, we refer the readers to wiki.

If you want to test stack usage, modify the Makefile from

.PHONY: all
all: out/kem.elf

to

.PHONY: all
all: out/stack.elf

And the upload and debug commands become:

# compile
make
# upload executable file to board:
./jlink.sh --hex out/stack.hex --jlink JLinkExe
# debug using GDBServer
./jlink.sh --elf out/stack.elf --jlink JLinkGDBServer --gdb riscv64-unknown-elf-gdb

PQRISCV Platform

About the experimental setup, we refer to Kyber_RISC_V_Thesis, PQRISCV, and PQRISCV-VEXRISCV.

sbt "runMain mupq.PQVexRiscvSim --init ../Saber_RV32/out/kem_vexrv.bin"

Others

Code Size

riscv64-unknown-elf-nm out/kem.elf --print-size --size-sort --radix=d | \
grep -v '\<_\|\<metal\|\<pll_configs' | \
awk '{sum+=$2 ; print $0} END{print "Total size =", sum, "bytes =", sum/1024, "kB"}'

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Time-memory Trade-offs for Saber+ on Memory-constrained RISC-V

Abstract

Different Trade-offs and in-complete NTT

Platforms

SiFive Platform

Setup

How to Run?

PQRISCV Platform

Others

Code Size

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 208 Commits
benchmark		benchmark
bsp		bsp
help		help
src		src
.clang-format		.clang-format
.gitignore		.gitignore
Makefile		Makefile
MakefileVexRv.mk		MakefileVexRv.mk
README.md		README.md
gdb_script		gdb_script
generateTable		generateTable
jlink.sh		jlink.sh
listen.py		listen.py

Ji-Peng/Saber_RV32

Folders and files

Latest commit

History

Repository files navigation

Time-memory Trade-offs for Saber+ on Memory-constrained RISC-V

Abstract

Different Trade-offs and in-complete NTT

Platforms

SiFive Platform

Setup

How to Run?

PQRISCV Platform

Others

Code Size

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages