Different round keys for columns 0,1 and 2,3 in AesGenerator4R #76

Merged 1 commit on Jun 22, 2019
118 changes: 115 additions & 3 deletions doc/design.md
@@ -297,17 +297,24 @@ Using less than 256 MiB of memory is not possible due to the use of tradeoff-res

### 3.1 AesGenerator1R

AesGenerator1R was designed for the fastest possible generation of pseudorandom data to fill the Scratchpad. It takes advantage of hardware accelerated AES in modern CPUs. Only one AES round is performed per 16 bytes of output, which results in throughput exceeding 20 GB/s in most modern CPUs. While 1 AES round is not sufficient for a good distribution of random values, this is not an issue because the purpose is just to initialize the Scratchpad with random non-zero data.
AesGenerator1R was designed for the fastest possible generation of pseudorandom data to fill the Scratchpad. It takes advantage of hardware accelerated AES in modern CPUs. Only one AES round is performed per 16 bytes of output, which results in throughput exceeding 20 GB/s in most modern CPUs.

AesGenerator1R gives a good output distribution provided that it's initialized with a sufficiently 'random' initial state (see Appendix F).

### 3.2 AesGenerator4R

AesGenerator4R uses 4 AES rounds to generate pseudorandom data for Program Buffer initialization. Since 2 AES rounds are sufficient for full avalanche of all input bits [[28](https://csrc.nist.gov/csrc/media/projects/cryptographic-standards-and-guidelines/documents/aes-development/rijndael-ammended.pdf)], AesGenerator4R provides an excellent output distribution while maintaining very good performance.
AesGenerator4R uses 4 AES rounds to generate pseudorandom data for Program Buffer initialization. Since 2 AES rounds are sufficient for full avalanche of all input bits [[28](https://csrc.nist.gov/csrc/media/projects/cryptographic-standards-and-guidelines/documents/aes-development/rijndael-ammended.pdf)], AesGenerator4R has excellent statistical properties (see Appendix F) while maintaining very good performance.

The reversible nature of this generator is not an issue since the generator state is always initialized using the output of a non-reversible hashing function (Blake2b).

### 3.3 AesHash1R

AesHash was designed for the fastest possible calculation of the Scratchpad fingerprint. It interprets the Scratchpad as a set of AES round keys, so it's equivalent to AES encryption with 32768 rounds. Two extra rounds are performed at the end to ensure avalanche of all Scratchpad bits in each lane. The output of the AesHash is fed into the Blake2b hashing function to calculate the final PoW hash.
AesHash1R was designed for the fastest possible calculation of the Scratchpad fingerprint. It interprets the Scratchpad as a set of AES round keys, so it's equivalent to AES encryption with 32768 rounds. Two extra rounds are performed at the end to ensure avalanche of all Scratchpad bits in each lane.

The reversible nature of AesHash1R is not a problem for two main reasons:

* It is not possible to directly control the input of AesHash1R.
* The output of AesHash1R is passed into the Blake2b hashing function, which is not reversible.
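
The figure of 32768 rounds can be reproduced with simple arithmetic. The sketch below assumes a 2 MiB Scratchpad (the size quoted in Appendix F) that is consumed 64 bytes at a time, i.e. one 16-byte round key for each of four lanes. The constant and function names are illustrative and are not taken from the RandomX sources.

```cpp
#include <cstddef>

// Assumed sizes: a 2 MiB Scratchpad read as four 16-byte round keys per step.
constexpr std::size_t ScratchpadBytes = 2 * 1024 * 1024;
constexpr std::size_t LaneCount = 4;
constexpr std::size_t RoundKeyBytes = 16;

// Each lane receives one round key per 64-byte step through the Scratchpad.
constexpr std::size_t roundsPerLane() {
    return ScratchpadBytes / (LaneCount * RoundKeyBytes);
}

static_assert(roundsPerLane() == 32768, "2 MiB / 64 B = 32768 rounds per lane");
```

If either assumed size were different, the `static_assert` would fail at compile time, which makes the dependency between the Scratchpad size and the round count explicit.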

### 3.4 SuperscalarHash

@@ -468,6 +475,109 @@ This shows that SuperscalaHash has quite low sensitivity to high-order bits and

When calculating a Dataset item, the input of the first SuperscalarHash depends only on the item number. To ensure a good distribution of results, the constants described in section 7.3 of the Specification were chosen to provide unique values of bits 3-53 for *all* item numbers in the range 0-34078718 (the Dataset contains 34078719 items). All initial register values for all Dataset item numbers were checked to make sure bits 3-53 of each register are unique and there are no collisions (source code: [superscalar-init.cpp](../src/tests/superscalar-init.cpp)). While this is not strictly necessary to get unique output from SuperscalarHash, it's a security precaution that mitigates the non-perfect avalanche properties of the randomly generated SuperscalarHash instances.
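
What "unique values of bits 3-53" means can be illustrated with a small helper. This is only a sketch with hypothetical names (`bits3to53`, `collide`); the actual check lives in superscalar-init.cpp and is not reproduced here.

```cpp
#include <cstdint>

// Bits 3..53 inclusive span 51 bits.
constexpr int LowBit = 3;
constexpr int HighBit = 53;
constexpr std::uint64_t FieldMask =
    (std::uint64_t(1) << (HighBit - LowBit + 1)) - 1;

// Return bits 3-53 of a register value, shifted down to bit 0.
constexpr std::uint64_t bits3to53(std::uint64_t value) {
    return (value >> LowBit) & FieldMask;
}

// Two register values collide in the sense used above when bits 3-53 match,
// regardless of what bits 0-2 and 54-63 contain.
constexpr bool collide(std::uint64_t a, std::uint64_t b) {
    return bits3to53(a) == bits3to53(b);
}
```

A uniqueness test over all item numbers would then amount to inserting `bits3to53(r)` for every initial register value `r` into a set and checking that no insertion finds an existing entry.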

### F. Statistical tests of RNG

Both AesGenerator1R and AesGenerator4R were tested using the TestU01 library [[30](http://simul.iro.umontreal.ca/testu01/tu01.html)] intended for empirical testing of random number generators. The source code is available in [rng-tests.cpp](../src/tests/rng-tests.cpp).

The tests sample about 200 MB ("SmallCrush" test), 500 GB ("Crush" test) or 4 TB ("BigCrush" test) of output from each generator. This is considerably more than the amounts generated in RandomX (2176 bytes for AesGenerator4R and 2 MiB for AesGenerator1R), so failures in the tests don't necessarily imply that the generators are not suitable for their use case.


#### AesGenerator4R
The generator passes all tests in the "BigCrush" suite when initialized using the Blake2b hash function:

```
$ bin/rng-tests 1
state0 = 67e8bbe567a1c18c91a316faf19fab73
state1 = 39f7c0e0a8d96512c525852124fdc9fe
state2 = 7abb07b2c90e04f098261e323eee8159
state3 = 3df534c34cdfbb4e70f8c0e1826f4cf7

...

========= Summary results of BigCrush =========

Version: TestU01 1.2.3
Generator: AesGenerator4R
Number of statistics: 160
Total CPU time: 02:50:18.34

All tests were passed
```


The generator passes all tests in the "Crush" suite even with an initial state set to all zeroes.
```
$ bin/rng-tests 0
state0 = 00000000000000000000000000000000
state1 = 00000000000000000000000000000000
state2 = 00000000000000000000000000000000
state3 = 00000000000000000000000000000000

...

========= Summary results of Crush =========

Version: TestU01 1.2.3
Generator: AesGenerator4R
Number of statistics: 144
Total CPU time: 00:25:17.95

All tests were passed
```

#### AesGenerator1R

The generator passes all tests in the "Crush" suite when initialized using the Blake2b hash function.

```
$ bin/rng-tests 1
state0 = 67e8bbe567a1c18c91a316faf19fab73
state1 = 39f7c0e0a8d96512c525852124fdc9fe
state2 = 7abb07b2c90e04f098261e323eee8159
state3 = 3df534c34cdfbb4e70f8c0e1826f4cf7

...

========= Summary results of Crush =========

Version: TestU01 1.2.3
Generator: AesGenerator1R
Number of statistics: 144
Total CPU time: 00:25:06.07

All tests were passed

```

When the initial state is set to all zeroes, the generator fails 1 test out of 144 tests in the "Crush" suite:

```
$ bin/rng-tests 0
state0 = 00000000000000000000000000000000
state1 = 00000000000000000000000000000000
state2 = 00000000000000000000000000000000
state3 = 00000000000000000000000000000000

...

========= Summary results of Crush =========

Version: TestU01 1.2.3
Generator: AesGenerator1R
Number of statistics: 144
Total CPU time: 00:26:12.75
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
 12  BirthdaySpacings, t = 3       1 - 4.4e-5
----------------------------------------------
All other tests were passed

```
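
The summary lists only p-values outside the interval [0.001, 0.9990], and the BirthdaySpacings result of 1 - 4.4e-5 lies just above the upper bound. A minimal sketch of that threshold check follows; the function name is made up for illustration and is not part of TestU01.

```cpp
// Bounds as printed in the TestU01 summary above.
constexpr double LowerBound = 0.001;
constexpr double UpperBound = 0.9990;

// A p-value is listed in the summary when it falls outside the interval.
constexpr bool isReported(double pValue) {
    return pValue < LowerBound || pValue > UpperBound;
}
```

With these bounds, `isReported(1.0 - 4.4e-5)` is true, while a mid-range value such as `isReported(0.5)` is false.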

## References

[1] CryptoNote whitepaper - https://cryptonote.org/whitepaper.pdf
@@ -528,3 +638,5 @@ Cryptocurrencies and Password Hashing - https://eprint.iacr.org/2015/430.pdf Tab
[28] J. Daemen, V. Rijmen: AES Proposal: Rijndael - https://csrc.nist.gov/csrc/media/projects/cryptographic-standards-and-guidelines/documents/aes-development/rijndael-ammended.pdf page 28

[29] 7-Zip File archiver - https://www.7-zip.org/

[30] TestU01 library - http://simul.iro.umontreal.ca/testu01/tu01.html
27 changes: 16 additions & 11 deletions doc/specs.md
@@ -169,41 +169,46 @@ state0 (16 B) state1 (16 B) state2 (16 B) state3 (16 B)

### 3.3 AesGenerator4R

AesGenerator4R works the same way as AesGenerator1R, except it uses 4 rounds per column:
AesGenerator4R works in a similar way to AesGenerator1R, except that it uses 4 rounds per column. Columns 0 and 1 use a different set of keys than columns 2 and 3.

```
state0 (16 B) state1 (16 B) state2 (16 B) state3 (16 B)
| | | |
AES decrypt AES encrypt AES decrypt AES encrypt
(key0) (key0) (key0) (key0)
(key0) (key0) (key4) (key4)
| | | |
v v v v
AES decrypt AES encrypt AES decrypt AES encrypt
(key1) (key1) (key1) (key1)
(key1) (key1) (key5) (key5)
| | | |
v v v v
AES decrypt AES encrypt AES decrypt AES encrypt
(key2) (key2) (key2) (key2)
(key2) (key2) (key6) (key6)
| | | |
v v v v
AES decrypt AES encrypt AES decrypt AES encrypt
(key3) (key3) (key3) (key3)
(key3) (key3) (key7) (key7)
| | | |
v v v v
state0' state1' state2' state3'
```
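
The key assignment after this change can be summarised as a mapping from (column, round) to a key number. The helpers below are a sketch with hypothetical names, assuming columns and rounds are both numbered 0-3 as in the diagram.

```cpp
// Columns 0 and 1 use key0..key3; columns 2 and 3 use key4..key7.
constexpr int keyIndex(int column, int round) {
    return round + (column >= 2 ? 4 : 0);
}

// Even columns (0, 2) apply AES decryption rounds,
// odd columns (1, 3) apply AES encryption rounds.
constexpr bool usesDecrypt(int column) {
    return column % 2 == 0;
}
```

For example, `keyIndex(2, 0)` is 4 and `keyIndex(3, 3)` is 7, matching the (key4) and (key7) entries for the new key assignment in the diagram.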

AesGenerator4R uses the following 4 round keys:
AesGenerator4R uses the following 8 round keys:

```
key0 = 5d 46 90 f8 a6 e4 fb 7f b7 82 1f 14 95 9e 35 cf
key1 = 50 c4 55 6a 8a 27 e8 fe c3 5a 5c bd dc ff 41 67
key2 = a4 47 4c 11 e4 fd 24 d5 d2 9a 27 a7 ac 4a 32 3d
key3 = 2a 3a 0c 81 ff ae a9 99 d9 db d3 42 08 db f6 76
key0 = dd aa 21 64 db 3d 83 d1 2b 6d 54 2f 3f d2 e5 99
key1 = 50 34 0e b2 55 3f 91 b6 53 9d f7 06 e5 cd df a5
key2 = 04 d9 3e 5c af 7b 5e 51 9f 67 a4 0a bf 02 1c 17
key3 = 63 37 62 85 08 5d 8f e7 85 37 67 cd 91 d2 de d8
key4 = 73 6f 82 b5 a6 a7 d6 e3 6d 8b 51 3d b4 ff 9e 22
key5 = f3 6b 56 c7 d9 b3 10 9c 4e 4d 02 e9 d2 b7 72 b2
key6 = e7 c9 73 f2 8b a3 65 f7 0a 66 a9 2b a7 ef 3b f6
key7 = 09 d6 7c 7a de 39 58 91 fd d1 06 0c 2d 76 b0 c0
```
These keys were generated as:
```
key0, key1, key2, key3 = Hash512("RandomX AesGenerator4R keys")
key0, key1, key2, key3 = Hash512("RandomX AesGenerator4R keys 0-3")
key4, key5, key6, key7 = Hash512("RandomX AesGenerator4R keys 4-7")
```
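
The byte listing above and the `AES_GEN_4R_KEY*` macros in src/aes_hash.cpp describe the same values in different orders. The sketch below converts the 16 bytes of key0 into four 32-bit words, assuming each word is little-endian and that `rx_set_int_vec_i128` takes its arguments from the most significant word to the least significant (in the manner of `_mm_set_epi32`). The helper names are illustrative.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using KeyBytes = std::array<std::uint8_t, 16>;
using KeyWords = std::array<std::uint32_t, 4>;

// Read 4 bytes starting at `offset` as a little-endian 32-bit word.
constexpr std::uint32_t loadLe32(const KeyBytes& k, std::size_t offset) {
    return std::uint32_t(k[offset])
         | (std::uint32_t(k[offset + 1]) << 8)
         | (std::uint32_t(k[offset + 2]) << 16)
         | (std::uint32_t(k[offset + 3]) << 24);
}

// Words ordered from the highest (bytes 12-15) to the lowest (bytes 0-3),
// which is the assumed argument order of the macros.
constexpr KeyWords toMacroOrder(const KeyBytes& k) {
    return KeyWords{ loadLe32(k, 12), loadLe32(k, 8), loadLe32(k, 4), loadLe32(k, 0) };
}

// key0 as listed in the new key table above.
constexpr KeyBytes Key0 = { 0xdd, 0xaa, 0x21, 0x64, 0xdb, 0x3d, 0x83, 0xd1,
                            0x2b, 0x6d, 0x54, 0x2f, 0x3f, 0xd2, 0xe5, 0x99 };
```

Under these assumptions `toMacroOrder(Key0)` yields 0x99e5d23f, 0x2f546d2b, 0xd1833ddb, 0x6421aadd, which agrees with the new `AES_GEN_4R_KEY0` definition in this pull request.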

### 3.4 AesHash1R
34 changes: 21 additions & 13 deletions src/aes_hash.cpp
@@ -157,23 +157,31 @@ void fillAes1Rx4(void *state, size_t outputSize, void *buffer) {
template void fillAes1Rx4<true>(void *state, size_t outputSize, void *buffer);
template void fillAes1Rx4<false>(void *state, size_t outputSize, void *buffer);

#define AES_GEN_4R_KEY0 0xcf359e95, 0x141f82b7, 0x7ffbe4a6, 0xf890465d
#define AES_GEN_4R_KEY1 0x6741ffdc, 0xbd5c5ac3, 0xfee8278a, 0x6a55c450
#define AES_GEN_4R_KEY2 0x3d324aac, 0xa7279ad2, 0xd524fde4, 0x114c47a4
#define AES_GEN_4R_KEY3 0x76f6db08, 0x42d3dbd9, 0x99a9aeff, 0x810c3a2a
#define AES_GEN_4R_KEY0 0x99e5d23f, 0x2f546d2b, 0xd1833ddb, 0x6421aadd
#define AES_GEN_4R_KEY1 0xa5dfcde5, 0x06f79d53, 0xb6913f55, 0xb20e3450
#define AES_GEN_4R_KEY2 0x171c02bf, 0x0aa4679f, 0x515e7baf, 0x5c3ed904
#define AES_GEN_4R_KEY3 0xd8ded291, 0xcd673785, 0xe78f5d08, 0x85623763
#define AES_GEN_4R_KEY4 0x229effb4, 0x3d518b6d, 0xe3d6a7a6, 0xb5826f73
#define AES_GEN_4R_KEY5 0xb272b7d2, 0xe9024d4e, 0x9c10b3d9, 0xc7566bf3
#define AES_GEN_4R_KEY6 0xf63befa7, 0x2ba9660a, 0xf765a38b, 0xf273c9e7
#define AES_GEN_4R_KEY7 0xc0b0762d, 0x0c06d1fd, 0x915839de, 0x7a7cd609

template<bool softAes>
void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
    const uint8_t* outptr = (uint8_t*)buffer;
    const uint8_t* outputEnd = outptr + outputSize;

    rx_vec_i128 state0, state1, state2, state3;
    rx_vec_i128 key0, key1, key2, key3;
    rx_vec_i128 key0, key1, key2, key3, key4, key5, key6, key7;

    key0 = rx_set_int_vec_i128(AES_GEN_4R_KEY0);
    key1 = rx_set_int_vec_i128(AES_GEN_4R_KEY1);
    key2 = rx_set_int_vec_i128(AES_GEN_4R_KEY2);
    key3 = rx_set_int_vec_i128(AES_GEN_4R_KEY3);
    key4 = rx_set_int_vec_i128(AES_GEN_4R_KEY4);
    key5 = rx_set_int_vec_i128(AES_GEN_4R_KEY5);
    key6 = rx_set_int_vec_i128(AES_GEN_4R_KEY6);
    key7 = rx_set_int_vec_i128(AES_GEN_4R_KEY7);

    state0 = rx_load_vec_i128((rx_vec_i128*)state + 0);
    state1 = rx_load_vec_i128((rx_vec_i128*)state + 1);
@@ -183,23 +191,23 @@ void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
    while (outptr < outputEnd) {
        state0 = aesdec<softAes>(state0, key0);
        state1 = aesenc<softAes>(state1, key0);
        state2 = aesdec<softAes>(state2, key0);
        state3 = aesenc<softAes>(state3, key0);
        state2 = aesdec<softAes>(state2, key4);
        state3 = aesenc<softAes>(state3, key4);

        state0 = aesdec<softAes>(state0, key1);
        state1 = aesenc<softAes>(state1, key1);
        state2 = aesdec<softAes>(state2, key1);
        state3 = aesenc<softAes>(state3, key1);
        state2 = aesdec<softAes>(state2, key5);
        state3 = aesenc<softAes>(state3, key5);

        state0 = aesdec<softAes>(state0, key2);
        state1 = aesenc<softAes>(state1, key2);
        state2 = aesdec<softAes>(state2, key2);
        state3 = aesenc<softAes>(state3, key2);
        state2 = aesdec<softAes>(state2, key6);
        state3 = aesenc<softAes>(state3, key6);

        state0 = aesdec<softAes>(state0, key3);
        state1 = aesenc<softAes>(state1, key3);
        state2 = aesdec<softAes>(state2, key3);
        state3 = aesenc<softAes>(state3, key3);
        state2 = aesdec<softAes>(state2, key7);
        state3 = aesenc<softAes>(state3, key7);

        rx_store_vec_i128((rx_vec_i128*)outptr + 0, state0);
        rx_store_vec_i128((rx_vec_i128*)outptr + 1, state1);
2 changes: 1 addition & 1 deletion src/tests/benchmark.cpp
@@ -241,7 +241,7 @@ int main(int argc, char** argv) {
    std::cout << "Calculated result: ";
    result.print(std::cout);
    if (noncesCount == 1000 && seedValue == 0)
        std::cout << "Reference result: 669ae4f2e5e2c0d9cc232ff2c37d41ae113fa302bbf983d9f3342879831b4edf" << std::endl;
        std::cout << "Reference result: a925d346195ef38048e714709e0b24a88fef565fa02fa97127e00fac08ee6eb8" << std::endl;
    if (!miningMode) {
        std::cout << "Performance: " << 1000 * elapsed / noncesCount << " ms per hash" << std::endl;
    }
93 changes: 93 additions & 0 deletions src/tests/rng-tests.cpp
@@ -0,0 +1,93 @@
/*
cd ~
wget http://simul.iro.umontreal.ca/testu01/TestU01.zip
unzip TestU01.zip
mkdir TestU01
cd TestU01-1.2.3
./configure --prefix=`pwd`/../TestU01
make -j8
make install
cd ~/RandomX
g++ -O3 src/tests/rng-tests.cpp -lm -I ~/TestU01/include -L ~/TestU01/lib -L bin/ -l:libtestu01.a -l:libmylib.a -l:libprobdist.a -lrandomx -o bin/rng-tests -DRANDOMX_GEN=4R -DRANDOMX_TESTU01=Crush
bin/rng-tests 0
*/

extern "C" {
    #include "unif01.h"
    #include "bbattery.h"
}

#include "../aes_hash.hpp"
#include "../blake2/blake2.h"
#include "utility.hpp"
#include <cstdint>

#ifndef RANDOMX_GEN
#error Please define RANDOMX_GEN with a value of 1R or 4R
#endif

#ifndef RANDOMX_TESTU01
#error Please define RANDOMX_TESTU01 with a value of SmallCrush, Crush or BigCrush
#endif

#define STR(x) #x
#define CONCAT(a,b,c) a ## b ## c
#define GEN_NAME(x) "AesGenerator" STR(x)
#define GEN_FUNC(x) CONCAT(fillAes, x, x4)
#define TEST_SUITE(x) CONCAT(bbattery_, x,)

constexpr int GeneratorStateSize = 64;
constexpr int GeneratorCapacity = GeneratorStateSize / sizeof(uint32_t);

static unsigned long aesGenBits(void *param, void *state) {
    uint32_t* statePtr = (uint32_t*)state;
    int* indexPtr = (int*)param;
    int stateIndex = *indexPtr;
    if(stateIndex >= GeneratorCapacity) {
        GEN_FUNC(RANDOMX_GEN)<false>(statePtr, GeneratorStateSize, statePtr);
        stateIndex = 0;
    }
    uint32_t next = statePtr[stateIndex];
    *indexPtr = stateIndex + 1;
    return next;
}

static double aesGenDouble(void *param, void *state) {
    return aesGenBits (param, state) / unif01_NORM32;
}

static void aesWriteState(void* state) {
    char* statePtr = (char*)state;
    for(int i = 0; i < 4; ++i) {
        std::cout << "state" << i << " = ";
        outputHex(std::cout, statePtr + (i * 16), 16);
        std::cout << std::endl;
    }
}

int main(int argc, char** argv) {
    if (argc != 2) {
        std::cout << argv[0] << " <seed>" << std::endl;
        return 1;
    }
    uint32_t state[GeneratorCapacity] = { 0 };
    int stateIndex = GeneratorCapacity;
    char name[] = GEN_NAME(RANDOMX_GEN);
    uint64_t seed = strtoull(argv[1], nullptr, 0);
    if(seed) {
        blake2b(&state, sizeof(state), &seed, sizeof(seed), nullptr, 0);
    }
    unif01_Gen gen;
    gen.state = &state;
    gen.param = &stateIndex;
    gen.Write = &aesWriteState;
    gen.GetU01 = &aesGenDouble;
    gen.GetBits = &aesGenBits;
    gen.name = (char*)name;

    gen.Write(gen.state);
    std::cout << std::endl;

    TEST_SUITE(RANDOMX_TESTU01)(&gen);
    return 0;
}
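
The `GEN_NAME` and `GEN_FUNC` macros in rng-tests.cpp turn the `-DRANDOMX_GEN=4R` compiler flag into a generator name and a function name. The stand-alone sketch below shows that expansion in isolation. `DEMO_GEN` and the stub `fillAes4Rx4` are hypothetical stand-ins so that the snippet compiles without the RandomX headers; the real `fillAes4Rx4` is the template defined in src/aes_hash.cpp.

```cpp
#define STR(x) #x
#define CONCAT(a,b,c) a ## b ## c
#define GEN_NAME(x) "AesGenerator" STR(x)
#define GEN_FUNC(x) CONCAT(fillAes, x, x4)

// Hypothetical stub so that the pasted identifier resolves to something.
constexpr int fillAes4Rx4() { return 4; }

// Plays the role of -DRANDOMX_GEN=4R on the command line.
#define DEMO_GEN 4R

// GEN_FUNC(DEMO_GEN) expands to fillAes4Rx4 because DEMO_GEN is replaced
// by 4R before the tokens are pasted together.
constexpr int demoRounds = GEN_FUNC(DEMO_GEN)();

// GEN_NAME(DEMO_GEN) expands to "AesGenerator" "4R", and adjacent string
// literals are concatenated by the compiler.
constexpr const char* demoName = GEN_NAME(DEMO_GEN);
```

The extra level of indirection through `GEN_NAME` and `GEN_FUNC` matters: applying `STR` or `CONCAT` directly to `DEMO_GEN` would stringify or paste the macro name itself rather than its value.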