feat: worker: Support delegating precommit2 to external binary #11185

Merged: 7 commits, Dec 1, 2023

Conversation

magik6k
Contributor

@magik6k magik6k commented Aug 18, 2023

Related Issues

This PR implements support for external PC2 binaries #10983

Proposed Changes

Additional Info

The --external-pc2 flag can be used to compute the PreCommit2 inputs externally.
The flag behaves similarly to the related lotus-worker flag; using it in
lotus-bench may be useful for testing whether the external PreCommit2 command is
invoked correctly.
The command will be called with a number of environment variables set:
* EXTSEAL_PC2_SECTOR_NUM: the sector number
* EXTSEAL_PC2_SECTOR_MINER: the miner id
* EXTSEAL_PC2_PROOF_TYPE: the proof type
* EXTSEAL_PC2_SECTOR_SIZE: the sector size in bytes
* EXTSEAL_PC2_CACHE: the path to the cache directory
* EXTSEAL_PC2_SEALED: the path to the sealed sector file (initialized with unsealed data by the caller)
* EXTSEAL_PC2_PC1OUT: output from rust-fil-proofs precommit1 phase (base64 encoded json)
The command is expected to:
* Create cache sc-02-data-tree-r* files
* Create cache sc-02-data-tree-c* files
* Create cache p_aux / t_aux files
* Transform the sealed file in place
Example invocation of lotus-bench as external executor:
'./lotus-bench simple precommit2 --sector-size $EXTSEAL_PC2_SECTOR_SIZE $EXTSEAL_PC2_SEALED $EXTSEAL_PC2_CACHE $EXTSEAL_PC2_PC1OUT'
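The expectations listed above can be checked after an external command returns. The following is a minimal sketch, not part of this PR: the function name `check_pc2_outputs` and its optional second argument are hypothetical, and only the file names come from the list above. The second argument exists because a later comment in this thread notes that t_aux is not needed when FFI_USE_FIXED_ROWS_TO_DISCARD=1 is set.

```shell
#!/bin/sh
# Hypothetical helper (not part of lotus): verify that an external PC2
# command left the expected files in the cache directory.
#   $1: cache directory (e.g. "$EXTSEAL_PC2_CACHE")
#   $2: "yes" (default) to require t_aux, "no" to skip that check
check_pc2_outputs() {
    cache_dir=$1
    require_t_aux=${2:-yes}

    # At least one tree-r and one tree-c file must exist.
    for pattern in 'sc-02-data-tree-r*' 'sc-02-data-tree-c*'; do
        found=no
        for f in "$cache_dir"/$pattern; do
            if [ -e "$f" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "missing: $pattern" >&2
            return 1
        fi
    done

    if [ ! -f "$cache_dir/p_aux" ]; then
        echo "missing: p_aux" >&2
        return 1
    fi
    if [ "$require_t_aux" = yes ] && [ ! -f "$cache_dir/t_aux" ]; then
        echo "missing: t_aux" >&2
        return 1
    fi
    return 0
}
```

A wrapper script could call `check_pc2_outputs "$EXTSEAL_PC2_CACHE"` as its last step so that a missing file shows up as a non-zero exit status of the external command.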

Can be tested in lotus-bench like so:

$ ./lotus-bench simple addpiece --sector-size 512MiB /dev/zero /tmp/unsealed1
2023-08-18T17:38:56.660+0200    INFO    lotus-bench     lotus-bench/main.go:110 Starting lotus-bench
AddPiece 662.493804ms (772.8 MiB/s)
baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq 536870912

$ ./lotus-bench simple precommit1 --sector-size 512MiB /tmp/unsealed1 /tmp/sealed /tmp/cache baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq 536870912
2023-08-18T17:39:19.302+0200    INFO    lotus-bench     lotus-bench/main.go:110 Starting lotus-bench
2023-08-18T17:39:19.302+0200    WARN    ffiwrapper      ffiwrapper/sealer_cgo.go:772    existing cache in /tmp/cache; removing
PreCommit1 31.85043704s (16.08 MiB/s)
eyJfbG90dXNfU2VhbFJhbmRvbW5l

$ GOLOG_LOG_LEVEL=debug ./lotus-bench simple precommit2 --sector-size 512MiB --external-pc2 './lotus-bench simple precommit2 --sector-size $EXTSEAL_PC2_SECTOR_SIZE $EXTSEAL_PC2_SEALED $EXTSEAL_PC2_CACHE $EXTSEAL_PC2_PC1OUT' /tmp/sealed /tmp/cache eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOiJBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBPSIsImNvbW1fZCI6WzU3LDg2LDE0LDEyMywxOSwxNjksNTksNywxNjIsNjcsMjUzLDM5LDMyLDI1NSwxNjcsMjAzLDYyLDI5LDQ2LDgwLDkwLDE3OSw5OCwxNTgsMTIxLDI0NCw5OSwxOSw4MSw0NCwyMTgsNl0sImNvbmZpZyI6eyJpZCI6InRyZWUtZCIsInBhdGgiOiIvdG1wL2NhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjozMzU1NDQzMX0sImxhYmVscyI6eyJTdGFja2VkRHJnNTEyTWlCVjEiOnsiX2giOm51bGwsImxhYmVscyI6W3siaWQiOiJsYXllci0xIiwicGF0aCI6Ii90bXAvY2FjaGUiLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjE2Nzc3MjE2fSx7ImlkIjoibGF5ZXItMiIsInBhdGgiOiIvdG1wL2NhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxNjc3NzIxNn1dfX0sInJlZ2lzdGVyZWRfcHJvb2YiOiJTdGFja2VkRHJnNTEyTWlCVjFfMSJ9
2023-08-18T17:40:25.846+0200    INFO    fil-consensus   filcns/upgrades.go:81   migration worker count: 32
2023-08-18T17:40:25.847+0200    INFO    lotus-bench     lotus-bench/main.go:110 Starting lotus-bench
2023-08-18T17:40:25.847+0200    INFO    ffiwrapper      ffiwrapper/extern_pc2.go:53     running external sealing call   {"method": "precommit2", "command": "./lotus-bench simple precommit2 --sector-size $EXTSEAL_PC2_SECTOR_SIZE $EXTSEAL_PC2_SEALED $EXTSEAL_PC2_CACHE $EXTSEAL_PC2_PC1OUT", "env": ["EXTSEAL_PC2_SECTOR_NUM=1", "EXTSEAL_PC2_SECTOR_MINER=1000", "EXTSEAL_PC2_PROOF_TYPE=7", "EXTSEAL_PC2_SECTOR_SIZE=536870912", "EXTSEAL_PC2_CACHE=/tmp/cache", "EXTSEAL_PC2_SEALED=/tmp/sealed", "EXTSEAL_PC2_PC1OUT=eyJfbG9...MSJ9"]}
2023-08-18T17:40:25.900+0200    INFO    lotus-bench     lotus-bench/main.go:110 Starting lotus-bench
PreCommit2 35.292932552s (14.51 MiB/s)
d:baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq r:bagboea4b5abcbztu2gpgzz746m537wntioqm5mjnfay5dwsugfqyshv4zljmnwyb
PreCommit2 35.456637671s (14.44 MiB/s)
d:baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq r:bagboea4b5abcbztu2gpgzz746m537wntioqm5mjnfay5dwsugfqyshv4zljmnwyb
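Since EXTSEAL_PC2_PC1OUT is described above as base64-encoded JSON, it can be decoded for inspection when debugging an external command. This is a small sketch; the function name `decode_pc1out` is hypothetical, and it assumes a `base64` tool that accepts `-d` (as GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: print the JSON carried in a base64-encoded
# PreCommit1 output string.
#   $1: the base64 string (e.g. "$EXTSEAL_PC2_PC1OUT")
decode_pc1out() {
    printf '%s' "$1" | base64 -d
}
```

Inside an external PC2 script this would be called as `decode_pc1out "$EXTSEAL_PC2_PC1OUT"`, for example to look at the cache paths recorded by PreCommit1.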

TODO:

  • Run lotus-worker with PC2 backed by lotus-bench simple
  • Run this with SN PC2
    • in lotus-bench
    • in lotus-worker

Checklist

Before you mark the PR ready for review, please make sure that:

  • Commits have a clear commit message.
  • PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
    • area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
  • New features have usage guidelines and / or documentation updates in
  • Tests exist for new functionality or change in behavior
  • CI is green

@magik6k magik6k requested a review from a team as a code owner August 18, 2023 15:42
@rjan90 rjan90 linked an issue Aug 19, 2023 that may be closed by this pull request
@magik6k magik6k force-pushed the feat/snpc2 branch 3 times, most recently from 8b9c17a to 8c5a5d0 Compare August 21, 2023 13:17
@magik6k
Contributor Author

magik6k commented Aug 22, 2023

SupraSeal command looks something like this:

CUDA_VISIBLE_DEVICES="GPU-..." ./pc2 -d "${EXTSEAL_PC2_SEALED}" -i "${EXTSEAL_PC2_CACHE}" -o "${EXTSEAL_PC2_CACHE}" && rm "${EXTSEAL_PC2_SEALED}" && mv "${EXTSEAL_PC2_CACHE}/sealed-file" "${EXTSEAL_PC2_SEALED}"
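The `rm` / `mv` tail of that command could be made a little more defensive. The sketch below is an assumption about how one might wrap it, not something from this PR: `install_sealed_file` is a hypothetical name, and the only detail taken from the command above is that pc2 writes its result to `sealed-file` inside the output directory.

```shell
#!/bin/sh
# Hypothetical helper: move pc2's output over the sealed path only when
# that output exists and is non-empty, leaving the old file untouched
# otherwise.
#   $1: cache/output directory (e.g. "$EXTSEAL_PC2_CACHE")
#   $2: sealed file path       (e.g. "$EXTSEAL_PC2_SEALED")
install_sealed_file() {
    cache_dir=$1
    sealed=$2
    out="$cache_dir/sealed-file"

    if [ ! -s "$out" ]; then
        echo "pc2 output missing or empty: $out" >&2
        return 1
    fi
    # mv -f replaces the destination, so a separate rm is not needed.
    mv -f "$out" "$sealed"
}
```

With this, the external command would become `./pc2 ... && install_sealed_file "$EXTSEAL_PC2_CACHE" "$EXTSEAL_PC2_SEALED"`.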

@magik6k
Contributor Author

magik6k commented Aug 22, 2023

sn-pc2 testing with lotus-bench:

[Get ubuntu 22.04]
$ sudo apt install build-essential libconfig++-dev libgmp-dev wget git curl

[cuda]
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
$ sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb
$ sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
$ sudo apt-get update
$ sudo apt-get -y install cuda
$ export PATH=$PATH:/usr/local/cuda-12/bin/

[rust]
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
$ source $HOME/.cargo/env

[lotus-bench]
$ git clone https://github.com/filecoin-project/lotus
$ cd lotus
$ make deps lotus-bench
$ cd ..

[sn-pc2]
$ git clone https://github.com/supranational/supra_seal.git
$ cd supra_seal/
$ ./build.sh 512MiB
$ cd ..

[test]
$ mkdir benchdir && cd benchdir
$ cp ../lotus/lotus-bench .
$ cp ../supra_seal/bin/pc2 .
$ cp ../supra_seal/demos/rust/supra_seal.cfg .

$ ./lotus-bench simple addpiece --sector-size 512M /dev/random s-unsealed
AddPiece 6.379320196s (80.26 MiB/s)
baga6ea4seaqdkvp563jl2wmi62ubhsf4kxkvkmofjtzkpoe66ygb34rinmy2oiy 536870912

$ ./lotus-bench simple precommit1 --sector-size 512M s-unsealed s-sealed s-cache baga6ea4seaqdkvp563jl2wmi62ubhsf4kxkvkmofjtzkpoe66ygb34rinmy2oiy 536870912
PreCommit1 2m44.182148987s (3.118 MiB/s)
eyJfbG90dXNfU2VhbFJhbmR...

$ GOLOG_LOG_LEVEL=debug ./lotus-bench simple precommit2 --sector-size 512MiB --external-pc2 'CUDA_VISIBLE_DEVICES="GPU-c30f7c2e" ./pc2 -c supra_seal.cfg -d "${EXTSEAL_PC2_SEALED}" -i "${EXTSEAL_PC2_CACHE}" -o "${EXTSEAL_PC2_CACHE}" && rm "${EXTSEAL_PC2_SEALED}" && mv "${EXTSEAL_PC2_CACHE}/sealed-file" "${EXTSEAL_PC2_SEALED}"' s-sealed s-cache eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOiJBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBPSIsImNvbW1fZCI6WzUzLDg1LDI1MywyNDYsMjEwLDE4OSw4OSwxMzYsMjQ2LDE2OCwxOSwyMDAsMTg4LDg1LDIxMyw4NSw0OSwxOTcsNzYsMjQyLDE2NywxODQsMTU4LDI0NiwxMiwyOSwyNDIsNDAsMTA3LDQ5LDE2NywzNV0sImNvbmZpZyI6eyJpZCI6InRyZWUtZCIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjozMzU1NDQzMX0sImxhYmVscyI6eyJTdGFja2VkRHJnNTEyTWlCVjEiOnsiX2giOm51bGwsImxhYmVscyI6W3siaWQiOiJsYXllci0xIiwicGF0aCI6InMtY2FjaGUiLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjE2Nzc3MjE2fSx7ImlkIjoibGF5ZXItMiIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxNjc3NzIxNn1dfX0sInJlZ2lzdGVyZWRfcHJvb2YiOiJTdGFja2VkRHJnNTEyTWlCVjFfMSJ9
2023-08-22T11:32:48.295Z	INFO	fil-consensus	filcns/upgrades.go:81	migration worker count: 104
2023-08-22T11:32:48.295Z	INFO	lotus-bench	lotus-bench/main.go:110	Starting lotus-bench
2023-08-22T11:32:48.295Z	INFO	ffiwrapper	ffiwrapper/extern_pc2.go:54	running external sealing call	{"method": "precommit2", "command": "CUDA_VISIBLE_DEVICES=\"GPU-c30f7c2e\" ./pc2 -c supra_seal.cfg -d \"${EXTSEAL_PC2_SEALED}\" -i \"${EXTSEAL_PC2_CACHE}\" -o \"${EXTSEAL_PC2_CACHE}\" && rm \"${EXTSEAL_PC2_SEALED}\" && mv \"${EXTSEAL_PC2_CACHE}/sealed-file\" \"${EXTSEAL_PC2_SEALED}\"", "env": ["EXTSEAL_PC2_SECTOR_NUM=1", "EXTSEAL_PC2_SECTOR_MINER=1000", "EXTSEAL_PC2_PROOF_TYPE=7", "EXTSEAL_PC2_SECTOR_SIZE=536870912", "EXTSEAL_PC2_CACHE=s-cache", "EXTSEAL_PC2_SEALED=s-sealed", "EXTSEAL_PC2_PC1OUT=eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOiJBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBPSIsImNvbW1fZCI6WzUzLDg1LDI1MywyNDYsMjEwLDE4OSw4OSwxMzYsMjQ2LDE2OCwxOSwyMDAsMTg4LDg1LDIxMyw4NSw0OSwxOTcsNzYsMjQyLDE2NywxODQsMTU4LDI0NiwxMiwyOSwyNDIsNDAsMTA3LDQ5LDE2NywzNV0sImNvbmZpZyI6eyJpZCI6InRyZWUtZCIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjozMzU1NDQzMX0sImxhYmVscyI6eyJTdGFja2VkRHJnNTEyTWlCVjEiOnsiX2giOm51bGwsImxhYmVscyI6W3siaWQiOiJsYXllci0xIiwicGF0aCI6InMtY2FjaGUiLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjE2Nzc3MjE2fSx7ImlkIjoibGF5ZXItMiIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxNjc3NzIxNn1dfX0sInJlZ2lzdGVyZWRfcHJvb2YiOiJTdGFja2VkRHJnNTEyTWlCVjFfMSJ9"]}
config file               supra_seal.cfg
data_filename input       s-sealed
input cache_dir           s-cache
out_dir                   s-cache
sectors 2
  coord 8 hashers 1
sectors 4
  coord 8 hashers 2
sectors 8
  coord 8 hashers 4
sectors 16
  coord 8 hashers 8
sectors 32
  coord 8 hashers 14
  coord 16 hashers 2
sectors 64
  coord 8 hashers 14
  coord 16 hashers 14
  coord 24 hashers 4
sectors 128
  coord 8 hashers 14
  coord 16 hashers 14
  coord 24 hashers 14
  coord 32 hashers 14
  coord 40 hashers 8
Partition 0 took 3 seconds (gpu 3, cpu 0)
pc2 took 9 seconds utilizing 29127.1 iOPS
2023-08-22T11:33:04.170Z	WARN	ffiwrapper	ffiwrapper/sealer_cgo.go:883	checking PreCommit failed: could not read file t_aux="s-cache/t_aux"

Caused by:
    No such file or directory (os error 2)
2023-08-22T11:33:04.170Z	WARN	ffiwrapper	ffiwrapper/sealer_cgo.go:884	num:1 tkt:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] seed:[140 152 50 161 81 19 141 128 58 17 127 103 37 88 221 42 215 252 255 31 201 228 33 88 161 88 247 198 151 24 8 98] sealedCID:bagboea4b5abcbd5cqhxwlx7dzkq4o3zfy6z6lqshumgiyh2ibrxgzha6a7exlo2p, unsealedCID:baga6ea4seaqdkvp563jl2wmi62ubhsf4kxkvkmofjtzkpoe66ygb34rinmy2oiy
2023-08-22T11:33:04.170Z	WARN	lotus-bench	lotus-bench/main.go:128	precommit2:
    main.glob..func9
        /root/lotus/cmd/lotus-bench/simple.go:382
  - checking PreCommit failed:
    github.com/filecoin-project/lotus/storage/sealer/ffiwrapper.(*Sealer).SealPreCommit2
        /root/lotus/storage/sealer/ffiwrapper/sealer_cgo.go:886
  - could not read file t_aux="s-cache/t_aux"

Caused by:
    No such file or directory (os error 2)

$ ls s-cache/
p_aux  sc-02-data-layer-1.dat  sc-02-data-layer-2.dat  sc-02-data-tree-c.dat  sc-02-data-tree-d.dat  sc-02-data-tree-r-last.dat

So it looks like we either need a way to create t_aux in Go, or sn-pc2 has to create it.
I'm not sure what's in that file, so there is some more digging to do.

Checking that output matches rust pc2:

[with data from sn-pc2 run]
root@lxc2-sn:~/benchdir# hexdump -C s-sealed | head
00000000  1d c0 e4 a2 4f 80 7b 1c  9d ee c4 5b 85 5f 10 cc  |....O.{....[._..|
00000010  9a b7 6d 2c f2 c7 63 11  e0 c8 e1 ce 6a 5e 8c 4e  |..m,..c.....j^.N|
00000020  31 35 79 b5 9c a7 e8 22  76 7b 7b 40 af 6e b1 33  |15y...."v{{@.n.3|
00000030  bf 10 23 a4 e1 91 7c f5  3e 0d 4e 07 02 fb ce 4f  |..#...|.>.N....O|
00000040  50 70 fd 9c d3 18 f2 71  93 32 6a 7f 5b 4a 7a 01  |Pp.....q.2j.[Jz.|
...

$ rm -r s-sealed s-cache/
$ ./lotus-bench simple precommit1 --sector-size 512M s-unsealed s-sealed s-cache baga6ea4seaqdkvp563jl2wmi62ubhsf4kxkvkmofjtzkpoe66ygb34rinmy2oiy 536870912
$ ./lotus-bench simple precommit2 --sector-size 512MiB s-sealed s-cache eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOi...
$ hexdump -C s-sealed | head
00000000  1d c0 e4 a2 4f 80 7b 1c  9d ee c4 5b 85 5f 10 cc  |....O.{....[._..|
00000010  9a b7 6d 2c f2 c7 63 11  e0 c8 e1 ce 6a 5e 8c 4e  |..m,..c.....j^.N|
00000020  31 35 79 b5 9c a7 e8 22  76 7b 7b 40 af 6e b1 33  |15y...."v{{@.n.3|
00000030  bf 10 23 a4 e1 91 7c f5  3e 0d 4e 07 02 fb ce 4f  |..#...|.>.N....O|
00000040  50 70 fd 9c d3 18 f2 71  93 32 6a 7f 5b 4a 7a 01  |Pp.....q.2j.[Jz.|
...

Which means we're getting the correct sealed output!
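The comparison above only looks at the first bytes of each file. A whole-file check is possible if a copy of the sn-pc2 result is kept before it is removed; the sketch below assumes such a copy exists, and both the function name `same_sealed_output` and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: byte-for-byte comparison of two sealed files.
# Prints "identical" or "different"; returns non-zero when they differ.
same_sealed_output() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```

For example, after `cp s-sealed s-sealed.snpc2` and a rerun with the Rust PC2, `same_sealed_output s-sealed.snpc2 s-sealed` would cover the full 512MiB rather than the first 80 bytes.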

Now we just need to figure out the t_aux stuff, and this PR should be good to go!

@vmx
Contributor

vmx commented Sep 1, 2023

triplewz/poseidon#1 got merged! => no more fork is needed :)

@vmx
Contributor

vmx commented Sep 22, 2023

The PC2 binary mentioned in that PR is somewhat outdated. Supranational has released a new one at:
https://github.com/supranational/supra_seal/blob/93f4a80b1c370acb1e07089047f86428db9a6cb0/tools/pc2.cu

Currently it only supports 512MiB and 32GiB and it has to be set at compile time. But making it possible to change at runtime is in the works (see supranational/supra_seal#23 (comment)) and should hopefully land soon.

@hail100

hail100 commented Oct 11, 2023

@vmx @magik6k Support for additional sector sizes has been merged:
supranational/supra_seal#40

@vmx
Contributor

vmx commented Oct 16, 2023

@magik6k The proofs side of things is sadly still not ready. Lotus would want to switch to the binary at https://github.com/supranational/supra_seal/blob/c2ff1acf282dad86812c62f5f431db545d966743/tools/pc2.cu though, so perhaps that work can be done in parallel while the Rust side gets finished.

@Stebalien Stebalien marked this pull request as draft October 27, 2023 18:35
@vmx
Contributor

vmx commented Nov 3, 2023

@magik6k In order to use proofs without needing a t_aux file, use the PR filecoin-project/filecoin-ffi#434 and set FFI_USE_FIXED_ROWS_TO_DISCARD=1.
=> From the proofs side it should now be ready to go.

@mtrisic
Contributor

mtrisic commented Nov 3, 2023

@magik6k In order to use proofs without needing a t_aux file, use the PR filecoin-project/filecoin-ffi#434 and set FFI_USE_FIXED_ROWS_TO_DISCARD=1. => From the proofs side it should now be ready to go.

Please be aware that not generating t_aux may break backwards compatibility with other remote sealing workers later in the sealing process. Sector transfers may also fail if t_aux is missing.
Additionally, I think t_aux is expected by current versions of lotus-worker when doing a CC sector upgrade (snap-up deal) 🤔

@vmx
Contributor

vmx commented Nov 3, 2023

Additionally, I think t_aux is expected by current versions of lotus-worker when doing a CC sector upgrade (snap-up deal) 🤔

Yes. Everything would need to be on the same version.

@vmx
Contributor

vmx commented Nov 16, 2023

@rjan90 here's a simpler script for building the PC2 binary, which doesn't need an SPDK checkout:

#!/bin/sh
set -eu
#set -o xtrace

# By default compile for 512MiB and 32GiB sectors only, use `-r` to compile for
# other (test) sector sizes as well.
SECTOR_SIZE=""
while getopts r flag
do
    case "${flag}" in
        r) SECTOR_SIZE="-DRUNTIME_SECTOR_SIZE";;
        *) ;;
    esac
done

if [ ! -d "supra_seal" ]; then
    git clone https://github.com/supranational/supra_seal.git
fi

cd supra_seal

rm -fr obj
mkdir -p obj

rm -fr bin
mkdir -p bin

mkdir -p deps
if [ ! -d "deps/sppark" ]; then
    git clone https://github.com/supranational/sppark.git deps/sppark
fi
if [ ! -d "deps/blst" ]; then
    git clone https://github.com/supranational/blst.git deps/blst
    (cd deps/blst
     ./build.sh -march=native)
fi

# Generate .h files for the Poseidon constants
xxd -i poseidon/constants/constants_2  > obj/constants_2.h
xxd -i poseidon/constants/constants_4  > obj/constants_4.h
xxd -i poseidon/constants/constants_8  > obj/constants_8.h
xxd -i poseidon/constants/constants_11 > obj/constants_11.h
xxd -i poseidon/constants/constants_16 > obj/constants_16.h
xxd -i poseidon/constants/constants_24 > obj/constants_24.h
xxd -i poseidon/constants/constants_36 > obj/constants_36.h

nvcc ${SECTOR_SIZE} -DNO_SPDK -DSTREAMING_NODE_READER_FILES \
     -arch=sm_80 -gencode arch=compute_70,code=sm_70 -t0 \
     -std=c++17 -g -O3 -Xcompiler -march=native \
     -Xcompiler -Wall,-Wextra,-Werror \
     -Xcompiler -Wno-subobject-linkage,-Wno-unused-parameter \
     -x cu tools/pc2.cu -o bin/pc2 \
     -Iposeidon -Ideps/sppark -Ideps/sppark/util -Ideps/blst/src -L deps/blst -lblst -lconfig++

It also checks out the supra_seal repo. For all those git checkouts it would make sense to check out a specific commit/tag instead of the main branches, so that the build doesn't change or break in unexpected ways.
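One way the pinning could look is sketched here. This is an illustration only: `clone_pinned` is a hypothetical helper, and which revisions to pin for sppark and blst is left open because no such revisions are given in this thread.

```shell
#!/bin/sh
# Hypothetical helper: clone a repository if it is not present yet, then
# check out one exact commit or tag in detached-HEAD state.
#   $1: repository URL or path
#   $2: target directory
#   $3: commit hash or tag to pin
clone_pinned() {
    url=$1
    dir=$2
    rev=$3
    if [ ! -d "$dir" ]; then
        git clone --quiet "$url" "$dir" || return 1
    fi
    git -C "$dir" checkout --quiet --detach "$rev"
}
```

In the build script, `git clone https://github.com/supranational/supra_seal.git` could then become `clone_pinned https://github.com/supranational/supra_seal.git supra_seal c2ff1acf282dad86812c62f5f431db545d966743`, using the commit linked earlier in this conversation.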

@rjan90
Contributor

rjan90 commented Nov 16, 2023

Got the SupraSeal PC2 to work. The testing that I did:

1.
./lotus-bench simple addpiece --sector-size 512M /dev/zero s-unsealed
2023-11-16T17:14:07.216Z	INFO	lotus-bench	lotus-bench/main.go:110	Starting lotus-bench
2023-11-16T17:14:07.235 DEBUG filcrypto::util::types > generate_piece_commitment: start
2023-11-16T17:14:07.245 DEBUG filcrypto::util::types > generate_piece_commitment: start
---------
2023-11-16T17:14:08.730 DEBUG filcrypto::util::types > generate_data_commitment: end
AddPiece 1.513327102s (338.3 MiB/s)
baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq 536870912

2. 
./lotus-bench simple precommit1 --sector-size 512M s-unsealed s-sealed s-cache baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq 536870912
2023-11-16T17:14:22.807Z	INFO	lotus-bench	lotus-bench/main.go:110	Starting lotus-bench
2023-11-16T17:14:22.808 DEBUG filcrypto::util::types > seal_pre_commit_phase1: start
2023-11-16T17:14:22.808 INFO filecoin_proofs::api::seal > seal_pre_commit_phase1:start: SectorId(1)
2023-11-16T17:14:24.838 INFO storage_proofs_porep::stacked::vanilla::proof > replicate_phase1
-----
2023-11-16T17:17:08.423 DEBUG filcrypto::util::types > seal_pre_commit_phase1: end
PreCommit1 2m45.615351693s (3.092 MiB/s)
eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOiJBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBPSIsImNvbW1fZCI6WzU3LDg2LDE0LDEyMywxOSwxNjksNTksNywxNjIsNjcsMjUzLDM5LDMyLDI1NSwxNjcsMjAzLDYyLDI5LDQ2LDgwLDkwLDE3OSw5OCwxNTgsMTIxLDI0NCw5OSwxOSw4MSw0NCwyMTgsNl0sImNvbmZpZyI6eyJpZCI6InRyZWUtZCIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjowLCJzaXplIjozMzU1NDQzMX0sImxhYmVscyI6eyJTdGFja2VkRHJnNTEyTWlCVjEiOnsiX2giOm51bGwsImxhYmVscyI6W3siaWQiOiJsYXllci0xIiwicGF0aCI6InMtY2FjaGUiLCJyb3dzX3RvX2Rpc2NhcmQiOjAsInNpemUiOjE2Nzc3MjE2fSx7ImlkIjoibGF5ZXItMiIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjowLCJzaXplIjoxNjc3NzIxNn1dfX0sInJlZ2lzdGVyZWRfcHJvb2YiOiJTdGFja2VkRHJnNTEyTWlCVjFfMSJ9

3.
GOLOG_LOG_LEVEL=debug RUST_LOG=trace ./lotus-bench simple precommit2 --sector-size 512MiB --external-pc2 './pc2 -b 512MiB -c supra_seal.cfg -i "${EXTSEAL_PC2_CACHE}" -o "${EXTSEAL_PC2_CACHE}" -d "${EXTSEAL_PC2_UNSEALED}" && rm -f "${EXTSEAL_PC2_SEALED}" && mv "${EXTSEAL_PC2_CACHE}/sealed-file" "${EXTSEAL_PC2_SEALED}"' s-sealed s-cache eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOiJBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBPSIsImNvbW1fZCI6WzE0OCwxMTMsNjQsMTE3LDEyMSwxNjQsMTMsMTY3LDIxNyw3MSwyMiwyMTMsODksMTgxLDExOCwxOTksMTUsMTM3LDEzMiwzOSwyMjEsMjQsMTcsMTIwLDEzNSwyMzgsMTAxLDUwLDcxLDc5LDE2Miw2M10sImNvbmZpZyI6eyJpZCI6InRyZWUtZCIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjozMzU1NDQzMX0sImxhYmVscyI6eyJTdGFja2VkRHJnNTEyTWlCVjEiOnsiX2giOm51bGwsImxhYmVscyI6W3siaWQiOiJsYXllci0xIiwicGF0aCI6InMtY2FjaGUiLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjE2Nzc3MjE2fSx7ImlkIjoibGF5ZXItMiIsInBhdGgiOiJzLWNhY2hlIiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxNjc3NzIxNn1dfX0sInJlZ2lzdGVyZWRfcHJvb2YiOiJTdGFja2VkRHJnNTEyTWlCVjFfMSJ9
------
Partition 0 took 0 seconds (gpu 0, cpu 0)
pc2 took 3 seconds utilizing 87381.3 iOPS
------
PreCommit2 7.664052468s (66.81 MiB/s)
d:baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq r:bagboea4b5abcbztu2gpgzz746m537wntioqm5mjnfay5dwsugfqyshv4zljmnwyb

4.
./lotus-bench simple commit1 --sector-size 512MiB /home/misty/benchdir/s-sealed /home/misty/benchdir/s-cache baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq bagboea4b5abcbztu2gpgzz746m537wntioqm5mjnfay5dwsugfqyshv4zljmnwyb /home/misty/benchdir/c1.json
2023-11-16T17:28:31.911Z	INFO	lotus-bench	lotus-bench/main.go:110	Starting lotus-bench
2023-11-16T17:28:31.912 DEBUG filcrypto::util::types > seal_commit_phase1: start
----
2023-11-16T17:28:31.970 INFO filecoin_proofs::api::seal > seal_commit_phase1:finish: SectorId(1)
2023-11-16T17:28:31.971 DEBUG filcrypto::util::types > seal_commit_phase1: end
Commit1 59.446154ms (8.411 GiB/s)

I got this on the C2 part of the bench though, but it's unclear if this is just the lotus-bench code being weird.

./lotus-bench simple commit2 /home/misty/benchdir/c1.json
2023-11-16T17:51:54.256 DEBUG filcrypto::util::types > seal_commit_phase2: start
2023-11-16T17:51:54.257 INFO filecoin_proofs::api::seal > seal_commit_phase2:start: SectorId(1)
2023-11-16T17:51:54.257 DEBUG filcrypto::util::types > seal_commit_phase2: end
2023-11-16T17:51:54.257Z	WARN	lotus-bench	lotus-bench/main.go:129	Invalid porep challenge seed

^^ Update on the C2 error here: it was fixed with #11429

@magik6k
Contributor Author

magik6k commented Nov 27, 2023

Some testing which would be nice to do to gain confidence:

On a devnet or calib:

  • Seal sectors with FFI built without FFI_USE_FIXED_ROWS_TO_DISCARD
    • With deals
      • Make sure it PoSts
    • CC
      • Make sure it PoSts
      • Snap up that sector
        • Make sure it PoSts
  • Seal sectors with FFI built with FFI_USE_FIXED_ROWS_TO_DISCARD
    • With deals
      • Make sure it PoSts
    • CC
      • Make sure it PoSts
      • Snap up that sector
        • Make sure it PoSts
  • Seal sectors with FFI built with FFI_USE_FIXED_ROWS_TO_DISCARD with SN PC2
    • With deals
      • Make sure it PoSts
    • CC
      • Make sure it PoSts
      • Snap up that sector
        • Make sure it PoSts
  • Make sure that sectors sealed with FFI built without FFI_USE_FIXED_ROWS_TO_DISCARD can be posted with ffi with FFI_USE_FIXED_ROWS_TO_DISCARD=1

@magik6k magik6k marked this pull request as ready for review November 27, 2023 14:32
@rjan90
Contributor

rjan90 commented Nov 28, 2023

Testing in Butterfly-network:

  • Seal sectors with FFI built without FFI_USE_FIXED_ROWS_TO_DISCARD=1
    • With deals
    /storage/cache/s-t01008-2
    ls
    p_aux  sc-02-data-tree-r-last.dat  t_aux
    ------
     /storage/unsealed/
    ls
    fetching  s-t01008-2
    
    • Make sure it PoSts
    • CC
    /storage/cache/s-t01008-1
    ls
    p_aux  sc-02-data-tree-r-last.dat  t_aux
    
    • Make sure it PoSts
    • Snap up that sector
      • Make sure it PoSts
  • Seal sectors with FFI built with FFI_USE_FIXED_ROWS_TO_DISCARD=1
    • With deals
    /storage/cache/s-t01008-7
    ls
    p_aux  sc-02-data-tree-r-last.dat
    ------------
    /storage/unsealed/
    ls
    fetching  s-t01008-2  s-t01008-4  s-t01008-7
    
    • Make sure it PoSts
    • CC
    /storage/cache/s-t01008-0
    ls
    p_aux  sc-02-data-tree-r-last.dat
    
    • Make sure it PoSts
    • Snap up that sector
    storage/update-cache/s-t01008-0
    ls
    p_aux  sc-02-data-tree-r-last.dat
    ------
    /storage/update/
    ls
    fetching  s-t01008-0
    
    • Make sure it PoSts
  • Seal sectors with FFI built with FFI_USE_FIXED_ROWS_TO_DISCARD=1 with SN PC2
    • With deals
    storage/cache/s-t01008-19
    ls
    p_aux  sc-02-data-tree-r-last.dat
    -----
    /storage/unsealed
    ls
    fetching s-t01008-19
    
    • Make sure it PoSts
    • CC
      storage/cache/s-t01008-24
      ls
      p_aux  sc-02-data-tree-r-last.dat
      
      • Make sure it PoSts
      • Snap up that sector
      /storage/update
      ls
      fetching  s-t01008-0  s-t01008-24
      ------
      storage/update-cache/s-t01008-24
      ls
      p_aux  sc-02-data-tree-r-last.dat
      
      • Make sure it PoSts
  • Make sure that sectors sealed with FFI built without FFI_USE_FIXED_ROWS_TO_DISCARD=1 can be posted with ffi with FFI_USE_FIXED_ROWS_TO_DISCARD=1

@vmx
Contributor

vmx commented Nov 28, 2023

  • Make sure that sectors sealed with FFI built without FFI_USE_FIXED_ROWS_TO_DISCARD=1 can be posted with ffi with FFI_USE_FIXED_ROWS_TO_DISCARD=1

With default values it's expected to work. But if you set FIL_PROOFS_ROWS_TO_DISCARD to a value other than 2 on 32GiB sectors it might fail (it might not be an issue on 512MiB sectors though). => It's expected that FFI_USE_FIXED_ROWS_TO_DISCARD is either set or not, you shouldn't operate in a world where some parts use it and some don't.

@rjan90
Contributor

rjan90 commented Nov 28, 2023

Make sure that sectors sealed with FFI built without FFI_USE_FIXED_ROWS_TO_DISCARD=1 can be posted with ffi with FFI_USE_FIXED_ROWS_TO_DISCARD=1

It's expected that FFI_USE_FIXED_ROWS_TO_DISCARD is either set or not, you shouldn't operate in a world where some parts use it and some don't.

I think the testing here is more targeted towards current storage providers that have been running on mainnet without setting any FIL_PROOFS_ROWS_TO_DISCARD configs (i.e. with default settings). We want to make sure that they are now able to use SupraSeal PC2 with the FFI_USE_FIXED_ROWS_TO_DISCARD=1 setting, even if they have committed and are proving sectors that have the t_aux file.

So just to be entirely sure, that is expected to work right @vmx ?

@vmx
Contributor

vmx commented Nov 28, 2023

So just to be entirely sure, that is expected to work right @vmx ?

Yes that is expected to work.

@vmx
Contributor

vmx commented Nov 28, 2023

Ideally they just run with FFI_USE_FIXED_ROWS_TO_DISCARD=1 even if they don't use SupraSeal.

@snadrus
Collaborator

snadrus commented Nov 29, 2023

@magik6k

@rjan90
Contributor

rjan90 commented Dec 1, 2023

All the test cases have been completed now. There are some unreliabilities when sealing deal sectors with the SupraSeal PC2 binary called through lotus-worker, but those can be investigated and fixed in a subsequent PR and should not block this PR from landing.

Running with and without FFI_USE_FIXED_ROWS_TO_DISCARD=1 and the new proofs release works well across the range of sealing types. All sectors that have been sealed either with or without it are able to PoSt, and SupraSeal PC2 works well when sealing CC sectors (both SynthPoRep and non-SynthPoRep).

@rjan90 rjan90 merged commit 2c00b5d into master Dec 1, 2023
87 checks passed
@rjan90 rjan90 deleted the feat/snpc2 branch December 1, 2023 11:32
@Stebalien Stebalien mentioned this pull request Jan 8, 2024
8 tasks
Development

Successfully merging this pull request may close these issues.

Supranational brownfield PC2 <> Lotus-Miner
6 participants