This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

availability-recovery: run CPU intensive work on separate thread #7411

Closed
Tracked by #26
sandreim opened this issue Jun 21, 2023 · 5 comments · Fixed by #7417
Labels: I10-optimisation (An enhancement to provide better overall performance in terms of time-to-completion for a task.), T4-parachains_engineering (This PR/Issue is related to Parachains performance, stability, maintenance.)

Comments

@sandreim (Contributor) commented Jun 21, 2023

Currently `reconstructed_data_matches_root` and `reconstruct_v1` burn up to 1 s of CPU time in the context of the recovery task: https://github.com/paritytech/polkadot/blob/master/node/network/availability-recovery/src/lib.rs#L956.

We should investigate why this takes so long for only 2.5 MB of data. We also need to move this CPU-intensive work onto a separate thread (`spawn_blocking`, rayon, or a spawned thread).

First image is `reconstructed_data_matches_root` + `reconstruct_v1`; second is just `reconstructed_data_matches_root`.

[Screenshot: profiler output, 2023-06-21 17:05:16]
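The spawn-thread option mentioned above can be sketched in std-only Rust. Note this is a minimal sketch, not the actual fix from #7417: `reconstruct` here is a hypothetical stand-in for the real erasure-coding call, and in the real async recovery task one would more likely hand the closure to something like `tokio::task::spawn_blocking` and await the handle rather than joining synchronously.

```rust
use std::thread;

// Hypothetical stand-in for the expensive step (`reconstruct_v1` +
// `reconstructed_data_matches_root` in the issue); here it just
// concatenates the chunks so the sketch is runnable.
fn reconstruct(chunks: Vec<Vec<u8>>) -> Vec<u8> {
    chunks.into_iter().flatten().collect()
}

// Offload the CPU-bound reconstruction to a dedicated OS thread so the
// executor driving the recovery task is not blocked for up to ~1 s.
fn reconstruct_on_thread(chunks: Vec<Vec<u8>>) -> Vec<u8> {
    let handle = thread::spawn(move || reconstruct(chunks));
    handle.join().expect("reconstruction thread panicked")
}

fn main() {
    let chunks = vec![vec![1u8, 2], vec![3, 4]];
    let data = reconstruct_on_thread(chunks);
    assert_eq!(data, vec![1, 2, 3, 4]);
}
```

Joining synchronously as above still blocks the caller; the point of `spawn_blocking` in an async context is that the executor thread is freed to poll other futures while the dedicated thread does the work.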
@sandreim added the I10-optimisation and T4-parachains_engineering labels on Jun 21, 2023
@burdges (Contributor) commented Jun 21, 2023

Also, the Kagome C implementation of our erasure coding is faster than our Rust implementation.

@ordian (Member) commented Jun 21, 2023

These numbers don't match what I observed in https://github.com/paritytech/polkadot/blob/master/erasure-coding/benches/README.md, but perhaps that was on a faster CPU. An easy (?) win might be refactoring the code in https://github.com/paritytech/reed-solomon-novelpoly/ to be friendlier to auto-vectorization (SIMD).

@sandreim (Contributor, Author) commented

Yes, it might be a slower CPU, since we are running this on GCP n2d instances.

@koute (Contributor) commented Jun 27, 2023

> Perhaps an easy (?) win would be refactoring the code: https://github.com/paritytech/reed-solomon-novelpoly/ to be autovectorizing-friendly (SIMD).

What might be worth testing is compiling with `-C target-cpu=native` to see if that makes a difference, since the baseline target won't use more recent SIMD instructions (which can provide quite a bit of a speedup).
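A sketch of that experiment, assuming a standard cargo project for the benchmarks (the exact bench invocation will depend on the repository's setup):

```shell
# Rebuild with the host CPU's full instruction set enabled, then
# rerun the erasure-coding benchmarks against the baseline build.
RUSTFLAGS="-C target-cpu=native" cargo build --release

# Inspect which target features (SIMD extensions) the compiler would
# enable for the host CPU, versus the generic baseline.
rustc --print cfg -C target-cpu=native | grep target_feature
```

Comparing the `grep` output with the default `rustc --print cfg` shows which newer SIMD extensions (e.g. AVX2) the baseline build is leaving unused.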

@vstakhov (Contributor) commented

I doubt that auto-vectorization by the compiler alone will speed up GF(2^n) operations significantly. It might be interesting to try the AVX-512 GFNI instructions, as they are designed exactly for that purpose.

5 participants