Profiling the tx builder showed that even after we removed the
fake_key_root construction in #214 due to terrible performance in asm.js
(#213) that the cryptoxide calls within `fake_full_tx()` were up a
ridiculous amount of the runtime. I investigated this as we had a report
that it was taking up to hundreds of milliseconds just to make a tx and
I wanted to see if there would be any significant improvement by
migrating to an idiomatic rust API as discussed in #276 but from a
purely performance perspective in a release build (opt-level=3) this seemed to
be pretty minor, even with the unnecessary cloning forced by this
inside of the implementation.
This will need to be slightly modified once #273 gets merged as there are
some conflicts.
Previous perf results from building 100,000 simple txs and serializing
them:
```
- 53.64% [.] <&cryptoxide::curve25519::Fe as
core::ops::arith::Mul>::mul
- 53.63% 0xffffffffffffffff
+ 23.03% cardano_serialization_lib::tx_builder::min_fee
- 21.32%
cardano_serialization_lib::tx_builder::TransactionBuilder::add_change_if_needed
- 15.21%
cardano_serialization_lib::tx_builder::TransactionBuilder::fee_for_output
cardano_serialization_lib::tx_builder::min_fee
+ cardano_serialization_lib::tx_builder::fake_full_tx
+ 6.11%
cardano_serialization_lib::tx_builder::TransactionBuilder::min_fee
(inlined)
- 7.78% ser_lib_perf::main
+ 6.11%
cardano_serialization_lib::tx_builder::TransactionBuilder::build
+ 1.67%
cardano_serialization_lib::tx_builder::TransactionBuilder::add_change_if_needed
+ 1.46% std::rt::lang_start_internal
- 19.40% [.] cryptoxide::curve25519::Fe::square
- 0xffffffffffffffff
- 8.34% cardano_serialization_lib::tx_builder::min_fee
cardano_serialization_lib::tx_builder::TransactionBuilder::build
cardano_serialization_lib::tx_builder::TransactionBuilder::build_and_size
+ cardano_serialization_lib::tx_builder::fake_full_tx
- 7.41%
cardano_serialization_lib::tx_builder::TransactionBuilder::add_change_if_needed
+ 5.58%
cardano_serialization_lib::tx_builder::TransactionBuilder::fee_for_output
+ 1.83%
cardano_serialization_lib::tx_builder::TransactionBuilder::min_fee
(inlined)
+ 2.77% ser_lib_perf::main
+ 0.88% std::rt::lang_start_internal
+ 7.80% [.] cryptoxide::curve25519::GePrecomp::maybe_set
+ 5.13% [.] cryptoxide::curve25519::Fe::invert
+ 3.98% [.] <&cryptoxide::curve25519::GeP3 as
core::ops::arith::Add<&cryptoxide::curve25519::GePrecomp>>::add
+ 2.84% [.] cryptoxide::curve25519::GePrecomp::select
+ 1.22% [.] cryptoxide::curve25519::ge_scalarmult_base
+ 1.00% [.]
cryptoxide::sha2::impl512::reference::digest_block_u64
0.47% [.] cryptoxide::curve25519::GeP2::dbl
0.13% [.] cryptoxide::curve25519::sc_muladd
0.12% [.] cryptoxide::curve25519::Fe::to_bytes
0.12% [.] cryptoxide::curve25519::sc_reduce
0.10% [.] cbor_event::se::Serializer<W>::write_type
0.07% [.]
cardano_serialization_lib::tx_builder::fake_full_tx
```
which shows that almost the entire runtime was spent on cryptoxide calls
which were only necessary for `fake_full_tx()`.
Running it again afterwards there is no overwhelming bottleneck anymore
and the remaining runtime was fairly distributed.