Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Facilitate tracing from WASM #5826

Closed
wants to merge 91 commits into from
Closed

Facilitate tracing from WASM #5826

wants to merge 91 commits into from

Conversation

mattrutherford
Copy link
Contributor

@mattrutherford mattrutherford commented Apr 29, 2020

Add --wasm-tracing CLI flag to enable tracing from WASM.

This must be used in conjunction with --tracing-targets (and optionally --tracing-receiver).

Add 2 new host functions to allow tracing span enter and exit. The tracing API requires span target and name to be const, so it's not possible to create spans directly with the actual target and name; instead all are created with a single pre-defined identifier, with the span's fields containing the actual target and name, which is then patched in the tracing Subscriber in client.

Each span is stored in the TracingProxy along with it's associated guard until a call to exit is received, at which point it's dropped, triggering a span exit call to the tracing Subscriber.

Each host function call can take around 300ns, although only the second call will be included in the measurement. The total cost of a wasm trace that's incorporated into the measurement is potentially around 0.4µs.

@parity-cla-bot
Copy link

It looks like @mattrutherford signed our Contributor License Agreement. 👍

Many thanks,

Parity Technologies CLA Bot

@mattrutherford mattrutherford added the A3-in_progress Pull request is in progress. No review needed at this stage. label Apr 29, 2020
@mattrutherford
Copy link
Contributor Author

mattrutherford commented Apr 29, 2020

Here are some graphs for the first few minutes of running a dev chain:

Screenshot 2020-04-29 at 14 36 25
Screenshot 2020-04-29 at 14 36 36

@mattrutherford mattrutherford added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels May 6, 2020
@mattrutherford mattrutherford changed the title [WIP] sp-tracing Proxy - to facilitate tracing span creation and exit from WASM Proxy to facilitate tracing from WASM environment May 6, 2020
primitives/tracing/src/proxy.rs Outdated Show resolved Hide resolved
primitives/tracing/src/proxy.rs Outdated Show resolved Hide resolved
primitives/tracing/src/span_dispatch.rs Outdated Show resolved Hide resolved
primitives/tracing/src/proxy.rs Outdated Show resolved Hide resolved
frame/support/src/wasm_tracing.rs Outdated Show resolved Hide resolved
frame/support/src/wasm_tracing.rs Outdated Show resolved Hide resolved
primitives/tracing/src/span_dispatch.rs Outdated Show resolved Hide resolved
@mattrutherford
Copy link
Contributor Author

mattrutherford commented May 7, 2020

Just updated this PR to use a different way to create tracing spans, rather than rely on a brittle mechanism of pre-registered spans, this way reserves a special span name WASM_TRACE_IDENTIFIER to signal to the subscriber in client::tracing that the actual span name and target should instead be extracted from the spans' associated Fields. While still a compromise, I think that it's a more manageable one - just requiring that we reserve 3 identifiers - one for span name and two more for associated fields.

Copy link
Contributor

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that much about how tracing works, but looks good to me if it works :)

client/tracing/src/lib.rs Outdated Show resolved Hide resolved
@mattrutherford
Copy link
Contributor Author

I don't know that much about how tracing works, but looks good to me if it works :)

It works and while not as pure from the POV of tracing, I think it's a better solution for substrate at the moment.

@mattrutherford mattrutherford added the C1-low PR touches the given topic and has a low impact on builders. label Jun 11, 2020
Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last nitpicks, after that we should be okay.

client/tracing/src/lib.rs Outdated Show resolved Hide resolved
client/tracing/src/lib.rs Outdated Show resolved Hide resolved
client/tracing/src/lib.rs Outdated Show resolved Hide resolved
client/tracing/src/lib.rs Show resolved Hide resolved
client/tracing/src/lib.rs Outdated Show resolved Hide resolved
client/tracing/src/lib.rs Outdated Show resolved Hide resolved
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
client/tracing/src/lib.rs Outdated Show resolved Hide resolved
@bkchr
Copy link
Member

bkchr commented Jun 15, 2020

Ty for your work! :)

@mattrutherford
Copy link
Contributor Author

Thanks all for helping get this one done!

Made final check with Polkadot master and should be ready to merge now.

Just reiterating that this runtime will be incompatible with earlier versions of the client due to the new host functions utilised by it. (Client upgrade is critical on runtime upgrade.)

@gavofyork
Copy link
Member

hmm... is there no way to make it compatible assuming we're not using the new runtime functions? i can't really merge this any time with our current need to have some degree of release stability and a fairly regular upgrade schedule across two live chains.

@gavofyork gavofyork added A1-onice and removed A0-please_review Pull request needs code review. labels Jun 16, 2020
@bkchr
Copy link
Member

bkchr commented Jun 16, 2020

But what is the problem? We just need to make sure we have a node release before applying the runtime upgrade.

@mattrutherford
Copy link
Contributor Author

Just to also clarify that the client is compatible with earlier runtime versions (currently syncing Polkadot CC1 with client based from this branch).

@gavofyork
Copy link
Member

gavofyork commented Jun 17, 2020

But what is the problem? We just need to make sure we have a node release before applying the runtime upgrade.

The problem is that we have two production networks with several hundreds of validators between them, as well as a tendency to do runtime upgrades from master branch every day or two. As soon as I merge this, then I won't be able to make a runtime ugrade on Polkadot until everybody has upgraded their nodes, same for Kusama. Requiring validators to upgrade on such a short timespan, or hamstringing myself so that I can't push upgrades from master branch is not something that I want to do outside of a real emergency.

@mattrutherford
Copy link
Contributor Author

@gavofyork Would you agree to splitting this PR into two?

1st stage include the new host functions and related features in client (minus CLI options).

2nd stage at some point in future we can then add the runtime features to make use of them, also adding the CLI parameters to enable it.

This would allow us to give whatever window seems appropriate to allow everyone to upgrade.

@gavofyork
Copy link
Member

Perfect.

@gnunicorn
Copy link
Contributor

closing because of inactivity.

@gnunicorn gnunicorn closed this Sep 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants