Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subscriber: rewrite Registry to use static arrays #2230

Open
davidbarsky opened this issue Jul 19, 2022 · 13 comments
Open

subscriber: rewrite Registry to use static arrays #2230

davidbarsky opened this issue Jul 19, 2022 · 13 comments

Comments

@davidbarsky
Copy link
Member

Feature Request

Crates

  • tracing-subscriber

Motivation

The typemap-based used by the Registry's to store extensions is a typemap that erases types, places them into a Box, and stores the box into a HashMap. While this makes it easy to get started with the Registry, this approach does have a noticeable overhead when used in contexts such as profiling (e.g., bevyengine/bevy#4892).

Proposal

Instead of using a HashMap for extensions, Layer types would be required to pre-declare extension types they'd be storing in the Registry and use arrays to store extensions, avoiding the boxing (and potentially, the HashMap lookup).

Alternatives

Don't do this, but I think that the performance benefits of this are non-trivial.

(Sorry this issue is a bit short, I had to run and most of the implementation details will be discovered while creating this API.)

@CAD97
Copy link
Contributor

CAD97 commented Jul 25, 2022

I think it should be possible to use sharded_slab::Pool to store the extensions as well as the always-present data, and to do so in a backwards-compatible manner.

The backwards-compatible scheme uses a global RwLock that's only write-locked when a new extension type is registered. This will result in higher initial contention as layers first access their extensions, but should easily amortize out, and an optional API to pre-register types can remove the need for write locks.

The exact same API works to require pre-registration; panic if the extension type is not registered, as this is clear programmer error.

Super roughly, the idea is to invert from AoS Pool<AnyMap![_]> to SoA AnyMap![Pool<_>]:

Registry:
  spans: Pool<SpanDataInner>
  extensions: RwLock<AnyMap![Pool<Option<_>>]>
  ..

SpanData:
  data: PoolRef<SpanDataInner>
  extensions: &RwLock<AnyMap![Pool<Option<_>>]>

SpanDataInner:
  extensions: SmallMap<TypeId, usize>,
  ..

SpanData.insert(self, val: T):
  ext = None
  exts = self.extensions.read()
  if exts.get(T) is None:
    exts = self.extensions.write()
    exts.insert(T, RwLock(Pool()))
    exts = self.extensions.read()
  ext = exts.get(T)
  ix = ext.insert(Some(T))
  self.data.extensions.insert(T, ix)

SpanData.get(self, T):
  exts = self.extensions.read()
  ext = exts.get(T)?
  ix = self.data.extensions.get(T)?
  ext.get(ix).as_ref()

@hawkw
Copy link
Member

hawkw commented Jul 25, 2022

I think it should be possible to use sharded_slab::Pool to store the extensions as well as the always-present data, and to do so in a backwards-compatible manner.

Yes, a scheme like what you're describing here has been on my TODO list for a while. I think I have a partially-finished branch making a change like this somewhere.

@hawkw
Copy link
Member

hawkw commented Jul 25, 2022

@CAD97, if you're interested in working on that, I would love a PR, and I can try to find what I wrote a while ago as a starting point, if you'd like it?

@CAD97
Copy link
Contributor

CAD97 commented Jul 25, 2022

I'd be happy to take a stab at it if you can find your old diff (and maybe even if you can't)

@CAD97
Copy link
Contributor

CAD97 commented Jul 26, 2022

Oh no, the problem with this approach is that ExtensionsMut needs to have and gives out &mut access to all of the extensions on that span, so it needs to hold the lock on each one... this is theoretically solvable, but it reduces the gain a significant amount, as ExtensionsMut needs to hold access to AnyMap![T => RefMut<T>] and isn't sufficient with just &mut AnyMap![T => usize].

Also, we don't have a Pool::get_mut; is that just an oversight or a limitation? Similarly, a remove (blocking version of clear) is necessary for this impl scheme.


If it's possible to store either of

  • type-erased sharded_slab::pool::RefMut<Opaque>
  • dyn-typed sharded_slab::pool::RefMut<dyn Any>

Then that storage to do the locking can be put into DataInner and borrowed by ExtensionsMut. Otherwise, ExtensionsMut would have to hold the anymap itself.

Alternatively, we could prove that extending &mut T beyond the lock is okay if ExtensionsMut is around since that theoretically does lock access?


Also, I don't think it's possible for outside crates to implement SpanData;

  • feature = "std" isn't purely additive, since it adds required methods to SpanData
  • SpanData::extensions returns a type which is not externally constructible
  • (Actually, SpanData is implementable by outside crates if the std feature isn't set)

And this means the only useful implementation of LookupSpan (i.e. one that doesn't just always fail lookup) is one that wraps another LookupSpan.

@CAD97
Copy link
Contributor

CAD97 commented Jul 26, 2022

So this is where I got in a quick test; the Data portion of it works, but the interface of Extensions[Mut] causes problems

master...CAD97:tracing:pool-exts

@CAD97
Copy link
Contributor

CAD97 commented Jul 26, 2022

Concept for a smaller incremental improvement(?):

Currently (IIUC), the AnyMap hashmap is reused, but the actual extensions' boxes aren't. A small improvement then is instead of AnyMap![T => T], store AnyMap![T => Option<T>] to reuse the extensions' boxes.

I'm going to see if this approach works sometime this week as well. Perhaps these boxes can even be free-list-pooled and moved into the extensions map (rather than coming as part of the span data), even if pooling the extensions directly didn't work out.

@davidbarsky
Copy link
Member Author

I wouldn't rule out a breaking change, @CAD97—if the performance improvement is as good as I think that it might be, then I think we should seriously considering release a tracing-subscriber 0.4 (alongside some other changes that have been queued up).

@hawkw
Copy link
Member

hawkw commented Jul 27, 2022

I definitely want to get an 0.4 out at some point, so if we can come up with a really solid implementation of this that requires a breaking change, I would happily cut an 0.4 release for it.

@davidbarsky
Copy link
Member Author

I think there's a few other items that merit a breaking release: https://github.com/tokio-rs/tracing/milestone/11

@CAD97
Copy link
Contributor

CAD97 commented Jul 27, 2022

breaking changes? #1612? #2048!?

I'll revive the branch and see what I can do to chase the lifetime errors out then 🙂

Also, https://twitter.com/mycoliza/status/1530356065610567686?s=20&t=q2_dwEk3_1GnWorF1F4pBg wants to get into a tracing-subscriber bump (in tracing-filter as CAD97/tracing-filter#13) to have EnvFilter utilize LookupSpan.

@davidbarsky
Copy link
Member Author

davidbarsky commented Jul 27, 2022

Oh, sorry! I meant that the breakage would be in tracing-subscriber, which can maybe be used as a dry-run-of-sorts for cross-version compatibility of Layers before employing that in tracing 0.2.

@davidbarsky
Copy link
Member Author

For context, the Bevy reports that they incur about 10μs per each span/event when profiling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants