Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] Allow composable tower stack for Connector service #2490

Closed
jlizen opened this issue Dec 2, 2024 · 5 comments · Fixed by #2496
Closed

[Feature request] Allow composable tower stack for Connector service #2490

jlizen opened this issue Dec 2, 2024 · 5 comments · Fixed by #2496
Labels
C-feature Category: feature. This is requesting a new feature.

Comments

@jlizen
Copy link
Contributor

jlizen commented Dec 2, 2024

I'd be interested in contributing this feature. Sharing for feedback, hoping to get rolling with impl next week.

Feature request

Currently, the Connector is modeled as a tower service:

impl Service<Uri> for Connector {
    type Response = Conn;
    type Error = BoxError;
    type Future = Connecting;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, dst: Uri) -> Self::Future {
        log::debug!("starting new connection: {dst:?}");
        let timeout = self.timeout;
        for prox in self.proxies.iter() {
            if let Some(proxy_scheme) = prox.intercept(&dst) {
                return Box::pin(with_timeout(
                    self.clone().connect_via_proxy(dst, proxy_scheme),
                    timeout,
                ));
            }
        }

        Box::pin(with_timeout(
            self.clone().connect_with_maybe_proxy(dst, false),
            timeout,
        ))
    }
}

However, callers do not have a way to wrap this service with tower layers. Adding an attachment point would allow a lot of flexibility without many changes required on reqwest's end.

Examples applications:

  • send connection tasks to a different threadpool to optimize for blocking futures
  • custom logging and telemetry based on connection establishment

The context of this ask is that I've been supporting some teams at my work that are running into performance issues when they are establishing very large numbers of connections very quickly with reqwest. This is because the tls backend crate calls, while modeled as async, ultimately are compute-heavy and block the executor. The workaround we've identified is to spawn a secondary threadpool and delegate those blocking futures to it via channel. Initial testing of this approach via a local fork of reqwest has been promising.

We could accommodate the above use case with extensible tower layers on the connection, without being too opinionated about how to use them. I'm happy to also contribute example code showing the threadpool impl, for users that face similar bottlenecks. (I'm also working on some changes on the async tls crates to support this case better, natively).

Proposed Approach

We could add a method to the client builder that is something like: `

pub fn connection_layer(layer: impl Service<Uri>)->ClientBuilder {
    self.config.connection_layers.push(Box::new(layer));
    self
}

And then, inside the build() method, or maybe in a build method inside Connector, we could construct the inner service (how it currently is), add in any layers, in order, and then add in a final TimeoutLayer if present based on the pre-existing timeout config.

Questions / possible concerns

I'm on the fence about whether to model this tower service with all the layers as an inner field inside Connector, or to wrap the entire Connector. I need to play around with the impl still to see what works best. Open to feedback.

I also don't want to unnecessarily introduce dynamic dispatch in the case of no layers, which I can take pains to avoid at the cost of slightly more complexity. I'm not sure there is a way around introducing it otherwise, unless we want to force callers to use eg a specific set of known layers, which I don't think is a good idea.

@jlizen jlizen changed the title Feature request: Allow composable tower stack for Connector service [Feature request] Allow composable tower stack for Connector service Dec 2, 2024
@seanmonstar seanmonstar added the C-feature Category: feature. This is requesting a new feature. label Dec 3, 2024
@jlizen
Copy link
Contributor Author

jlizen commented Dec 7, 2024

I'm bumping into apparently a well-known limitation of trying to build type-erased tower service stacks for hyper - ultimately we need the service to be both Clone + Sync. Currently we are using BoxCloneService, but it is not Sync, unlike BoxService.

There is a PR open that addresses this to some extent: https://github.com/tower-rs/tower/pull/777/files
It is somewhat restrictive in that it requires all inner service bounds to be Sync.

We would prefer to be able to stick to the less restrictive way of getting Sync, via SyncWrapper, which is what the other type erasing wrappers are using. That essentially lets us tell the compiler, we can guarantee this struct is Sync because we locked down its access to only via &mut, meaning only part of the program can access it at once therefore it can't have synchronization issues cross-threads.

That loses Clone though. More discussion on why SyncWrapper<T> can't implement Clone if T is Clone here: Actyx/sync_wrapper#10

The gist is, we no longer are guaranteeing that the only access is mutable and therefore exclusive, since Clone accepts a shared reference. You would need to use a mutex or something like that, which is less restrictive than enforcing Sync bounds on contained services, but adds overhead.

@jlizen
Copy link
Contributor Author

jlizen commented Dec 8, 2024

Pinged on Discord to help get it reviewed: https://discord.com/channels/500028886025895936/1085253097460351066/1315105563889238087

I'm going to go ahead and pull that in locally. I prefer that to messing with synchronization primitives like Mutex. I don't imagine very many layers are relevant to the connection service, will be at odds with Sync bounds.

I'll keep an eye out for that being a sharp edge though as I construct some sample layer stacks.

@jlizen
Copy link
Contributor Author

jlizen commented Dec 8, 2024

As a side note, this will require bumping tower up from 0.4 to 0.5, but I'm not seeing much of concern so far.

@jlizen
Copy link
Contributor Author

jlizen commented Dec 8, 2024

The above PR led me to tower-request , which alekseysidorov@ has been working on.

The approach there is different, essentially it is wrapping the entire client flow rather than a specific subcomponent like the connector.

I think there probably is a place for that as well, not all callers will want to go so fine grained. They might just want for instance an overall concurrency limiter. I know @seanmonstar is thinking of some broader tower investments with reqwest.

Unfortunately I don't see any good contact info for alekseysidorov@ at first glance but will try to track him down to see if he is interested in joining forces. Seems like he is trying to solve similar problems.

@jlizen
Copy link
Contributor Author

jlizen commented Dec 9, 2024

I have it working by wrapping the custom layers in BoxCloneSyncService, but @arielb1 pointed out we could probably concretely type the whole thing without forcing callers to specify a generic, by defaulting to using the Identity layer. Good point! I had initially avoided the generic out of concern for breaking the API with the new generic bound.

That adds a bit of generic pollution internal to the library, but I think we can contain it, and then this removes the extra allocations per layer. Will play around with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature Category: feature. This is requesting a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants