Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DebugBinds/Query: Reduce monomorphization overhead #4417

Merged
merged 5 commits into from
Jan 3, 2025

Conversation

Turbo87
Copy link
Contributor

@Turbo87 Turbo87 commented Jan 2, 2025

The output of cargo llvm-lines --package diesel_tests --test integration_tests --profile test --features postgres showed the Debug and Display implementations of the DebugBinds and DebugQuery structs as two of the main code size reasons, slowing down the compilation of applications. This commit significantly reduces the LLVM IR lines of the two structs, which should hopefully improve compile times a little bit (see commit messages for detailed numbers).

/cc @weiznich

Related:

Comment on lines 64 to 69
// This is not using the `?` operator to reduce the code size of this
// function, which is getting copies a lot due to monomorphization.
let query = match self.query() {
Ok(query) => query,
Err(err) => return Err(err),
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use a tri!() macro instead (see https://github.com/serde-rs/serde/blob/v1.0.217/serde/src/lib.rs#L290-L301), but that slightly bumped the numbers upwards again, so I'm not sure whether it's worth it for the readability benefit.

The output of `cargo llvm-lines --package diesel_tests --test integration_tests --profile test --features postgres` showed this struct as one of the main code size reasons, slowing down the compilation of applications. This commit brings the LLVM IR lines of the `Debug` implementation from 104763 down to 46809 (-55.3%).
The output of `cargo llvm-lines --package diesel_tests --test integration_tests --profile test --features postgres` showed this struct as one of the main code size reasons, slowing down the compilation of applications. This commit brings the LLVM IR lines of the `Debug` and `Display` implementations from 150633 down to 139574 (-7.3%).
@weiznich weiznich requested a review from a team January 3, 2025 09:34
@weiznich
Copy link
Member

weiznich commented Jan 3, 2025

Thanks for opening this PR. Such optimizations are really appreciated. I wonder whether it might be meaningful to use a similar strategy here to what we already do for the statement cache:

/// Implemented for all `QueryFragment`s, dedicated to dynamic dispatch within the context of
/// `statement_cache`
///
/// We want the generated code to be as small as possible, so for each query passed to
/// [`StatementCache::cached_statement`] the generated assembly will just call a non generic
/// version with dynamic dispatch pointing to the VTABLE of this minimal trait
///
/// This preserves the opportunity for the compiler to entirely optimize the `construct_sql`
/// function as a function that simply returns a constant `String`.
#[allow(unreachable_pub)]
#[cfg_attr(
docsrs,
doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub trait QueryFragmentForCachedStatement<DB> {
/// Convert the query fragment into a SQL string for the given backend
fn construct_sql(&self, backend: &DB) -> QueryResult<String>;
/// Check whether it's safe to cache the query
fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool>;
}
impl<T, DB> QueryFragmentForCachedStatement<DB> for T
where
DB: Backend,
DB::QueryBuilder: Default,
T: QueryFragment<DB>,
{
fn construct_sql(&self, backend: &DB) -> QueryResult<String> {
let mut query_builder = DB::QueryBuilder::default();
self.to_sql(&mut query_builder, backend)?;
Ok(query_builder.finish())
}
fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool> {
<T as QueryFragment<DB>>::is_safe_to_cache_prepared(self, backend)
}
}

That would mean that instead of storing a reference to the generic query type we would rather just store a reference to a trait object in DebugQuery and DebugBinds, which should hopefully allow to minimize the number of generated lines of code even further. If you have time and motivation to try that this would be really appreciated, otherwise it shouldn't be a problem to merge this PR as it is.

@Turbo87
Copy link
Contributor Author

Turbo87 commented Jan 3, 2025

instead of storing a reference to the generic query type we would rather just store a reference to a trait object in DebugQuery and DebugBinds

yeah, I thought about that as well, but I wasn't sure whether the performance overhead of the Box would be acceptable (or how to measure it... 😅)

This commit replaces the generic `&'a T` field in `DebugBinds` with `&'a dyn QueryFragment<DB>`. This causes the number of LLVM copies of this struct to plummet from 743 to 1, since it is now only compiled once per backend implementation.
@Turbo87
Copy link
Contributor Author

Turbo87 commented Jan 3, 2025

hmm, in general this seems to work, but https://github.com/diesel-rs/diesel/blob/3f69aa4570c505a94889e3c3de4e83e13bc6a37d/diesel/src/query_builder/insert_statement/insert_with_default_for_sqlite.rs isn't particularly happy about the removal of the generic T parameter. any advice on how to deal with this special case?

@weiznich
Copy link
Member

weiznich commented Jan 3, 2025

Thanks for the update. I think removing the generic T type is unfortunately not possible, as that would be a breaking change. What we can do is to just put that type into a PhantomData and then have all the methods on that generic type call some static method without the generic parameter T internally. That should reduce the amount of duplication to the minimum while keeping things stable to the outside.

For a future diesel 3.0 version we might want to remove the T type there, but that's something for the future.,

@Turbo87 Turbo87 force-pushed the mono branch 2 times, most recently from 5c7c508 to e24912a Compare January 3, 2025 10:59
@Turbo87
Copy link
Contributor Author

Turbo87 commented Jan 3, 2025

just put that type into a PhantomData

that didn't work since the code in https://github.com/diesel-rs/diesel/blob/3f69aa4570c505a94889e3c3de4e83e13bc6a37d/diesel/src/query_builder/insert_statement/insert_with_default_for_sqlite.rs is accessing self.query.foo directly, but I've found another way to shrink it down :)

@Turbo87
Copy link
Contributor Author

Turbo87 commented Jan 3, 2025

total LLVM IR lines as reported by the above command:

  • before: 2_024_074
  • after: 1_740_806 (-14%)

Copy link
Member

@weiznich weiznich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this and thanks for providing these numbers ❤️

@weiznich weiznich added this pull request to the merge queue Jan 3, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 3, 2025
@Turbo87
Copy link
Contributor Author

Turbo87 commented Jan 3, 2025

/usr/bin/ld: cannot find -lsqlite3: No such file or directory

that seems unrelated to my changes 😅

@weiznich
Copy link
Member

weiznich commented Jan 3, 2025

/usr/bin/ld: cannot find -lsqlite3: No such file or directory

that seems unrelated to my changes 😅

Yes, seems like the CI image got updated and something is now missing. I need to fix that 😞

@weiznich weiznich enabled auto-merge January 3, 2025 18:01
@weiznich weiznich added this pull request to the merge queue Jan 3, 2025
Merged via the queue into diesel-rs:master with commit cec3cd0 Jan 3, 2025
48 checks passed
@Turbo87 Turbo87 deleted the mono branch January 4, 2025 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants