Optimize the rust builder image #152

Closed
codefromthecrypt opened this issue Mar 30, 2021 · 3 comments

@codefromthecrypt
Contributor

codefromthecrypt commented Mar 30, 2021

Currently, "envoy extension build" is extremely slow in the rust cells of the test matrix. This hurts the dev experience and ties up CI, or else requires careful consideration of external caches.

Here's a run on a recent MacBook Pro:

--- PASS: TestGetEnvoyExtensionBuild (684.48s)
    --- PASS: TestGetEnvoyExtensionBuild/category=envoy.filters.http,_language=rust (217.16s)
    --- PASS: TestGetEnvoyExtensionBuild/category=envoy.filters.http,_language=tinygo (11.11s)
    --- PASS: TestGetEnvoyExtensionBuild/category=envoy.filters.network,_language=rust (212.49s)
    --- PASS: TestGetEnvoyExtensionBuild/category=envoy.filters.network,_language=tinygo (11.55s)
    --- PASS: TestGetEnvoyExtensionBuild/category=envoy.access_loggers,_language=rust (221.18s)
    --- PASS: TestGetEnvoyExtensionBuild/category=envoy.access_loggers,_language=tinygo (11.00s)

While some of this is unavoidable, I think our builder image hasn't been optimized and there are possibly many things to look at here.

One thing that could be handy is the somewhat typical "first run" inside the Dockerfile. Running a baseline command while building the image can ensure users of the image don't need to do a lot of downloading on their first build. This could be a cargo command.
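
As a rough sketch of that "first run" idea, and not the project's actual Dockerfile: the base image tag, the `/seed` directory, and the presence of a `Cargo.toml` and `Cargo.lock` in the build context are all assumptions made for illustration. `cargo fetch` only downloads dependencies; it does not compile them.

```dockerfile
# Hypothetical pre-seeded builder image; base image and paths are assumptions.
FROM rust:1.51

WORKDIR /seed

# Copy only the manifests so this layer is reused until dependencies change.
COPY Cargo.toml Cargo.lock ./

# cargo needs a target source file to accept the manifest; an empty lib is enough.
# Then download every locked dependency into the image's CARGO_HOME.
RUN mkdir src && touch src/lib.rs && cargo fetch --locked
```

Note that a seed like this only helps if builds use the image's own CARGO_HOME. Mounting a different directory over CARGO_HOME at run time would hide the downloaded crates.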

Again, I'm not sure what the best change should be, but some analysis should happen because the status quo bleeds dev time.

@codefromthecrypt
Contributor Author

This issue was obscured by overrides done in shell scripts. I opened #153 to hopefully save the next person from a similar fate. Regardless, it's likely worth looking into pre-seeding the image itself, as caching like this is complex and hides the poor execution time end users will perceive. That assumes the runs are actually faster after some sort of seed operation.

Another option could be to consider the normal rust sdk to see if it is better optimized, as there's likely a large effort involved in both maintaining a fork and optimizing it: https://github.com/proxy-wasm/proxy-wasm-rust-sdk

codefromthecrypt pushed a commit that referenced this issue Apr 1, 2021
Before, end-to-end test executions were very slow. Originally, there was
only one extension language. When this changed, the e2e tests took
longer than before, and there was no quick way to run all tests when
only changing one language. This is because the test case managed the
matrix of extension language.

The change here is to pull the extension language out to a different ENV
variable: `E2E_EXTENSION_LANGUAGE`. CI now manages this externally,
which allows quick identification of errors or performance issues on a
per-language basis. It also increases the incentive to run `make e2e`,
by removing the "language I'm not using" penalty.

It is understood that currently end-to-end tests happen to only test
extensions and that this optimizes for the current codebase, not a
future one that may have other matrix concerns. If such a topic happens
later, we can consider alternate approaches, and meanwhile enjoy the
isolation.

See #152

Signed-off-by: Adrian Cole <adrian@tetrate.io>
codefromthecrypt pushed a commit that referenced this issue Apr 2, 2021
…ocker

This moves special-casing to rust+macOS cell in a normal build matrix.
Issue #152 still needs investigation on how to get around the slowness.

Fixes #153
Fixes #145

Signed-off-by: Adrian Cole <adrian@tetrate.io>
mathetake pushed a commit that referenced this issue Apr 5, 2021
…ocker (#159)

This moves special-casing to rust+macOS cell in a normal build matrix.
Issue #152 still needs investigation on how to get around the slowness.

Fixes #153
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Takeshi Yoneda <takeshi@tetrate.io>
@codefromthecrypt
Contributor Author

@lizan FWIW here's what's going on as of the latest run:

grep Finished 3_Run\ e2e\ tests\ \(macos-latest\,\ rust\).txt 
2021-04-21T23:15:33.2248320Z     Finished dev [unoptimized + debuginfo] target(s) in 1m 52s
2021-04-21T23:17:18.6423800Z     Finished dev [unoptimized + debuginfo] target(s) in 22.57s
2021-04-21T23:19:07.5972280Z     Finished dev [unoptimized + debuginfo] target(s) in 21.44s
2021-04-21T23:22:07.5313240Z     Finished dev [unoptimized + debuginfo] target(s) in 22.18s
2021-04-21T23:23:55.2104940Z     Finished dev [unoptimized + debuginfo] target(s) in 21.23s
2021-04-21T23:25:45.9745390Z     Finished dev [unoptimized + debuginfo] target(s) in 20.94s
2021-04-21T23:35:36.7062050Z     Finished test [unoptimized + debuginfo] target(s) in 8m 30s
2021-04-21T23:43:43.4438180Z     Finished test [unoptimized + debuginfo] target(s) in 7m 51s
2021-04-21T23:51:39.2929100Z     Finished test [unoptimized + debuginfo] target(s) in 7m 43s

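To put numbers on the grep output above, here is a minimal sketch that totals the `Finished` durations per cargo profile. The helper names `finished_seconds` and `totals` are made up for illustration, and the parser only handles the `Xm Ys` and `Y.YYs` shapes seen in this log.

```rust
/// Parses the duration at the end of a cargo "Finished" line, e.g.
/// "... target(s) in 1m 52s" or "... target(s) in 22.57s", into seconds.
fn finished_seconds(line: &str) -> Option<f64> {
    let (_, tail) = line.rsplit_once(" in ")?;
    let mut total = 0.0;
    for part in tail.split_whitespace() {
        if let Some(minutes) = part.strip_suffix('m') {
            total += minutes.parse::<f64>().ok()? * 60.0;
        } else if let Some(seconds) = part.strip_suffix('s') {
            total += seconds.parse::<f64>().ok()?;
        } else {
            return None; // not a duration shape we know about
        }
    }
    Some(total)
}

/// Sums build time in seconds per cargo profile, returned as (dev, test).
fn totals(log: &str) -> (f64, f64) {
    let (mut dev, mut test) = (0.0, 0.0);
    for line in log.lines() {
        if let Some(secs) = finished_seconds(line) {
            if line.contains("Finished dev") {
                dev += secs;
            } else if line.contains("Finished test") {
                test += secs;
            }
        }
    }
    (dev, test)
}

fn main() {
    // The nine "Finished" lines from the grep output, timestamps dropped.
    let log = "\
Finished dev [unoptimized + debuginfo] target(s) in 1m 52s
Finished dev [unoptimized + debuginfo] target(s) in 22.57s
Finished dev [unoptimized + debuginfo] target(s) in 21.44s
Finished dev [unoptimized + debuginfo] target(s) in 22.18s
Finished dev [unoptimized + debuginfo] target(s) in 21.23s
Finished dev [unoptimized + debuginfo] target(s) in 20.94s
Finished test [unoptimized + debuginfo] target(s) in 8m 30s
Finished test [unoptimized + debuginfo] target(s) in 7m 51s
Finished test [unoptimized + debuginfo] target(s) in 7m 43s";
    let (dev, test) = totals(log);
    println!("dev: {:.2}s, test: {:.2}s", dev, test);
}
```

On those nine lines this comes to about 220 seconds across the six dev builds against 1444 seconds (roughly 24 minutes) across the three test builds, so the test profile is where most of the time goes.
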
The good news is that the performance is consistent; it isn't degrading or anything. The other good news is that a huge amount of the time could be recovered by improving the test execution or the dependencies it implies.

2021-04-21T23:27:04.4345820Z 2021-04-21T23:27:04.434142Z	info	running [/Users/runner/work/getenvoy/getenvoy/build/bin/darwin/amd64/getenvoy extension test --toolchain-container-options -v /Users/runner/work/_temp:/tmp/cargohome:delegated -e CARGO_HOME=/tmp/cargohome]
2021-04-21T23:27:07.7948160Z     Updating crates.io index
2021-04-21T23:27:30.0309290Z    Compiling log v0.4.14
2021-04-21T23:27:59.0575820Z    Compiling cfg-if v1.0.0
2021-04-21T23:27:59.9209110Z    Compiling ahash v0.4.7
2021-04-21T23:28:02.5245430Z    Compiling anyhow v1.0.40
2021-04-21T23:28:42.5995980Z    Compiling bitflags v1.2.1
2021-04-21T23:29:19.9386280Z    Compiling nanoserde-derive v0.1.16
2021-04-21T23:32:11.3751810Z    Compiling hashbrown v0.9.1
2021-04-21T23:32:18.2812840Z    Compiling nanoserde v0.1.25
2021-04-21T23:32:30.9243110Z    Compiling proxy-wasm-experimental v0.0.8
2021-04-21T23:32:36.9160550Z    Compiling envoy-sdk v0.2.0-alpha.1
2021-04-21T23:32:45.4809620Z    Compiling envoy-sample-http-filter v0.1.0 (/source)
2021-04-21T23:32:52.4690630Z    Compiling envoy-sample-http-filter-module v0.1.0 (/source/wasm/module)
2021-04-21T23:35:36.7062050Z     Finished test [unoptimized + debuginfo] target(s) in 8m 30s
2021-04-21T23:35:36.7693430Z      Running target/debug/deps/extension-980f149c2a244bee
2021-04-21T23:35:37.1981600Z 
2021-04-21T23:35:37.1982960Z running 1 test
2021-04-21T23:35:37.2805800Z test tests::should_initialize ... ok
2021-04-21T23:35:37.2807370Z 
2021-04-21T23:35:37.2809240Z test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.05s

What you'll notice is that even though our test is trivial (pasted below), its setup is extremely costly in terms of time.

use envoy::extension::{entrypoint, Module, Result};

use envoy_sample_http_filter::SampleHttpFilterFactory;

// Generate the `_start` function that will be called by `Envoy` to let
// WebAssembly module initialize itself.
entrypoint! { initialize }

/// Does one-time initialization.
///
/// Returns a registry of extensions provided by this module.
fn initialize() -> Result<Module> {
    Module::new().add_http_filter(|_instance_id| SampleHttpFilterFactory::default())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn should_initialize() {
        assert!(initialize().is_ok());
    }
}

While I'm new to rust, I did notice that the crate-type differs between the top-level crate (rlib) and wasm/module (cdylib). While swimming through rust compile optimization blogs, I also saw mention that dynamic libs can reduce the work involved: https://doc.rust-lang.org/reference/linkage.html

Hope the breadcrumbs help

@codefromthecrypt
Contributor Author

obviated by #200
