[WIP] feat(sdk): use workers for inflight code #5547

Closed
wants to merge 1 commit into from

Contributor

@Chriscbr Chriscbr commented Jan 25, 2024

Implementing one approach from #4725

Checklist

  • Title matches Winglang's style guide
  • Description explains motivation and solution
  • Tests added (always)
  • Docs updated (only required for features)
  • Added pr/e2e-full label if this feature requires end-to-end testing

By submitting this pull request, I confirm that my contribution is made under the terms of the Wing Cloud Contribution License.

@monadabot
Contributor

Thanks for opening this pull request! 🎉
Please consult the contributing guidelines for details on how to contribute to this project.
If you need any assistance, don't hesitate to ping the relevant owner over Slack.

| Topic | Owner |
|---|---|
| Wing SDK and utility APIs | @chriscbr |
| Wing Console | @ainvoner, @skyrpex, @polamoros |
| JSON, structs, primitives and collections | @hasanaburayyan |
| Platforms and plugins | @hasanaburayyan |
| Frontend resources (website, react, etc) | @tsuf239 |
| Language design | @eladb |
| VSCode extension and language server | @markmcculloh |
| Compiler architecture, inflights, lifting | @yoav-steinberg |
| Wing Testing Framework | @tsuf239 |
| Wing CLI | @markmcculloh |
| Build system, dev environment, releases | @markmcculloh |
| Library Ecosystem | @chriscbr |
| Documentation | @hasanaburayyan |
| SDK test suite | @tsuf239 |
| Examples | @skorfmann |
| Wing Playground | @eladcon |

@Chriscbr
Contributor Author

Chriscbr commented Jan 26, 2024

Currently one blocker is this:

this.onStop = await this.sandbox.call("handle");

Our API for cloud.Service expects the inflight function to return another inflight function, which means a JavaScript function returning a closure. Function closures can't be serialized between worker threads (and also the API wouldn't hold water in the cloud!) so it probably needs to be reworked.
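
The serialization constraint can be checked directly. Here is a minimal sketch, assuming Node 17+ where `structuredClone` is a global. Worker messages are copied with the structured clone algorithm, and `structuredClone` applies the same algorithm in-process, so it indicates what `postMessage` would accept. `canCrossThreadBoundary` is a hypothetical helper for illustration, not part of the SDK.

```javascript
// Returns true if `value` could be copied to or from a worker thread.
function canCrossThreadBoundary(value) {
  try {
    structuredClone(value);
    return true;
  } catch (err) {
    // Functions (and therefore closures) raise a DataCloneError.
    return false;
  }
}

const config = { port: 8080, tags: ["a", "b"] }; // plain data
const onStop = async () => {}; // stand-in for the closure returned by "handle"

console.log(canCrossThreadBoundary(config)); // → true
console.log(canCrossThreadBoundary(onStop)); // → false
```

One possible direction, stated as an assumption rather than a decision: keep the stop closure inside the worker and have the host send a "stop" message instead of holding the function itself.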

cc @eladb

@monadabot
Contributor

Console preview environment is available at https://wing-console-pr-5547.fly.dev 🚀

Last Updated (UTC) 2024-01-26 00:05

@Chriscbr
Contributor Author

Chriscbr commented Jan 26, 2024

My other concern with this approach is it seems to undo a bunch of our work when it comes to stack trace source mapping. For example, with this Wing code:

bring cloud;
bring util;

let foo = inflight () => {
  throw "uh oh!";
};

test "inflight stuff" {
  foo();
}

Running wing test yields:

$ wing test main.w
fail ┌ main.wsim » root/env0/test:inflight stuff
     │ runtime error: uh oh!
     │ at handle ([worker eval]:1158:17)
     │ at $obj ([worker eval]:1153:65)
     │ at handle ([worker eval]:1134:17)
     │ at handler ([worker eval]:1186:25)
     └ at ([worker eval]:1191:20)
 
 
Tests 1 failed (1)
Test Files 1 failed (1)
Duration 0m0.71s

cc @MarkMcCulloh
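
The `[worker eval]` frames appear to come from how the worker is created rather than from the Wing code itself. Here is a small reproduction, assuming only Node's built-in `worker_threads` module (`stackFromEvalWorker` is a hypothetical name). A worker started from a source string with `eval: true` has no file path, so its frames are attributed to `[worker eval]`, which is presumably why the source-mapped locations are lost.

```javascript
const { Worker } = require("node:worker_threads");

// Starts a worker from a source string and resolves with the stack trace
// of an Error created inside it.
function stackFromEvalWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      `const { parentPort } = require("node:worker_threads");
       parentPort.postMessage(new Error("uh oh!").stack);`,
      { eval: true }
    );
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

// Frames point at "[worker eval]:<line>:<col>" instead of a file on disk.
stackFromEvalWorker().then((stack) => console.log(stack));
```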

@monadabot
Contributor

Benchmarks

Comparison to Baseline 🟥⬜⬜🟥⬜🟥⬜⬜⬜⬜⬜⬜⬜
| Benchmark | Before | After | Change |
|---|---|---|---|
| version | 73ms±0.37 | 76ms±0.84 | +3ms (+3.45%)🟥 |
| functions_1.test.w -t sim | 594ms±13.45 | 603ms±7.41 | +9ms (+1.53%)⬜ |
| functions_1.test.w -t tf-aws | 939ms±10.25 | 961ms±13.85 | +22ms (+2.33%)⬜ |
| jsii_big.test.w -t sim | 3334ms±24.52 | 3409ms±20.28 | +75ms (+2.25%)🟥 |
| jsii_big.test.w -t tf-aws | 3417ms±13.51 | 3429ms±21.41 | +11ms (+0.33%)⬜ |
| empty.test.w -t sim | 549ms±4.54 | 559ms±3.69 | +10ms (+1.74%)🟥 |
| empty.test.w -t tf-aws | 663ms±5.29 | 665ms±3.26 | +2ms (+0.32%)⬜ |
| jsii_small.test.w -t sim | 561ms±6.48 | 557ms±5.99 | -4ms (-0.72%)⬜ |
| jsii_small.test.w -t tf-aws | 665ms±2.87 | 668ms±4.7 | +3ms (+0.49%)⬜ |
| functions_10.test.w -t sim | 644ms±4.16 | 654ms±9.17 | +10ms (+1.62%)⬜ |
| functions_10.test.w -t tf-aws | 2297ms±8.05 | 2329ms±18.32 | +32ms (+1.39%)⬜ |
| hello_world.test.w -t sim | 581ms±4.26 | 590ms±3.77 | +8ms (+1.43%)⬜ |
| hello_world.test.w -t tf-aws | 1555ms±11.63 | 1554ms±6.66 | -1ms (-0.06%)⬜ |

⬜ Within 1.5 standard deviations
🟩 Faster, Above 1.5 standard deviations
🟥 Slower, Above 1.5 standard deviations

Benchmarks may vary outside of normal expectations, especially when running in GitHub Actions CI.

Results
| name | mean | min | max | moe | sd |
|---|---|---|---|---|---|
| version | 76ms | 74ms | 78ms | 1ms | 1ms |
| functions_1.test.w -t sim | 603ms | 591ms | 623ms | 7ms | 10ms |
| functions_1.test.w -t tf-aws | 961ms | 929ms | 993ms | 14ms | 19ms |
| jsii_big.test.w -t sim | 3409ms | 3377ms | 3478ms | 20ms | 28ms |
| jsii_big.test.w -t tf-aws | 3429ms | 3389ms | 3484ms | 21ms | 30ms |
| empty.test.w -t sim | 559ms | 550ms | 566ms | 4ms | 5ms |
| empty.test.w -t tf-aws | 665ms | 656ms | 672ms | 3ms | 5ms |
| jsii_small.test.w -t sim | 557ms | 549ms | 576ms | 6ms | 8ms |
| jsii_small.test.w -t tf-aws | 668ms | 651ms | 674ms | 5ms | 7ms |
| functions_10.test.w -t sim | 654ms | 646ms | 690ms | 9ms | 13ms |
| functions_10.test.w -t tf-aws | 2329ms | 2295ms | 2381ms | 18ms | 26ms |
| hello_world.test.w -t sim | 590ms | 583ms | 602ms | 4ms | 5ms |
| hello_world.test.w -t tf-aws | 1554ms | 1543ms | 1567ms | 7ms | 9ms |

Last Updated (UTC) 2024-01-26 00:11

Comment on lines +87 to +90
const worker = new Worker(shim, {
  env: this.options.env,
  eval: true,
});
Contributor

To fix sourcemaps, a more typical way to do this is to use a preload script instead of an eval'd shim:

const worker = new Worker(this.entrypoint, {
  env: this.options.env,
  // create a local file called sandbox-shim.ts with all the extra stuff needed
  // This file will be executed first, then the entrypoint
  execArgv: ["-r", require.resolve("./sandbox-shim")],
});

Comment on lines +65 to +70
Object.defineProperty(globalThis, "__dirname", {
  get: () => { throw new Error("__dirname cannot be used within bundled cloud functions"); },
});
Object.defineProperty(globalThis, "__filename", {
  get: () => { throw new Error("__filename cannot be used within bundled cloud functions"); },
});
Contributor

Doesn't esbuild already fail if we have any __dirname or __filename?

Contributor

This is actually a major limitation we have today. There's a good chance that existing user code will depend on these.

Any thoughts on how to address this?

Contributor Author

@MarkMcCulloh From what I can tell esbuild doesn't reject code with __dirname, it just silently leaves it (leading to possibly wrong runtime behavior).

It looks like it could be possible to support it through an esbuild plugin? evanw/esbuild#859
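
For illustration only, here is a sketch of what such a plugin might look like, assuming esbuild's plugin API (`setup`, and `build.onLoad` returning `contents` and `loader`). `pathGlobalsPlugin` and `injectPathGlobals` are hypothetical names, and this is not necessarily the approach discussed in the linked issue. Known gaps: it only handles plain `.js`/`.cjs`/`.mjs` files, it breaks if a file already declares these names, starts with a hashbang, or relies on a leading "use strict" directive, and the injected values are build-machine paths that may not exist where the bundle runs.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Prepends per-file constants on the same line as the first statement,
// so line numbers in the original source are unchanged.
function injectPathGlobals(source, filePath) {
  const header =
    `const __filename = ${JSON.stringify(filePath)}, ` +
    `__dirname = ${JSON.stringify(path.dirname(filePath))}; `;
  return header + source;
}

// Hypothetical esbuild plugin: rewrites each JavaScript file as it is loaded.
const pathGlobalsPlugin = {
  name: "path-globals",
  setup(build) {
    build.onLoad({ filter: /\.[cm]?js$/ }, (args) => ({
      contents: injectPathGlobals(fs.readFileSync(args.path, "utf8"), args.path),
      loader: "js",
    }));
  },
};

module.exports = { injectPathGlobals, pathGlobalsPlugin };
```

It would be passed as `plugins: [pathGlobalsPlugin]` to `esbuild.build`.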

Contributor

@eladb eladb left a comment

Intuitively, I still prefer to spin up a child process or a container per cloud.function and interact with it across the process boundary. I'm not sure what the benefit of workers is here if we use the same container on the cloud.

// it could be better to keep the worker alive and reuse it
// but this requires additional work to make sure logs between invocations
// are not mixed up, and timeouts are handled correctly
const worker = new Worker(shim, {
Contributor

This would be a regression. I think we need to retain and reuse one worker for all invocations in order to simulate lambda host reuse (#5478).
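
A rough sketch of what keeping one worker alive across invocations could look like, assuming Node's `worker_threads`. `ReusableWorker`, the `{ id, payload }` message shape, and the worker body are all hypothetical. Tagging each request with an id is one way to keep concurrent results apart; the log separation and timeout handling mentioned in the code comment above are not addressed here.

```javascript
const { Worker } = require("node:worker_threads");

// Hypothetical worker body. `invocations` survives across calls, which is
// the kind of state a reused (warm) host would keep.
const workerSource = `
  const { parentPort } = require("node:worker_threads");
  let invocations = 0;
  parentPort.on("message", ({ id, payload }) => {
    invocations++;
    parentPort.postMessage({ id, result: { payload, invocations } });
  });
`;

class ReusableWorker {
  constructor(source) {
    // eval: true keeps this sketch self-contained; a file path works too.
    this.worker = new Worker(source, { eval: true });
    this.nextId = 0;
    this.pending = new Map(); // request id -> resolve callback
    this.worker.on("message", ({ id, result }) => {
      const resolve = this.pending.get(id);
      this.pending.delete(id);
      if (resolve) resolve(result);
    });
  }

  // Sends one invocation and resolves with the matching reply.
  call(payload) {
    const id = this.nextId++;
    return new Promise((resolve) => {
      this.pending.set(id, resolve);
      this.worker.postMessage({ id, payload });
    });
  }

  stop() {
    return this.worker.terminate();
  }
}

const host = new ReusableWorker(workerSource);
Promise.all([host.call("first"), host.call("second")])
  .then((results) => console.log(results))
  .finally(() => host.stop());
```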

Contributor Author

I'm not sure whether this is an optimization that holds water. See #5549

@Chriscbr
Contributor Author

re: Workers vs. a separate Node process -- I ran a naive benchmark (scripts below), and at least superficially their startup performance was roughly the same. So I can't think of any strong reasons against it off the top of my head.

// bench-workers.js
const { Worker } = require('worker_threads');

const numThreads = 10; // Number of worker threads to spawn
let threadsStarted = 0;
const start = process.hrtime.bigint();

for (let i = 0; i < numThreads; i++) {
    const worker = new Worker(`
        const { parentPort } = require('worker_threads');
        parentPort.postMessage('Worker thread started');
    `, { eval: true });

    worker.on('message', () => {
        threadsStarted++;
        if (threadsStarted === numThreads) {
            const end = process.hrtime.bigint();
            console.log(`All ${numThreads} worker threads started in ${(end - start) / BigInt(1e6)} ms`);
        }
    });
}

// bench-process.js
const { fork } = require('child_process');

const numProcesses = 10; // Number of child processes to spawn
let processesStarted = 0;
const start = process.hrtime.bigint();

for (let i = 0; i < numProcesses; i++) {
    const child = fork('./childProcess.js');

    child.on('message', (msg) => {
        if (msg === 'Child process started') {
            processesStarted++;
            if (processesStarted === numProcesses) {
                const end = process.hrtime.bigint();
                console.log(`All ${numProcesses} child processes started in ${(end - start) / BigInt(1e6)} ms`);
            }
        }
    });
}

// childProcess.js
process.send('Child process started');

@Chriscbr
Contributor Author

Chriscbr commented Feb 6, 2024

Closing in favor of #5554

@Chriscbr Chriscbr closed this Feb 6, 2024
4 participants